Tom Wijsman posted on Sun, 16 Jun 2013 23:24:27 +0200 as excerpted:

> On Sun, 16 Jun 2013 19:33:53 +0000 (UTC)
> Duncan <1i5t5.duncan@×××.net> wrote:
> 
>> TL;DR: SSDs help. =:^)
> 
> TL;DR: SSDs help, but they don't solve the underlying problem. =:-(

Well, there's the long-term fix to the underlying problem, and there are
coping strategies to help with where things are at now. I was simply
saying that an SSD helps a LOT in dealing with the inefficiencies of the
current code. See the "quite apart... practical question of ... dealing
with the problem /now/" bit quoted below.

> I have one; it's great to help make my boot short, but it isn't really a
> great improvement for the Portage tree. Better I/O isn't a solution to
> computational complexity; it doesn't deal with the CPU bottleneck.

But here, agreed with ciaranm, the CPU's not the bottleneck, at least not
from cold-cache. It doesn't even up the CPU clocking from minimum, as
it's mostly filesystem access. Once the cache is warm, then yes, it ups
the CPU speed and I see the single-core behavior you mention, but cold-
cache, no way; it's I/O bound.

And with an SSD, the Portage tree update (the syncs of both gentoo and
the overlays) went from a /crawling/ console scroll to scrolling so fast
I can't read it.
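
Concretely, "the syncs" here are just the usual pair -- assuming layman
manages the overlays; the exact commands vary per setup:

  emerge --sync   # rsync the main gentoo tree
  layman -S       # pull each configured overlay

Both are dominated by reads and stats of many thousands of small files,
which is exactly the access pattern an SSD speeds up the most.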

>> Quite apart from the theory and question of making the existing code
>> faster vs. a new from-scratch implementation, there's the practical
>> question of what options one can actually use to deal with the problem
>> /now/.
> 
> Don't rush it: Do you know the problem well? Does the solution properly
> deal with it? Is it still usable some months / years from now?

Not necessarily. But first we must /get/ to some months / years from
now, and that's a lot easier if the best is made of the current
situation while a long-term fix is being developed.

>> FWIW, one solution (particularly for folks who don't claim to have
>> reasonable coding skills and thus have limited options in that regard)
>> is to throw hardware at the problem.
> 
> Improvements in algorithmic complexity (exponential) are much bigger
> than improvements you can achieve by buying new hardware (linear).

Same song, different verse. Fixing the algorithmic complexity is fine and
certainly a good idea longer term, but it's not something I can use at my
next update. Throwing hardware at the problem is usable now.

>> ---
>> [1] I'm running ntp and the initial ntp-client connection and time sync
>> takes ~12 seconds a lot of the time, just over the initial 10 seconds
>> down, 50 to go, trigger on openrc's 1-minute timeout.
> 
> Why do you make your boot wait for NTP to sync its time?

Well, ntpd is waiting for the initial step so it doesn't have to slew so
hard for so long if the clock's multiple seconds off.

And ntpd is in my default runlevel, with a few local service tasks that
are after * and need a good clock time anyway, so...
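
For illustration, the wiring amounts to roughly this under OpenRC (the
local-task service name is a made-up stand-in for my local scripts):

  # rc-update add ntp-client default
  # rc-update add ntpd default

  # in /etc/init.d/local-task, so it starts after everything else:
  depend() {
      after *
  }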

> How could hardware make this time sync go any faster?

Which is what I said: as a practical matter, my boot didn't speed up
much /because/ I'm running (and waiting for) the ntp-client time-
stepper. Thus, I'd not /expect/ a hardware upgrade (unless it's to a
more direct net connection) to help much.

>> [2] ... SNIP ... runs ~1 hour ... SNIP ...
> 
> Sounds great, but the same thing could run in much less time. I have
> worse hardware, and it doesn't take much longer than yours does; so I
> don't really see the benefit new hardware brings to the table. And that
> HDD-to-SSD change, that's really a once-in-a-lifetime flood.

I expect I'm more particular than most about checking changelogs. I
certainly don't read them all, but if there's a revision-bump, for
instance, I like to see what the gentoo devs considered important enough
to warrant it. And I religiously check portage logs, selecting mentioned
bug numbers probably about half the time, which pops up a menu with a
gentoo bug search on the number, from which I check the bug details and
sometimes the actual git commit code. For all my overlays I check the
git whatchanged logs as well, and I have a helper script that lets me
fetch and then check git whatchanged for a number of my live packages,
including openrc (where I switched to live-git precisely /because/ I was
following it closely enough to find the git whatchanged logs useful, both
for general information and for troubleshooting when something went wrong
-- release versions simply didn't have enough resolution, too many things
changing in each openrc release to easily track down problems and file
bugs as appropriate).
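
If it's useful to anyone, the helper boils down to something like this
sketch (the checkout location is mine; adjust to taste):

  #!/bin/sh
  # for each live-package checkout, fetch, then show what's new upstream
  for repo in /usr/src/live/*/; do
      printf '\n===== %s =====\n' "$repo"
      ( cd "$repo" &&
        git fetch --quiet &&
        git whatchanged HEAD..FETCH_HEAD )
  done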

And you're probably not rebuilding well over a hundred live packages
(thank $DEITY and the devs in question for ccache!) at every update, in
addition to the usual (deep) @world version-bump and newuse updates, are
you?

Of course maybe you are, but I did specify that, and I didn't see
anything in your comments indicating anything like an apples-to-apples
comparison.
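
Since ccache got the credit above: the Portage side of it is only a
couple of make.conf lines plus the package itself (the size and path
here are examples, not recommendations):

  # emerge dev-util/ccache
  FEATURES="ccache"
  CCACHE_DIR="/var/cache/ccache"
  CCACHE_SIZE="4G"

With a hundred-plus live packages rebuilt every update, most objects
haven't actually changed, so the hit rate stays high.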

>> [3] Also relevant, 16 gigs RAM, PORTAGE_TMPDIR on tmpfs.
> 
> Sounds all cool, but think about your CPU again; saturate it...
> 
> Building the Linux kernel with `make -j32 -l8` versus `make -j8` is a
> huge difference; most people follow the latter instructions, without
> really thinking through what actually happens with the underlying data.
> The former queues up jobs for your processor, so the moment a job is
> done a new job will be ready, so you don't need to wait on the disk.

Truth is, I used to run a plain make -j (no number and no -l at all) on
my kernel builds, just to watch the system stress and then so elegantly
recover. It's an amazing thing to watch, this Linux kernel thing and how
it deals with CPU oversaturation. =:^)

But I suppose I've gotten more conservative in my old age. =:^P
Needlessly oversaturating the CPU (and RAM) only slows things down and
forces cache dump and swappage. These days, per my kernel-build-script
configuration, I only run -j24, which seems a reasonable balance: it
keeps the CPUs busy but stays safely enough within a few gigs of RAM
that I don't dump cache or hit swap. Timing a kernel build from make
clean suggests times stay within the same sub-seconds range from -j10
or so up to (from memory) -j50 or so, after which build time starts to
go up, not down.
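
That curve is easy enough to reproduce; a rough sketch, assuming an
already-configured kernel tree and GNU time installed:

  for j in 10 16 24 32 50 64; do
      make -s clean
      printf '%s: ' "-j$j"
      /usr/bin/time -f '%e sec' make -s -j"$j" >/dev/null
  done

Past the point where every core stays fed, extra jobs mostly just eat
RAM and evict cache, hence the upturn at the high end.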

> Something completely different; look at the history of data mining,
> today's algorithms are much much faster than those of years ago.
> 
> Just to point out that different implementations and configurations have
> much more power in cutting time than the typical hardware change does.

I agree and am not arguing that. All I'm saying is that there are
measures a sysadmin can take today to at least help work around the
problem while all those faster algorithms are being developed,
implemented, tested and deployed. =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman