Tom Wijsman posted on Sun, 16 Jun 2013 23:24:27 +0200 as excerpted:

> On Sun, 16 Jun 2013 19:33:53 +0000 (UTC)
> Duncan <1i5t5.duncan@×××.net> wrote:
> 
>> TL;DR: SSDs help. =:^)
> 
> TL;DR: SSDs help, but they don't solve the underlying problem. =:-(

Well, there's the long-term fix to the underlying problem, and there are
coping strategies to help with where things are at now. I was simply
saying that an SSD helps a LOT in dealing with the inefficiencies of the
current code. See the "quite apart... practical question of ... dealing
with the problem /now/" bit quoted below.

> I have one; it's great to help make my boot short, but it isn't really a
> great improvement for the Portage tree. Better I/O isn't a solution to
> computational complexity; it doesn't deal with the CPU bottleneck.

But here, agreed with ciaranm, the CPU's not the bottleneck, at least not
from cold-cache. It doesn't even up the CPU clocking from minimum, as
it's mostly filesystem access. Once the cache is warm, then yes, it ups
the CPU speed and I see the single-core behavior you mention, but cold-
cache, no way; it's I/O bound.

And with an SSD, the Portage tree update (the syncs of both gentoo and
the overlays) went from a /crawling/ console scroll to scrolling so fast
I can't read it.
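
Concretely, "the syncs" here are just the usual pair -- assuming layman
manages the overlays; the exact commands vary per setup:

  emerge --sync   # rsync the main gentoo tree
  layman -S       # pull each configured overlay

Both are dominated by reads and stats of many thousands of small files,
which is exactly the access pattern an SSD speeds up the most.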

>> Quite apart from the theory and question of making the existing code
>> faster vs. a new from-scratch implementation, there's the practical
>> question of what options one can actually use to deal with the problem
>> /now/.
> 
> Don't rush it: Do you know the problem well? Does the solution properly
> deal with it? Is it still usable some months / years from now?

Not necessarily. But first we must /get/ to some months / years from
now, and that's a lot easier if the best is made of the current
situation while a long-term fix is being developed.

>> FWIW, one solution (particularly for folks who don't claim to have
>> reasonable coding skills and thus have limited options in that regard)
>> is to throw hardware at the problem.
> 
> Improvements in algorithmic complexity (exponential) are much bigger
> than improvements you can achieve by buying new hardware (linear).

Same song, different verse. Fixing the algorithmic complexity is fine and
certainly a good idea longer term, but it's not something I can use at my
next update. Throwing hardware at the problem is usable now.

>> ---
>> [1] I'm running ntp and the initial ntp-client connection and time sync
>> takes ~12 seconds a lot of the time, just over the initial 10 seconds
>> down, 50 to go, trigger on openrc's 1-minute timeout.
> 
> Why do you make your boot wait for NTP to sync its time?

Well, ntpd is waiting for the initial step so it doesn't have to slew so
hard for so long if the clock's multiple seconds off.

And ntpd is in my default runlevel, with a few local service tasks that
are after * and need a good clock time anyway, so...
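
For illustration, the wiring amounts to roughly this under OpenRC (the
local-task service name is a made-up stand-in for my local scripts):

  # rc-update add ntp-client default
  # rc-update add ntpd default

  # in /etc/init.d/local-task, so it starts after everything else:
  depend() {
      after *
  }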

> How could hardware make this time sync go any faster?

Which is what I said: as a practical matter, my boot didn't speed up
much /because/ I'm running (and waiting for) the ntp-client time-
stepper. Thus, I'd not /expect/ a hardware upgrade (unless it's to a
more direct net connection) to help much.

>> [2] ... SNIP ... runs ~1 hour ... SNIP ...
> 
> Sounds great, but the same thing could run in much less time. I have
> worse hardware, and it doesn't take much longer than yours does; so I
> don't really see the benefit new hardware brings to the table. And that
> HDD-to-SSD change, that's really a once-in-a-lifetime flood.

I expect I'm more particular than most about checking changelogs. I
certainly don't read them all, but if there's a revision-bump, for
instance, I like to see what the gentoo devs considered important enough
to warrant it. And I religiously check portage logs, selecting mentioned
bug numbers probably about half the time, which pops up a menu with a
gentoo bug search on the number, from which I check the bug details and
sometimes the actual git commit code. For all my overlays I check the
git whatchanged logs as well, and I have a helper script that lets me
fetch and then check git whatchanged for a number of my live packages,
including openrc (where I switched to live-git precisely /because/ I was
following it closely enough to find the git whatchanged logs useful, both
for general information and for troubleshooting when something went wrong
-- release versions simply didn't have enough resolution, too many things
changing in each openrc release to easily track down problems and file
bugs as appropriate).
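
If it's useful to anyone, the helper boils down to something like this
sketch (the checkout location is mine; adjust to taste):

  #!/bin/sh
  # for each live-package checkout, fetch, then show what's new upstream
  for repo in /usr/src/live/*/; do
      printf '\n===== %s =====\n' "$repo"
      ( cd "$repo" &&
        git fetch --quiet &&
        git whatchanged HEAD..FETCH_HEAD )
  done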

And you're probably not rebuilding well over a hundred live packages
(thank $DEITY and the devs in question for ccache!) at every update, in
addition to the usual (deep) @world version-bump and newuse updates, are
you?

Of course maybe you are, but I did specify that, and I didn't see
anything in your comments indicating anything like an apples-to-apples
comparison.
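
Since ccache got the credit above: the Portage side of it is only a
couple of make.conf lines plus the package itself (the size and path
here are examples, not recommendations):

  # emerge dev-util/ccache
  FEATURES="ccache"
  CCACHE_DIR="/var/cache/ccache"
  CCACHE_SIZE="4G"

With a hundred-plus live packages rebuilt every update, most objects
haven't actually changed, so the hit rate stays high.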

>> [3] Also relevant, 16 gigs RAM, PORTAGE_TMPDIR on tmpfs.
> 
> Sounds all cool, but think about your CPU again; saturate it...
> 
> Building the Linux kernel with `make -j32 -l8` versus `make -j8` is a
> huge difference; most people follow the latter instructions, without
> really thinking through what actually happens with the underlying data.
> The former queues up jobs for your processor, so the moment a job is
> done a new job will be ready, so you don't need to wait on the disk.

Truth is, I used to run a plain make -j (no number and no -l at all) on
my kernel builds, just to watch the system stress and then so elegantly
recover. It's an amazing thing to watch, this Linux kernel thing and how
it deals with CPU oversaturation. =:^)

But I suppose I've gotten more conservative in my old age. =:^P
Needlessly oversaturating the CPU (and RAM) only slows things down and
forces cache dump and swappage. These days, per my kernel-build-script
configuration, I only run -j24, which seems a reasonable balance: it
keeps the CPUs busy but stays safely enough within a few gigs of RAM
that I don't dump cache or hit swap. Timing a kernel build from make
clean suggests times stay within the same sub-seconds range from -j10
or so up to (from memory) -j50 or so, after which build time starts to
go up, not down.
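
That curve is easy enough to reproduce; a rough sketch, assuming an
already-configured kernel tree and GNU time installed:

  for j in 10 16 24 32 50 64; do
      make -s clean
      printf '%s: ' "-j$j"
      /usr/bin/time -f '%e sec' make -s -j"$j" >/dev/null
  done

Past the point where every core stays fed, extra jobs mostly just eat
RAM and evict cache, hence the upturn at the high end.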

> Something completely different; look at the history of data mining,
> today's algorithms are much much faster than those of years ago.
> 
> Just to point out that different implementations and configurations have
> much more power in cutting time than the typical hardware change does.

I agree and am not arguing that. All I'm saying is that there are
measures a sysadmin can take today to at least help work around the
problem while all those faster algorithms are being developed,
implemented, tested and deployed. =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman