On Mon, 24 Jun 2013 15:27:19 +0000 (UTC)
Duncan <1i5t5.duncan@×××.net> wrote:

> > I have one; it's great to help make my boot short, but it isn't
> > really a great improvement for the Portage tree. Better I/O isn't a
> > solution to computational complexity; it doesn't deal with the CPU
> > bottleneck.
>
> But here, agreed with ciaranm, the cpu's not the bottleneck, at least
> not from cold-cache. It doesn't even up the cpu clocking from
> minimum as it's mostly filesystem access. Once the cache is warm,
> then yes, it ups the CPU speed and I see the single-core behavior you
> mention, but cold-cache, no way; it's I/O bound.
>
> And with an ssd, the portage tree update (the syncs both of gentoo
> and the overlays) went from a /crawling/ console scroll, to scrolling
> so fast I can't read it.

We're not talking about the Portage tree update, but about the
dependency tree generation, which relies much more on the CPU than on
I/O. A lot of loops inside loops inside loops, comparisons and more data
structure magic are going on; if this were optimized to a lower
complexity or processed by multiple cores, it would speed up a lot.
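
To make the "loops inside loops" point concrete, here is a toy sketch in
Python. It is not Portage's resolver code; the package names, the `deps`
mapping and both function names are made up for illustration. Both
functions compute the same reverse-dependency map, but the first rescans
every dependency list for every package, while the second walks each
list once.

```python
# Toy illustration only: not Portage code. All names are hypothetical.

def reverse_deps_nested(deps):
    """Rescan every dependency list for every package (roughly quadratic)."""
    result = {}
    for pkg in deps:
        # Inner scan over all packages, plus a list search per package.
        result[pkg] = [other for other, needed in deps.items() if pkg in needed]
    return result


def reverse_deps_indexed(deps):
    """Build the same map in one pass over each dependency list."""
    result = {pkg: [] for pkg in deps}
    for other, needed in deps.items():
        for pkg in needed:
            result.setdefault(pkg, []).append(other)
    return result


deps = {
    "app/editor": ["lib/gui", "lib/core"],
    "lib/gui": ["lib/core"],
    "lib/core": [],
}
```

On three packages the difference is invisible; on a tree with tens of
thousands of entries, the nested version is the kind of pattern that
shows up at the top of a profile.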

Take a look at the profiler image and try to get a quick understanding
of the code; after following a few function calls, it will become clear.

Granted, I/O is still a part of the problem, which is why I think caches
would help too; but from what I see the time / space complexity is just
too high, so you don't even have to deem this as CPU or I/O bound...
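
As a rough idea of what a cache buys here, consider the following
minimal sketch. It assumes a small, acyclic, made-up `DEPS` table and a
hypothetical `closure` helper; real dependency data is far messier
(cycles, USE flags, slots), so treat this purely as an illustration of
memoization, not as how Portage works.

```python
from functools import lru_cache

# Hypothetical dependency data; assumed acyclic for this sketch.
DEPS = {
    "app/editor": ("lib/gui", "lib/core"),
    "lib/gui": ("lib/core",),
    "lib/core": (),
}


@lru_cache(maxsize=None)
def closure(pkg):
    """Return everything pkg pulls in; each package is computed only once."""
    found = set()
    for dep in DEPS[pkg]:
        found.add(dep)
        found |= closure(dep)  # served from the cache on repeat visits
    return frozenset(found)
```

Calling `closure("app/editor")` visits `lib/core` twice, but only the
first visit does any work; `closure.cache_info()` shows the hit.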

> >> Quite apart from the theory and question of making the existing
> >> code faster vs. a new from-scratch implementation, there's the
> >> practical question of what options one can actually use to deal
> >> with the problem /now/.
> >
> > Don't rush it: Do you know the problem well? Does the solution
> > properly deal with it? Is it still usable some months / years from
> > now?
>
> Not necessarily. But first we must /get/ to some months / years from
> now, and that's a lot easier if the best is made of the current
> situation, while a long term fix is being developed.

True, we have to make the most out of Portage for as long as possible.

> >> FWIW, one solution (particularly for folks who don't claim to have
> >> reasonable coding skills and thus have limited options in that
> >> regard) is to throw hardware at the problem.
> >
> > Improvements in algorithmic complexity (exponential) are much bigger
> > than improvements you can achieve by buying new hardware (linear).
>
> Same song different verse. Fixing the algorithmic complexity is fine
> and certainly a good idea longer term, but it's not something I can
> use at my next update. Throwing hardware at the problem is usable
> now.

If you have the money, yes, that's an option.

Though I think a lot of people see Linux as something you don't need to
throw a lot of money at; it should run on low-end systems, and those are
the kind of users we shouldn't neglect going forward.

> >> [2] ... SNIP ... runs ~1 hour ... SNIP ...
> >
> > Sounds great, but the same thing could run in much less time. I have
> > worse hardware, and it doesn't take much longer than yours do; so, I
> > don't really see the benefits new hardware bring to the table. And
> > that HDD to SSD change, that's really a once in a lifetime flood.
>
> I expect I'm more particular than most about checking changelogs. I
> certainly don't read them all, but if there's a revision-bump for
> instance, I like to see what the gentoo devs considered important
> enough to do a revision bump. And I religiously check portage logs,
> selecting mentioned bug numbers probably about half the time, which
> pops up a menu with a gentoo bug search on the number, from which I
> check the bug details and sometimes the actual git commit code. For
> all my overlays I check the git whatchanged logs, and I have a helper
> script that lets me fetch and then check git whatchanged for a number
> of my live packages, including openrc (where I switched to live-git
> precisely /because/ I was following it closely enough to find the git
> whatchanged logs useful, both for general information and for
> troubleshooting when something went wrong -- release versions simply
> didn't have enough resolution, too many things changing in each
> openrc release to easily track down problems and file bugs as
> appropriate), as well.

I stick more to releases, and check the changes only for the things
where I want to know them; for the others, the changes either don't
matter or shouldn't really hurt as a surprise. If there's something that
would really surprise me, then I'd expect some news on that.

> And you're probably not rebuilding well over a hundred live-packages
> (thank $DEITY and the devs in question for ccache!) at every update,
> in addition to the usual (deep) @world version-bump and newuse
> updates, are you?

Developers rebuild those to see upcoming breakage.

Apart from that, I don't use many -9999 packages, so as not to go too
unstable.

> >> [3] Also relevant, 16 gigs RAM, PORTAGETMPDIR on tmpfs.
> >
> > Sounds all cool, but think about your CPU again; saturate it...
> >
> > Building the Linux kernel with `make -j32 -l8` versus `make -j8` is
> > a huge difference; most people follow the latter instructions,
> > without really thinking through what actually happens with the
> > underlying data. The former queues up jobs for your processor; so
> > the moment a job is done a new job will be ready, so, you don't
> > need to wait on the disk.
>
> Truth is, I used to run a plain make -j (no number and no -l at all)
> on my kernel builds, just to watch the system stress and then so
> elegantly recover. It's an amazing thing to watch, this Linux kernel
> thing and how it deals with cpu oversaturation. =:^)

If you have the memory to pull it off, which involves money again.

> But I suppose I've gotten more conservative in my old age. =:^P

> Needlessly oversaturating the CPU (and RAM) only slows things down
> and forces cache dump and swappage.

The trick is to set it a bit below the point of oversaturation, low
enough that most packages don't oversaturate. It could be tuned more
precisely for every package, but that time is better spent elsewhere.
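
One way to pick such a setting without tuning per package is sketched
below. The helper name `suggest_make_flags` and its `queue_factor` knob
are hypothetical; the factor of 4 is only an assumption, chosen because
it reproduces the `-j32 -l8` from the quoted example on 8 cores. The
idea is to queue more jobs than cores via `-j`, while `-l` stops make
from starting new jobs once the load average reaches the core count.

```python
import os


def suggest_make_flags(cores=None, queue_factor=4):
    """Return make flags that queue extra jobs but cap them by load average.

    queue_factor is a made-up knob for this sketch, not a Portage or
    make setting.
    """
    if cores is None:
        # os.cpu_count() may return None, so fall back to a single core.
        cores = os.cpu_count() or 1
    return f"-j{cores * queue_factor} -l{cores}"
```

On an 8-core machine, `suggest_make_flags(8)` gives `-j32 -l8`. With
little memory, a smaller `queue_factor` is the safer choice.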

> > Something completely different; look at the history of data mining,
> > today's algorithms are much much faster than those of years ago.
> >
> > Just to point out that different implementations and configurations
> > have much more power in cutting time than the typical hardware
> > change does.
>
> I agree and am not arguing that. All I'm saying is that there are
> measures that a sysadmin can take today to at least help work around
> the problem, today, while all those faster algorithms are being
> developed, implemented, tested and deployed. =:^)

Not everyone is a sysadmin with a server; I'm just a student running a
laptop bought some years ago, and I'm kind of the type that doesn't
replace it while it still works fine otherwise. Maybe when I graduate...

I think we can both agree a faster system does a better job at it; but
it won't deal with the crux of the problem, the algorithmic complexity.

Dealing with both, as you mention, is the real deal.

--
With kind regards,

Tom Wijsman (TomWij)
Gentoo Developer

E-mail address : TomWij@g.o
GPG Public Key : 6D34E57D
GPG Fingerprint : C165 AF18 AB4C 400B C3D2 ABF0 95B2 1FCD 6D34 E57D