Hi,

On Wed, 19 Nov 2014 22:59:05 +0300 Andrew Savchenko wrote:
> On Mon, 17 Nov 2014 21:55:48 -0800 Zac Medico wrote:
[...]
> > > When I'll manage to run emerge -DNupv @world without errors, I'll
> > > send you stats for both runs with and without dynamic deps.
> >
> > Great, hopefully that will reveal some more good things to optimize.
> >
> > > By the way, do you need pstats files (e.g. for some extra data) or
> > > pdf graphs are sufficient?
> >
> > The pdf graphs are typically enough for me, since they highlight the hot
> > spots really well. I did not even bother with your pstats files.
>
> OK. I managed to run emerge -DNupv @world on desktop without
> conflicts. What was done:
> 1) fixpackages run
> 2) portage is updated to use your patch from bug 529660
>
> At this point performance boost was really great: from ~35
> minutes to ~19-20 minutes.
>
> Afterward I tried emerge -DNupv @world with different python
> versions:
> (2.7) (~)2.7.8
> (3.3) 3.3.5-r1
> (3.4) (~)3.4.2
>
> Results are interesting (confidence level for error is 0.95, time
> real value was used for calculations):
> 3.3 is 3% ± 5% faster than 2.7
> 3.4 is 20% ± 5% faster than 3.3
> And with python:3.4 and steps above it takes now 15.5 minutes
> instead of 35. Nice result :)
>
> So there is no evidence that portage on 3.3 is faster than on 2.7,
> but 3.4 is faster than 3.3 with very good confidence. Of course
> this data is biased by -m cProfile overhead, but bias should be
> similar for each version. Just checked time to run command for
> python:3.4 without profiling: it takes 11.5 minutes!
>
> You may find generated pdf graphs together with system information
> for each host here:
> ftp://brunestud.info/gentoo/portage-v2.tar.xz
>
> As for hitomi box, it is both slower and has much older packages,
> so I'm still struggling to fix conflicts and other issues. Results
> will be available later.

I managed to get data from hitomi too, see:
ftp://brunestud.info/gentoo/portage-v3.tar.xz
(this archive also contains all previously obtained graphs)

The graphs were obtained the same way as on desktop.
Portage and python versions are the same; time information for the
_profiled_ runs follows:

> (2.7) (~)2.7.8
real    55m19.892s
user    39m11.913s
sys     15m37.586s

> (3.3) 3.3.5-r1
real    52m34.640s
user    36m45.325s
sys     15m25.663s

> (3.4) (~)3.4.2
real    53m32.657s
user    37m12.369s
sys     15m52.641s

Without profiling using 3.3.5-r1:
real    25m50.439s
user    25m28.260s
sys     0m7.863s

This is quite surprising. On hitomi (Intel Atom N270) there is
practically no difference between 3.3 and 3.4, but both are slightly
better than 2.7. (To exclude possible cache issues I made a blank run
before the first test run.) Probably this is due to some
arch-dependent optimizations.

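For reference, the relative differences can be recomputed from the
real times listed above. This is only a minimal sketch: parse_time and
percent_faster are helper names made up for illustration, "faster" is
assumed to mean the reduction in real time relative to the slower
run, and with a single run per version there is no error estimate.

```python
import re


def parse_time(value):
    """Convert a shell `time` value such as '55m19.892s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def percent_faster(baseline, candidate):
    """Reduction in run time of `candidate` relative to `baseline`, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Real (wall-clock) times of the profiled runs on hitomi, copied from above.
profiled_real = {
    "2.7": parse_time("55m19.892s"),
    "3.3": parse_time("52m34.640s"),
    "3.4": parse_time("53m32.657s"),
}

for version in ("3.3", "3.4"):
    gain = percent_faster(profiled_real["2.7"], profiled_real[version])
    print(f"{version} vs 2.7: {gain:.1f}% less real time")
```

With these numbers 3.3 comes out about 5% and 3.4 about 3% below 2.7
in real time.
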
What surprises me most is that the profiling overhead is huge (~105%)
compared to the overhead on desktop (~35%). CPU speeds are not that
different, and neither are the instruction sets (Atom has sse2, sse3,
ssse3, but lacks 3dnow, 3dnowext). L2 cache is the same (512 KB), but
L1 differs significantly: 64 KB data/64 KB instruction cache on
desktop vs 24 KB data/32 KB instruction cache on the Atom. Looks like
this is the reason.

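A quick sketch of how the overhead figures can be derived from the
timings in this thread. The function names are hypothetical, overhead
is assumed to be the extra real time relative to the unprofiled run,
and the desktop values are the rounded 15.5 and 11.5 minutes quoted
above for python:3.4.

```python
def to_seconds(minutes, seconds=0.0):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


def profiling_overhead(profiled, plain):
    """Extra real time added by profiling, as a percentage of the plain run."""
    return (profiled - plain) / plain * 100.0


# hitomi, python 3.3: real times of the profiled and unprofiled runs.
hitomi = profiling_overhead(to_seconds(52, 34.640), to_seconds(25, 50.439))

# desktop, python 3.4: rounded figures from the quoted message.
desktop = profiling_overhead(to_seconds(15.5), to_seconds(11.5))

print(f"hitomi:  {hitomi:.0f}%")
print(f"desktop: {desktop:.0f}%")
```

This gives roughly 103% for hitomi and 35% for desktop.
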
Best regards,
Andrew Savchenko