On 6/27/20 8:12 PM, Michał Górny wrote:
> Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico <zmedico@g.o> napisał(a):
>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>>> Hi,
>>>
>>> I was recently interested in whether portage could be sped up, since
>>> dependency resolution can sometimes take a while on slower machines.
>>> After generating some flame graphs with cProfile and vmprof, I found
>>> 3 functions which seem to be called extremely frequently with the
>>> same arguments: catpkgsplit, use_reduce, and match_from_list. In the
>>> first two cases, it was simple to cache the results in dicts, while
>>> match_from_list was a bit trickier, since it seems to be a
>>> requirement that it return actual entries from the input
>>> "candidate_list". I also ran into some test failures if I did the
>>> caching after the mydep.unevaluated_atom.use and mydep.repo checks
>>> towards the end of the function, so the caching is only done up to
>>> just before that point.
>>>
>>> The catpkgsplit change seems to definitely be safe, and I'm pretty
>>> sure the use_reduce one is too, since anything that could possibly
>>> change the result is hashed. I'm a bit less certain about the
>>> match_from_list one, although all tests are passing.
>>>
>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
>>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.
>>> "emerge -ep @world" is just a tiny bit faster, going from 18.69 to
>>> 18.22 sec (2.5% improvement). Since the upgrade case is far more
>>> common, this would really help in daily use, and it shaves about 30
>>> seconds off the time you have to wait to get to the [Yes/No] prompt
>>> (from ~90s to 60s) on my old Sandy Bridge laptop when performing
>>> normal upgrades.
>>>
>>> Hopefully, at least some of these patches can be incorporated, and
>>> please let me know if any changes are necessary.
>>>
>>> Thanks,
>>> Chun-Yu
>>
>> Using global variables for caches like these causes a form of memory
>> leak for use cases involving long-running processes that need to work
>> with many different repositories (and perhaps multiple versions of
>> those repositories).
>>
>> There are at least a couple of different strategies that we can use
>> to avoid this form of memory leak:
>>
>> 1) Limit the scope of the caches so that they have some sort of
>> garbage collection life cycle. For example, it would be natural for
>> the depgraph class to have a local cache of use_reduce results, so
>> that the cache can be garbage collected along with the depgraph.
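Strategy 1 could be sketched roughly as below; this `depgraph` is a bare stand-in for the real class, not portage's actual implementation:

```python
class depgraph:
    """Stand-in for portage's depgraph, illustrating instance-scoped caching."""

    def __init__(self):
        # The cache is an instance attribute, so it is garbage collected
        # together with the depgraph instead of growing for the lifetime
        # of a long-running process.
        self._use_reduce_cache = {}

    def _use_reduce(self, depstr):
        # Memoize per-depgraph; the parse here is a simplified stand-in.
        try:
            return self._use_reduce_cache[depstr]
        except KeyError:
            result = self._use_reduce_cache[depstr] = depstr.split()
            return result
```

Once the last reference to the depgraph goes away, every cached result goes with it, so no explicit invalidation is needed.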
>>
>> 2) Eliminate redundant calls. For example, redundant calls to
>> catpkgsplit can be avoided by constructing more _pkg_str instances,
>> since catpkgsplit is able to return early when its argument happens
>> to be a _pkg_str instance.
>
> I think the weakref module from the standard library might also be
> helpful.
>
> --
> Best regards,
> Michał Górny
>

Hmm, maybe weak global caches are an option?
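A weak global cache along those lines might be sketched with weakref.WeakValueDictionary: entries vanish once nothing else references the cached value, so the cache cannot pin stale repository data. The wrapper class is needed only because built-in lists are not weak-referenceable:

```python
import weakref

class ParsedDeps(list):
    """Weak-referenceable wrapper; plain lists cannot be weakly referenced."""

# Global in scope, but entries are dropped automatically once the last
# strong reference to a value disappears.
_weak_cache = weakref.WeakValueDictionary()

def weakly_cached_parse(depstr):
    result = _weak_cache.get(depstr)
    if result is None:
        # Simplified stand-in for the real parse on a cache miss.
        result = ParsedDeps(depstr.split())
        _weak_cache[depstr] = result
    return result
```

The trade-off is that a value referenced by nobody else is re-computed on the next call, so this helps most when callers hold on to results for the duration of a resolution.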
--
Thanks,
Zac