Gentoo Archives: gentoo-portage-dev

From: Chun-Yu Shei <cshei@××××××.com>
To: gentoo-portage-dev@l.g.o
Cc: Michael 'veremitz' Everitt <gentoo@×××××××.xyz>, Sid Spry <sid@××××.us>, Zac Medico <zmedico@g.o>
Subject: Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
Date: Mon, 06 Jul 2020 17:30:53
Message-Id: CAP=_c=0ZsWsmuEZgN8rHjpXQx+=GGT=DGUPgigcnYCURoK4B4w@mail.gmail.com
In Reply to: Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function by Francesco Riosa
I finally got a chance to try Sid's lru_cache suggestion, and the
results were really good. Simply adding it on catpkgsplit and moving
the body of use_reduce into a separate function (that accepts tuples
instead of unhashable lists/sets) and decorating it with lru_cache
gets a similar 40% overall speedup for the upgrade case I tested. It
seems like even a relatively small cache size (1000 entries) gives
quite a speedup, even though in the use_reduce case, the cache size
eventually reaches almost 20,000 entries if no limit is set. With
these two changes, adding caching to match_from_list didn't seem to
make much/any difference.

The catch is that lru_cache is only available in Python 3.2 and later,
so would it make sense to add a dummy lru_cache implementation for
Python < 3.2 that does nothing? There is also a
backports-functools-lru-cache package that's already available in the
Portage tree, but that would add an additional external dependency.

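One possible shape for such a dummy, as a minimal sketch under the
assumption that a no-op fallback is acceptable (split_name is a
hypothetical example function, not Portage code):

```python
try:
    from functools import lru_cache
except ImportError:
    # Fallback for interpreters without functools.lru_cache (Python < 3.2).
    def lru_cache(maxsize=128, typed=False):
        """No-op stand-in: returns the function unchanged, nothing is cached."""
        def decorator(func):
            return func
        return decorator


@lru_cache(maxsize=1000)
def split_name(cpv):
    # Hypothetical example: split "category/package" into its two parts.
    category, _, rest = cpv.partition("/")
    return category, rest
```

With this fallback the decorator must always be written with parentheses,
e.g. @lru_cache(maxsize=1000), since the bare @lru_cache form is only
supported by the standard library version from Python 3.8 onward.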
I agree that refactoring could yield an even bigger gain, but
hopefully this can be implemented as an interim solution to speed up
the common emerge case of resolving upgrades. I'm happy to submit new
patches for this, if someone can suggest how to best handle the Python
< 3.2 case. :)

Thanks,
Chun-Yu


On Mon, Jul 6, 2020 at 9:10 AM Francesco Riosa <vivo75@×××××.com> wrote:
>
> On 06/07/20 17:50, Michael 'veremitz' Everitt wrote:
> > On 06/07/20 16:26, Francesco Riosa wrote:
> >> On 29/06/20 03:58, Sid Spry wrote:
> >>> There are libraries that provide decorators, etc, for caching and
> >>> memoization.
> >>> Have you evaluated any of those? One is available in the standard library:
> >>> https://docs.python.org/dev/library/functools.html#functools.lru_cache
> >>>
> >>> I comment as this would increase code clarity.
> >>>
> >> I think portage developers try hard to avoid external dependencies
> >> I hope hard they do
> >>
> >>
> > I think the key word here is 'external' - anything which is part of the
> > python standard library is game for inclusion in portage, and has/does
> > provide much needed optimisation. Many of the issues in portage are
> > so-called "solved problems" in computing terms, and as such, we should take
> > advantage of these to improve performance at every available opportunity.
> > Of course, there are presently only one, two or three key developers able
> > to make/test these changes (indeed at scale) so progress is often slower
> > than desirable in current circumstances...
> >
> > [sent direct due to posting restrictions...]
> yes I've replied too fast and didn't notice Sid was referring to
> _standard_ libraries (not even recent additions)
>
> sorry for the noise
>
> - Francesco
>
>
