Gentoo Archives: gentoo-portage-dev

From: Kent Fredric <kentfredric@×××××.com>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Date: Sat, 27 Jun 2020 08:31:50
Message-Id: CAATnKFCf+gBE+LdwrtNxz7Nz=eoptFageyDSopeVa4KTFkS0cA@mail.gmail.com
In Reply to: Re: [gentoo-portage-dev] Add caching to a few commonly used functions by Fabian Groffen
1 On Sat, 27 Jun 2020 at 19:35, Fabian Groffen <grobian@g.o> wrote:
2 >
3 > Hi Chun-Yu,
4
5 > > arguments: catpkgsplit, use_reduce, and match_from_list. In the first
6 > > two cases, it was simple to cache the results in dicts, while
7 > > match_from_list was a bit trickier, since it seems to be a requirement
8 > > that it return actual entries from the input "candidate_list". I also
9 > > ran into some test failures if I did the caching after the
10 > > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
11 > > function, so the caching is only done up to just before that point.
12
13 You may also want to investigate the version parsing logic, where
14 it converts version strings into a data structure, partly because
15 the last time I tried profiling portage, every sample seemed to turn
16 up in there.
17
18 And I'd expect to see a lot of commonality in this.
19
20
21 # qlist -I --format "%{PV}" | wc -c
22 14678
23 # qlist -I --format "%{PV}" | sort -u | wc -c
24 8811
25
26 And given this version-parsing path is even exercised for stuff *not*
27 installed, I suspect the real-world implications are worse.
28
29 # find /usr/portage/ -name "*.ebuild" | sed \
30 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF \
31 "%{PV}" | wc -l
32 32604
33 # find /usr/portage/ -name "*.ebuild" | sed \
34 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF \
35 "%{PVR}" | sort -u | wc -l
36 10362
37 # find /usr/portage/ -name "*.ebuild" | sed \
38 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF \
39 "%{PV}" | sort -u | wc -l
40 7515
41
42 Obviously this is a very crude analysis, but you can see there's room
43 to potentially no-op half of all version parses. Though the
44 speed/memory tradeoff may not be worth it.
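To put a rough number on that, here is a small Python sketch that turns the ebuild-tree counts quoted above into the share of parses that repeat an already-seen string. It assumes each ebuild's version gets parsed exactly once and that the cache is keyed on the raw string, which is certainly a simplification; `repeat_fraction` is just a name I made up for this illustration.

```python
# Counts copied from the `wc -l` runs quoted above.
total_ebuilds = 32604   # one %{PV} line per ebuild
unique_pvr = 10362      # distinct %{PVR} strings
unique_pv = 7515        # distinct %{PV} strings


def repeat_fraction(total, unique):
    """Share of parses that an ideal string-keyed cache could skip."""
    return 1 - unique / total


pv_repeats = repeat_fraction(total_ebuilds, unique_pv)
pvr_repeats = repeat_fraction(total_ebuilds, unique_pvr)

print(f"PV:  {pv_repeats:.1%} of parses see an already-parsed string")
print(f"PVR: {pvr_repeats:.1%} of parses see an already-parsed string")
```

So under those assumptions somewhere between two thirds and three quarters of the parses are repeats, depending on whether the revision is part of the key.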
45
46 Note that this is not just "parse the version on the ebuild", which
47 is fast. My sampling seemed to indicate it was parsing the version
48 afresh for every version comparison, so internally it was parsing
49 the same version dozens of times over, which is much slower!
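A minimal sketch of the idea, in Python since that is what portage is written in. This is not portage's actual code: `parse_version` and `compare_versions` are hypothetical names, and the grammar is deliberately simplified to dotted integers plus an optional -rN (real versions also carry letter suffixes and _alpha/_beta/_pre/_rc/_p). The point is only that the comparison goes through a memoized parse, so each distinct string is parsed once no matter how many comparisons touch it.

```python
import functools
import re

# Simplified, hypothetical grammar: dotted integers plus an optional -rN.
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:-r(\d+))?$")


@functools.lru_cache(maxsize=None)  # unbounded: trades memory for speed
def parse_version(version):
    """Parse a version string; repeat calls are served from the cache."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version: {version!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    revision = int(match.group(2) or 0)
    return numbers, revision


def compare_versions(a, b):
    """Return -1, 0 or 1, parsing each distinct string at most once."""
    left, right = parse_version(a), parse_version(b)
    return (left > right) - (left < right)


if __name__ == "__main__":
    versions = ["1.2.3", "1.10", "1.2.3-r1", "1.2.3"]
    print(sorted(versions, key=functools.cmp_to_key(compare_versions)))
    # hits vs. misses shows how many parses the cache avoided
    print(parse_version.cache_info())
```

With maxsize=None the cache never evicts, which is where the memory side of the tradeoff comes in; a bounded maxsize would cap that at the cost of some re-parsing.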
50
51
52
53
54 --
55 Kent
56
57 KENTNL - https://metacpan.org/author/KENTNL