Gentoo Archives: gentoo-portage-dev

From: Chun-Yu Shei <cshei@××××××.com>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
Date: Thu, 09 Jul 2020 07:03:39
Message-Id: 20200709070330.555640-1-cshei@google.com
In Reply to: Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function by Zac Medico
Awesome! Here's a patch that adds @lru_cache to use_reduce, vercmp, and
catpkgsplit. use_reduce was split into 2 functions: the outer one
converts lists/sets to tuples so they can be hashed, and returns a
copy of the cached list (since the caller seems to modify it
sometimes). I tried to select cache sizes that minimize the increase in
memory use while still providing about the same speedup as a cache with
unbounded size. "emerge -uDvpU --with-bdeps=y @world" runtime decreases
from 44.32s -> 29.94s -- a 48% speedup (about 32% less wall-clock time),
while the maximum value of the RES column in htop increases from
280 MB -> 290 MB.

"emerge -ep @world" time decreases slightly from 18.77s -> 17.93s, while
the max observed RES value actually decreases from 228 MB -> 214 MB (similar
values observed across a few before/after runs).

Here are the cache hit stats, max observed RES memory, and runtime in
seconds for various cache sizes in the update case. Caching for each
function was tested independently (only 1 function with caching enabled
at a time):

catpkgsplit:
CacheInfo(hits=1222233, misses=21419, maxsize=None, currsize=21419)
270 MB
39.217

CacheInfo(hits=1218900, misses=24905, maxsize=10000, currsize=10000)
271 MB
39.112

CacheInfo(hits=1212675, misses=31022, maxsize=5000, currsize=5000)
271 MB
39.217

CacheInfo(hits=1207879, misses=35878, maxsize=2500, currsize=2500)
269 MB
39.438

CacheInfo(hits=1199402, misses=44250, maxsize=1000, currsize=1000)
271 MB
39.348

CacheInfo(hits=1149150, misses=94610, maxsize=100, currsize=100)
271 MB
39.487


use_reduce:
CacheInfo(hits=45326, misses=18660, maxsize=None, currsize=18561)
407 MB
35.77

CacheInfo(hits=45186, misses=18800, maxsize=10000, currsize=10000)
353 MB
35.52

CacheInfo(hits=44977, misses=19009, maxsize=5000, currsize=5000)
335 MB
35.31

CacheInfo(hits=44691, misses=19295, maxsize=2500, currsize=2500)
318 MB
35.85

CacheInfo(hits=44178, misses=19808, maxsize=1000, currsize=1000)
301 MB
36.39

CacheInfo(hits=41211, misses=22775, maxsize=100, currsize=100)
299 MB
37.175
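To make the figures above easier to compare, the following sketch computes hit rates from a few of the reported CacheInfo values. The `hit_rate` helper and the dictionary labels are hypothetical. The `CacheInfo` namedtuple is redefined here only so the snippet is self-contained; it has the same fields as the value returned by `cache_info()` on an `lru_cache`-wrapped function.

```python
from collections import namedtuple

# Same field layout as the tuple returned by cache_info().
CacheInfo = namedtuple("CacheInfo", ["hits", "misses", "maxsize", "currsize"])


def hit_rate(info):
    # Fraction of calls answered from the cache.
    total = info.hits + info.misses
    return info.hits / total if total else 0.0


# Values copied from the stats reported above.
reported = {
    "catpkgsplit, unbounded": CacheInfo(1222233, 21419, None, 21419),
    "catpkgsplit, maxsize=100": CacheInfo(1149150, 94610, 100, 100),
    "use_reduce, unbounded": CacheInfo(45326, 18660, None, 18561),
    "use_reduce, maxsize=100": CacheInfo(41211, 22775, 100, 100),
}

for name, info in reported.items():
    print(f"{name}: {hit_rate(info):.1%}")
```

By this measure, shrinking the cache from unbounded to 100 entries lowers the catpkgsplit hit rate from about 98.3% to 92.4%, and the use_reduce hit rate from about 70.8% to 64.4%.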


I didn't bother collecting detailed stats for vercmp, since its
inputs/outputs are quite small and don't cause much memory increase.
Please let me know if there are any other suggestions/improvements (and
thanks Sid for the lru_cache suggestion!).

Thanks,
Chun-Yu
