Gentoo Archives: gentoo-portage-dev

From: Ed W <lists@××××××××××.com>
To: gentoo-portage-dev@l.g.o
Subject: [gentoo-portage-dev] Performance tuning and parallelisation
Date: Thu, 26 Aug 2021 11:03:26
Message-Id: eaa5c412-c3e8-6787-c62e-2a0fccffbb37@wildgooses.com
Hi All

Consider this a tentative first email to test the water. I have started to look at the performance
of emerge, particularly the install phase, and I could use some guidance on where to go next.

Firstly, to define the "problem": I have found Gentoo to be a great base for building custom
distributions, and I use it to build a small embedded distro which runs on a couple of different
architectures (essentially just a "ROOT=/something emerge $some_packages"). However, I use some
packaging around binpackages to avoid unnecessary rebuilds, and this highlights that "building" a
complete install using only binary packages rarely gets over a load of 1. Can we do better than
this? It seems to be highly serialised in the install phase, where the files are copied to disk.

(Note that I use the parallel build and parallel-install flags, plus --jobs=N. If there is code to
compile then the load will shoot up, but simply installing binpackages struggles to get the load
over about 0.7-1.1, so presumably it is single-threaded in all parts?)


Now, this is particularly noticeable where I cheated to build my arm install and just used qemu
user-mode on an amd64 host (rather than cross-compiling). Here it's very clear that the
install/merge phase of the build consumes much, if not most, of the install time.

e.g. a random example (under qemu user mode):

# time ROOT=/tmp/timetest emerge -1k --nodeps openssl

>>> Emerging binary (1 of 1) dev-libs/openssl-1.1.1k-r1::gentoo for /tmp/timetest/
...
real    0m30.145s
user    0m29.066s
sys     0m1.685s


Running the same on the native host takes about 5-6 seconds (and I find this ratio fairly
consistent for qemu user mode: about 5-6x slower than native).

If I pick another package with fewer files, then I see this 5-6 seconds drop, suggesting (without
offering proof) that the bulk of the time here is some "per file" processing.

Note that this machine is a 12-core AMD Ryzen 3900X with SSDs that bench at around 4GB/s+, so 5-6
seconds to install a few files is relatively "slow". As a rough comparison, on this machine I can
back up 4.5GB of chroot with tar+zstd in about 4 seconds.

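One way to put some evidence behind the "per file" hunch would be to time a handful of binpackage
installs with different file counts and fit a straight line, time = fixed + per_file * files. The
sketch below is only that: the helper name fit_per_file_cost is made up for illustration, it knows
nothing about Portage, and the (file_count, seconds) pairs have to be measured by hand (for
example from "time emerge ..." runs and a count of the files each package installs).

```python
def fit_per_file_cost(samples):
    """Least-squares fit of seconds = fixed + per_file * file_count.

    samples: iterable of (file_count, seconds) pairs measured by hand.
    Returns (fixed_seconds, seconds_per_file).
    This is a hypothetical helper for illustration, not part of Portage.
    """
    samples = [(float(count), float(seconds)) for count, seconds in samples]
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")

    mean_count = sum(count for count, _ in samples) / n
    mean_seconds = sum(seconds for _, seconds in samples) / n

    # Spread of the file counts; zero means every package had the same size.
    spread = sum((count - mean_count) ** 2 for count, _ in samples)
    if spread == 0:
        raise ValueError("need at least two distinct file counts")

    covariance = sum(
        (count - mean_count) * (seconds - mean_seconds)
        for count, seconds in samples
    )
    seconds_per_file = covariance / spread
    fixed_seconds = mean_seconds - seconds_per_file * mean_count
    return fixed_seconds, seconds_per_file
```

If the fitted per-file cost times the file count accounts for most of the wall time, that would
support the per-file theory; a large fixed term would point at startup or dependency handling
instead.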

So the question is: I assume that further parallelisation of the install phase will be difficult,
so the low-hanging fruit here seems to be the install/merge phase itself, and why it uses quite a
bit of CPU "per file installed". Can anyone give me a leg up on how I could benchmark this further
and look for the hotspot? Perhaps someone understands the architecture at this point more
intimately and could say whether there are opportunities to do some of the processing en masse,
rather than per file?

I'm not really a Python guru, but I'm interested in poking further to see where the time is going.

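For what it's worth, the generic way to hunt for a hotspot in Python code is the standard
library's cProfile and pstats modules. The sketch below shows the mechanics only. install_tree is
a made-up stand-in that copies and checksums each file; it is NOT Portage's merge code, and I am
assuming (not claiming) that the real merge does comparable per-file work. profile_call is
likewise a hypothetical helper name.

```python
import cProfile
import hashlib
import io
import os
import pstats
import shutil


def install_tree(src, dst):
    """Hypothetical stand-in for a per-file merge (not Portage code).

    Copies every file under src into dst and checksums it, returning the
    number of files handled.
    """
    installed = 0
    for root, _dirs, files in os.walk(src):
        relative = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, relative))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            shutil.copy2(source, os.path.join(target_dir, name))
            with open(source, "rb") as handle:
                hashlib.sha256(handle.read()).hexdigest()
            installed += 1
    return installed


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile; return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()
```

Calling profile_call(install_tree, "/some/src", "/some/dst") and printing the report shows which
functions dominate the cumulative time. The same wrapper could in principle be pointed at whatever
Python entry point drives the real merge, which I have not identified here.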

Many thanks

Ed W
