Gentoo Archives: gentoo-user

From: Kai Krakow <hurikhan77@×××××.com>
To: gentoo-user@l.g.o
Subject: [gentoo-user] Re: distributed emerge
Date: Wed, 27 Sep 2017 00:04:34
Message-Id: 20170927020412.71d6c0cc@jupiter.sol.kaishome.de
In Reply to: [gentoo-user] distributed emerge by Damo Brisbane
On Mon, 25 Sep 2017 21:35:02 +1000,
Damo Brisbane <dhatchett2@×××××.com> wrote:

> Can someone point where I might go for parallel @world build, it is
> really for my own curiosity at this time. Currently I stage binaries
> for multiple machines on a single nfs share, but the assumption is to
> use instead some distributed filesystem. So I think I just need a
> recipe, pointers or ideas on how to distribute emerge on an @world
> set? I am thinking granular first, ie per package rather than eg
> distributed gcc within a single package.

As others already pointed out, distcc introduces more headaches than it
solves.

If you are looking for a solution because package builds are slow, you
will gain the most from building on tmpfs.

Then, I also suggest going breadth first, thus building more packages
at the same time.

Your question implies depth first, which means having more compiler
processes running at a time for a single package. But most build
processes do not scale out very well, for the following reasons:

1. Configure phases are serial processes

2. Dependencies in Makefiles are often buggy or incomplete

3. Dependencies between source files often allow parallel
   building only for short bursts throughout the complete
   build and are serial otherwise

Building packages in parallel instead solves all these problems: Each
build phase can run in parallel to every other build phase. So while a
serialized configure phase is running or a package is bundled/merged,
another package can have multiple gccs running while a third package
maybe builds serialized due to source file deps.

Also, emerge is very IO bound. Resorting to distcc won't solve this, as
a lot of compiler internals need to be copied back and forth between
the peers. It may even create more IO than building locally only. Using
tmpfs instead solves this much better.

I'm using the following settings and have 100% load on all eight cores
almost all the time during emerge, while IO is idle most of the time:

MAKEOPTS="-s -j9 -l8"
FEATURES="sfperms parallel-fetch parallel-install protect-owned \
userfetch splitdebug fail-clean cgroup compressdebug buildpkg \
binpkg-multi-instance clean-logs userpriv usersandbox"
EMERGE_DEFAULT_OPTS="--binpkg-respect-use=y --binpkg-changed-deps=y \
--jobs=10 --load-average 8 --keep-going --usepkg"

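As a rough sketch, the parallelism values above can be related to the
core count. The helper name suggest_portage_opts is hypothetical, and the
rule (make jobs = cores + 1, emerge jobs = cores + 2, load limit = cores)
is only an assumption read off the 8-core numbers in this post, not a
rule stated by Portage:

```shell
# Hypothetical helper: derive parallelism settings from a core count.
# Assumption: the ratios mirror the 8-core values quoted in the post
# (-j9 -l8 for make, --jobs=10 --load-average 8 for emerge).
suggest_portage_opts() {
    cores=$1
    printf 'MAKEOPTS="-j%d -l%d"\n' "$((cores + 1))" "$cores"
    printf 'EMERGE_DEFAULT_OPTS="--jobs=%d --load-average %d"\n' \
        "$((cores + 2))" "$cores"
}

# Reproduce the values for an eight-core machine.
# On a live system the argument could come from "$(nproc)" instead.
suggest_portage_opts 8
```

The load limits matter more than the job counts here: they keep make and
emerge from piling up jobs once the machine is already busy.
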
$ fgrep portage /etc/fstab
none /var/tmp/portage tmpfs noauto,x-systemd.automount,x-systemd.idle-timeout=60,size=32G,mode=770,uid=portage,gid=portage

Either have enough swap or lower the tmpfs size.

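A minimal sketch of that sanity check, assuming "enough" means the tmpfs
size does not exceed RAM plus swap. The function name tmpfs_fits is
hypothetical, and the bound is deliberately rough, since running
compilers need memory on top of what the tmpfs holds:

```shell
# Hypothetical check: does a tmpfs of the given size fit into RAM + swap?
# Arguments: tmpfs size in GiB, MemTotal in KiB, SwapTotal in KiB.
tmpfs_fits() {
    size_kib=$(( $1 * 1024 * 1024 ))
    [ "$size_kib" -le $(( $2 + $3 )) ]
}

# On Linux, both totals can be read from /proc/meminfo (values are in KiB).
if [ -r /proc/meminfo ]; then
    mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    swap_kib=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
    if tmpfs_fits 32 "$mem_kib" "$swap_kib"; then
        echo "size=32G fits into RAM + swap"
    else
        echo "size=32G exceeds RAM + swap: add swap or lower size="
    fi
fi
```
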
Using FEATURES buildpkg and binpkg-multi-instance allows reusing
packages on different but similar machines. EMERGE_DEFAULT_OPTS makes
use of this. /usr/portage/{distfiles,packages} is on shared media.

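For a second, similar machine, the relevant part of
/etc/portage/make.conf might look like the fragment below. This is only a
sketch: it assumes the shared media is mounted at the same
/usr/portage/{distfiles,packages} paths on that machine, and it keeps just
the binary-package settings from the configuration above:

```shell
# /etc/portage/make.conf fragment for a second machine (sketch).
# Assumption: the shared media is mounted at these paths on this host too.
DISTDIR="/usr/portage/distfiles"
PKGDIR="/usr/portage/packages"

# Build binary packages here as well, keeping multiple builds per version.
FEATURES="buildpkg binpkg-multi-instance"

# Prefer existing binary packages, but only when USE flags and deps match.
EMERGE_DEFAULT_OPTS="--usepkg --binpkg-respect-use=y --binpkg-changed-deps=y"
```
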
Also, I'm usually building world upgrades with --changed-deps to
rebuild dependent packages and update the bin packages that way.

I'm not sure, though, whether running emerge in parallel on two machines
would pick up newly appearing binpkgs during the process... I guess not.
I usually don't do that unless the dep trees of both machines look
independent.

If your machine cannot saturate the CPU throughout the whole emerge
process (as long as there are parallel ebuilds running), then distcc
will clearly not help you: it makes the complete process slower due to
waiting on remote resources, and even increases the load. Only very few
huge projects, with Makefile deps very clearly optimized or specially
crafted for distributed builds, can benefit from distcc. Most projects
aren't of this type; even Chromium and LibreOffice aren't. Precisely
those projects have way too much metadata to transport between the
distcc peers.

But YMMV. I'd say, try a different path first.


--
Regards,
Kai

Replies to list-only preferred.
