Gentoo Archives: gentoo-portage-dev

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-portage-dev@l.g.o
Subject: [gentoo-portage-dev] Re: portage-2.2-rc3 parallel merges quit being parallel
Date: Fri, 09 Apr 2010 15:36:29
Message-Id: pan.2010.04.09.15.35.21@cox.net
In Reply to: Re: [gentoo-portage-dev] portage-2.2-rc3 parallel merges quit being parallel by Zac Medico
1 Followup to a rather old post...
2
3 Zac Medico posted on Sat, 26 Jul 2008 17:00:18 -0700 as excerpted:
4
5 > Duncan wrote:
6 >> For the first 100 or so packages, it worked quite well. However, about
7 >> there, maybe package 120 or so, so about 20% of the way thru, it
8 >> reverted to doing them one-at-a-time again.
9
10 FWIW, that has long since been fixed. =:^)
11
12 >> Finally... I was rather confused the first time at just one job an
13 >> install took a bit, as that's apparently not counted as "running", so
14 >> it appeared nothing was going on for a bit. Maybe an "installing"
15 >> count as well would be useful... and prevent that confusion.
16 >
17 > There used to be a "merges" count in the status display but somebody
18 > thought it was confusing (darkside/Jeremy Olexa) and I decided that it
19 > wasn't interesting enough to be worthy of its display space so I
20 > removed it. I guess we can add it back if there's space and demand for
21 > it. Maybe it should only be shown when the job count drops to zero?
22
23 From my point of view, the current display (as of portage-2.2_rc67) is
24 still lacking/confusing in this regard. What I see happening here is that
25 a number of packages will be building, then finish building one at a time,
26 but the completed count doesn't go up. They seem to sit there built but
27 not installing for quite a while, then all install at once. But it's hard
28 to tell as there's not an "installing" count.
29
30 Now part of this may be due to the way I have jobs set up.
31
32 MAKEOPTS="-j13 -l10"
33
34 My emerge jobs, OTOH, are
35
36 "--jobs=4 --load-average=7"
37
38 (BTW, it would sure be nice if there was a make.conf variable for those,
39 similar to MAKEOPTS, and then a simple command-line option to toggle it on
40 and off. I don't put them in my default options as when I'm
41 troubleshooting a broken merge or am otherwise just merging a single
42 package, it's nice to be able to run non-parallel jobs and thus have the
43 output "live", but when I'm doing full updates, I want the parallel jobs.
44 The logical way to handle this would be to set the --jobs and
45 --load-average parameters in a var similar to MAKEOPTS, and then use a
46 command-line option to toggle parallel jobs mode on or off.)
47
48 Meanwhile...
49
50 The rationale behind having portage jobs lower than makeopts is that since
51 I have PORTAGE_TMPDIR pointed at a tmpfs, I want to favor currently
52 running package emerges over starting new ones, because every new package
53 started unpacks into that tmpfs, thereby using more memory. Thus, if it's
54 possible to run more parallel jobs on already started package merges, I
55 want it to do that instead of starting more packages. The way to do that
56 is to keep the number of --jobs and --load-average lower than the
57 corresponding MAKEOPTS parameters, so MAKEOPTS gets used first if possible
58 and only if there are no more parallel jobs possible at that level does the
59 load average drop down far enough for another package to start merging
60 beside the currently merging ones.
61
62 But for whatever reason, portage seems to sit there doing nothing with the
63 already built and ready to install packages, preferring to start more
64 package builds rather than finish off the installs of the ones it already
65 has built. So I can end up with 7 or 8 packages sitting there built, but
66 apparently not installing, then it goes and installs them all at once!
67 That's contrary to my strategy of trying to favor already started
68 packages, and only starting new ones when the existing running ones can't
69 keep the load average up high enough.
70
71 But as I said it's hard to track that, since portage doesn't track the
72 current number of installs in progress, only the number built. I can however infer
73 from the difference between the number of the last started job, the number
74 of completed jobs, and the number of running jobs, plus the per-package
75 installing notices, that a whole stack of packages are accumulating that
76 are already built, but are apparently not installing yet. Why isn't
77 portage going ahead and installing them, instead of starting new package
78 builds?
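
As a rough sketch of that inference: the function name and the sample numbers below are purely illustrative, not anything portage provides. It assumes the number of the last started job equals the total number of jobs started so far, so whatever is neither complete nor running must be built and waiting.

```shell
# Hypothetical helper (not part of portage): estimate how many packages are
# built but not yet installed, from the three numbers visible in the display.
# Arguments: number of the last started job, completed count, running count.
pending_installs() {
    local started=$1 completed=$2 running=$3
    echo $(( started - completed - running ))
}

# Made-up example: job 20 has started, 9 are complete, 4 are still running.
pending_installs 20 9 4
```

With those made-up numbers the result is 7, which is the sort of backlog I'm describing.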
79
80 So the two issues (plus the one in parentheses above) I see are:
81
82 1) the parallel jobs display needs to say how many it's installing.
83
84 2) portage needs to follow thru on built packages and finish installing
85 them as a higher job priority than starting new package builds.
86 Particularly with --jobs=4, there's no reason portage should be starting a
87 new package build, when there's a whole stack of packages (often more than
88 the four the --jobs=4 implies should be the max) apparently sitting there
89 built and ready to install, but not actively installing, and apparently
90 not doing anything except sitting there taking up tmpfs memory!
91
92 3) Having an EJOBS or similar variable, parallel to the MAKEOPTS variable,
93 and then a simple parallel-toggle emerge command-line option, would be
94 quite useful. =:^)
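
To illustrate what I mean in 3), here is a minimal shell sketch of the behavior. EJOBS is only my suggested name (portage does not read it), and the -P toggle and the emerge_cmdline function are likewise made up for the example. The function only prints the command line it would run, so it is safe to try.

```shell
# Hypothetical variable: portage does not read EJOBS; this wrapper expands it.
EJOBS="--jobs=4 --load-average=7"

# Print the emerge command line. A leading "-P" (a made-up toggle) switches
# the parallel options on; without it, emerge runs non-parallel as usual.
emerge_cmdline() {
    if [ "$1" = "-P" ]; then
        shift
        # EJOBS is deliberately unquoted so it splits into separate options.
        echo emerge $EJOBS "$@"
    else
        echo emerge "$@"
    fi
}

emerge_cmdline -P --update world
emerge_cmdline --oneshot portage
```

Dropping the echo would turn this into a real wrapper, but the point is just the shape of the interface: the settings live in one variable, and one short switch turns them on.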
95
96 Let me know if you'd prefer that I file bugs on these, and as always,
97 thanks, Zac, for being so responsive. You have a gift that's a real
98 rarity in user/dev relations, and some of us really do appreciate it! =:^)
99
100 --
101 Duncan - List replies preferred. No HTML msgs.
102 "Every nonfree program has a lord, a master --
103 and if you use the program, he is your master." Richard Stallman