Gentoo Archives: gentoo-dev

From: "Gregory M. Turner" <gmt@×××××.us>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] EJOBS variable for EAPI 5?
Date: Wed, 12 Sep 2012 18:54:18
Message-Id: 5050D9F9.8070002@malth.us
In Reply to: Re: [gentoo-dev] EJOBS variable for EAPI 5? by Ian Stakenvicius
On 9/12/2012 5:58 AM, Ian Stakenvicius wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> On 12/09/12 05:55 AM, Gregory M. Turner wrote:
>>
>> Note that, effectively, we have this already, and it's called
>> "portage". But one could certainly make a case for modularizing it
>> better, since, in truth, we are talking about a very common, very
>> abstract problem here which portage shares with any number of
>> batch-build systems.
>>
>> Such an engine could very well do exactly the right thing if it
>> were faced with a constraint that a certain part of a certain build
>> needed to proceed without parallelism due to limitations coming
>> from the build.
>>
>> Also, there are very large parts of most builds -- configure comes
>> to mind -- that don't parallelize even if, perhaps, they should.
>> In such cases, a really smart global parallelism arbiter could
>> easily respond by spawning more jobs from other builds.
>>
>
> So essentially what you're saying here is that it might be worthwhile
> to look into parallelism as a whole and possibly come up with a
> solution that combines 'emerge --jobs' and build-system parallelism
> together to maximum benefit?

Yeah, couldn't have said it better myself ... apparently :)

> Advanced HPC systems (sys-cluster/torque along with an appropriate
> scheduler, for instance) can do such things with their jobs when the
> jobs are properly built; I could see portage being able to handle this
> as well given most of what is necessary is already known (ebuild
> phases, build system type (via eclass), etc). However, given the
> limitations already put on parallelism in terms of emerge order, etc,
> I could see this solution needing to be -very- complex and integration
> needing to occur on multiple levels. We'd also need to consider
> distcc (and other cluster-shared compilation methods if there are
> any??).. It would be an interesting project, though.

ACK all of the above.

Tempting to think more deeply about this but probably the last thing I
need to do right now is to talk myself into another speculative project.

I've hurt my wrist a bit -- probably an RSI -- so that should help
deter me :S

Only a few major sources of parallelism exist in portage: --jobs /
--load-average in emerge opts, the multiprocessing eclass & equiv. ebuild
helper, distcc, and make... Infrastructure is already in place for all
of those, so perhaps a good holistic solution exists that isn't /too/
complicated.
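
For reference, two of those knobs are set in make.conf today roughly as in the fragment below. EMERGE_DEFAULT_OPTS and MAKEOPTS are existing portage variables; the values are arbitrary and only assume something like a four-core box.

```shell
# /etc/portage/make.conf (illustrative values only)

# Package-level parallelism: emerge builds up to 4 packages at once and
# holds back new ones while the load average is above 4.
EMERGE_DEFAULT_OPTS="--jobs=4 --load-average=4"

# Build-level parallelism: each make invocation runs up to 4 jobs and
# also backs off when the load average is above 4.
MAKEOPTS="-j4 -l4"
```

With both set, the worst case is 4 packages x 4 make jobs = 16 compiler processes, which is why the two levels are worth thinking about together.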

...OK another f!#!%$^ brainstorm incoming :)

For "JOBS" syntax... what really seems missing in portage are:

  o a clean way to say "don't parallelize this particular make
    invocation" in ebuilds

  o a clean way to globally say "try to use this parallelization
    strategy when emerging."

So what about something like:

  o EMERGE_JOBS and EMERGE_LOAD_AVERAGE make.conf vars equiv. to
    --jobs and --load-average emerge options

  o EBUILD_JOBS and EBUILD_LOAD_AVERAGE make.conf vars

  o If the latter are not specified, they are copied respectively from
    the former (debatable for *_JOBS, since with EMERGE_JOBS=4 we could
    now get 4 x 4 = 16 processes when we asked for four).

  o MAKEOPTS is auto-extended to reflect EBUILD_JOBS/EBUILD_LOAD_AVERAGE
    if & only if -j|--jobs|-l|--load-average options aren't provided in
    make.conf/profile/envvar MAKEOPTS

  o however, if MAKEOPTS "overrides" EBUILD_JOBS or EBUILD_LOAD_AVERAGE,
    issue a conspicuous yellow-stars warning

  o extend "emake" to accept a "--non-parallel" option which will
    strip all -j|--jobs|-l|--load-average options from MAKEOPTS;
    perhaps support an equivalent EBUILD_NON_PARALLEL envvar as well,
    with support for override in profile.bashrc. Don't warn about this
    overriding EBUILD_JOBS -- treat as SOP.

  o debatable: respect EBUILD_NON_PARALLEL in multiprocessing, etc?
    or, perhaps, something like:

    EMAKE_NON_PARALLEL=${EMAKE_NON_PARALLEL:-${EBUILD_NON_PARALLEL:-no}}

    could be used to distinguish between "don't use any parallelism"
    and "don't use GNU make's parallelism in emake". Also maybe a
    better name exists that doesn't use double-negatives.

?
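
A rough sketch of how the defaulting, the MAKEOPTS auto-extension and the "--non-parallel" stripping above might fit together. Everything here is hypothetical: the helper names (has_parallel_opts, effective_makeopts, strip_parallel_opts) are made up for illustration, none of this exists in portage, and it only recognizes the four option spellings listed above (no combined short options like "-kj4").

```shell
# Hypothetical helpers; nothing here exists in portage today.

# Succeed if the option string in $1 already carries one of
# -j / --jobs / -l / --load-average.
has_parallel_opts() {
    local opt
    for opt in $1; do
        case $opt in
            -j*|--jobs|--jobs=*|-l*|--load-average|--load-average=*) return 0 ;;
        esac
    done
    return 1
}

# Print the MAKEOPTS that emake would end up using.
# EBUILD_* fall back to EMERGE_* when unset, as proposed above.
effective_makeopts() {
    local jobs load opts
    jobs=${EBUILD_JOBS:-${EMERGE_JOBS:-}}
    load=${EBUILD_LOAD_AVERAGE:-${EMERGE_LOAD_AVERAGE:-}}
    opts=${MAKEOPTS:-}
    if has_parallel_opts "$opts"; then
        # Explicit MAKEOPTS wins; warn if it shadows an explicit EBUILD_* value.
        if [ -n "${EBUILD_JOBS:-}${EBUILD_LOAD_AVERAGE:-}" ]; then
            echo " * MAKEOPTS overrides EBUILD_JOBS/EBUILD_LOAD_AVERAGE" >&2
        fi
    else
        if [ -n "$jobs" ]; then opts="${opts:+$opts }-j$jobs"; fi
        if [ -n "$load" ]; then opts="${opts:+$opts }-l$load"; fi
    fi
    printf '%s\n' "$opts"
}

# Print the option string in $1 with every -j/--jobs/-l/--load-average
# option removed, which is what "emake --non-parallel" would want.
strip_parallel_opts() {
    local opt out skip
    out=""; skip=0
    for opt in $1; do
        if [ "$skip" -eq 1 ]; then
            skip=0
            # A bare number right after -j/-l is that option's argument.
            case $opt in
                *[!0-9.]*) ;;       # not a number: treat as a normal option
                *) continue ;;      # number: drop it along with the option
            esac
        fi
        case $opt in
            -j|-l|--jobs|--load-average) skip=1 ;;
            -j*|-l*|--jobs=*|--load-average=*) ;;
            *) out="${out:+$out }$opt" ;;
        esac
    done
    printf '%s\n' "$out"
}
```

With EMERGE_JOBS=4, EMERGE_LOAD_AVERAGE=3.5 and an empty MAKEOPTS, effective_makeopts prints "-j4 -l3.5"; strip_parallel_opts "-j 4 -k --load-average=2 V=1" prints "-k V=1". The warning goes to stderr so the computed options can still be captured cleanly.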

Seems to me something vaguely like the above would provide

  o backward compatibility for ebuilds and make.conf

  o an interface not so vastly different from what we have

  o a decent way to specify what "we really want" globally;
    insofar as portage doesn't do the best job effecting the requested
    parallelization strategy, more ambitious tactics could be
    implemented later, hopefully without huge interface revisions.

-gmt

P.S.:

(Kind-of-crazy additional idea: put ceil(sqrt(EMERGE_JOBS)) into
EBUILD_JOBS when only the former is specified, and then let
effective_emerge_jobs equal floor(EMERGE_JOBS/EBUILD_JOBS).... but maybe
too much automagic for this to be a good idea.)
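
The arithmetic in the P.S. can be tried out with a few lines of shell. This is only a sketch: ceil_sqrt and split_jobs are made-up names, and it assumes EMERGE_JOBS is an integer of at least 1.

```shell
# Hypothetical helpers for the sqrt-split idea; not part of portage.

# ceil(sqrt(n)) using integer arithmetic only: the smallest r with r*r >= n.
ceil_sqrt() {
    local n r
    n=$1; r=0
    while [ $((r * r)) -lt "$n" ]; do
        r=$((r + 1))
    done
    echo "$r"
}

# Print "<effective emerge jobs> <ebuild jobs>" for a requested EMERGE_JOBS.
# Assumes $1 >= 1 (0 would divide by zero).
split_jobs() {
    local total ebuild_jobs
    total=$1
    ebuild_jobs=$(ceil_sqrt "$total")
    # Shell integer division already rounds down, i.e. floor().
    echo "$((total / ebuild_jobs)) $ebuild_jobs"
}
```

For example, split_jobs 16 prints "4 4" (4 packages x 4 make jobs), while split_jobs 8 prints "2 3", so a request for 8 ends up as at most 6 processes because of the rounding.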