Gentoo Archives: gentoo-dev

From: Mike Frysinger <vapier@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash
Date: Sun, 03 Jun 2012 05:06:43
Message-Id: 201206030105.50520.vapier@gentoo.org
In Reply to: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash by Brian Harring
On Saturday 02 June 2012 19:59:02 Brian Harring wrote:
> On Fri, Jun 01, 2012 at 06:41:22PM -0400, Mike Frysinger wrote:
> > # @FUNCTION: multijob_post_fork
> > # @DESCRIPTION:
> > # You must call this in the parent process after forking a child process.
> > # If the parallel limit has been hit, it will wait for one to finish and
> > # return the child's exit status.
> > multijob_post_fork() {
> > 	[[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
> >
> > 	: $(( ++mj_num_jobs ))
> > 	if [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; then
> > 		multijob_finish_one
> > 	fi
> > 	return $?
> > }
>
> Minor note: the design of this (fork, then check) means that when a job
> finishes, we won't be ready with more work. Given a fast job
> identification step (main thread) and a slower job execution (what's
> backgrounded), this implicitly means we'll never exceed #cores of
> parallelism, but we won't reach that level either (potentially leaving
> some idle cycles on the floor).
>
> Realistically, the main thread (what invokes post_fork) is *likely*
> (if the consumer is at all sane) to be doing minor work, mostly just
> poking about figuring out what the next task/arguments are to submit to
> the pool. That work isn't likely to be a full core's worth of work;
> else, as I said, the consumer is doing something wrong.
>
> The original form of this was designed around the assumption that the
> main thread was light and the backgrounded jobs weren't, so it
> basically did the equivalent of make -j<cores>+1: it allowed #cores
> background jobs to run while the main thread continued on and got the
> next job ready. Once that was ready, it would block waiting for a slot
> to open, then submit the job immediately after doing a reclaim.

the original code i designed this around had a heavier main thread because it
had a series of parallel sections followed by serial followed by parallel, where
the serial regions didn't depend on the parallel finishing right away. that,
and doing things post, meant it was easier to pass up return values because i
didn't have to save $? anywhere ;).

thinking a bit more, i don't think the two methods are mutually exclusive.
it's easy to have the code support both, but i'm not sure the extended
documentation helps.
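
as a rough sketch of what supporting both could look like (this is not the
eclass code: `_multijob_fork` and `multijob_pre_fork` are hypothetical names
used only for illustration, and `multijob_finish_one` here is a stand-in built
on `wait -n` from bash 4.3+ rather than the real bookkeeping):

```bash
# Illustrative stand-ins; values and helper names are assumptions.
mj_max_jobs=2
mj_num_jobs=0

# Stand-in: reap one finished child and pass its exit status up.
multijob_finish_one() {
	local ret=0
	wait -n || ret=$?
	: $(( --mj_num_jobs ))
	return ${ret}
}

# Hypothetical shared helper. "pre" checks the limit before counting the
# new job; "post" counts the new job first and then checks the limit.
_multijob_fork() {
	local mode=$1 ret=0
	[[ ${mode} == "post" ]] && : $(( ++mj_num_jobs ))
	if [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; then
		multijob_finish_one
		ret=$?
	fi
	[[ ${mode} == "pre" ]] && : $(( ++mj_num_jobs ))
	return ${ret}
}

# Call in the parent before forking a child.
multijob_pre_fork() { _multijob_fork pre ; }

# Call in the parent after forking a child.
multijob_post_fork() { _multijob_fork post ; }

# Usage (pre):  multijob_pre_fork ; ( do_work ) &
# Usage (post): ( do_work ) & multijob_post_fork
```

with pre, the parent blocks before forking, so up to mj_max_jobs children keep
running while it works out the next job. with post, it blocks right after
forking, and a reaped child's status comes straight back through $?.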
-mike
