Gentoo Archives: gentoo-dev

From: Brian Harring <ferringb@×××××.com>
To: Mike Frysinger <vapier@g.o>
Cc: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash
Date: Sat, 02 Jun 2012 23:59:45
Message-Id: 20120602235902.GC9296@localhost
In Reply to: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash by Mike Frysinger
1 On Fri, Jun 01, 2012 at 06:41:22PM -0400, Mike Frysinger wrote:
2 > # @FUNCTION: multijob_post_fork
3 > # @DESCRIPTION:
4 > # You must call this in the parent process after forking a child process.
5 > # If the parallel limit has been hit, it will wait for one to finish and
6 > # return the child's exit status.
7 > multijob_post_fork() {
8 > [[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
9 >
10 > : $(( ++mj_num_jobs ))
11 > if [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; then
12 > multijob_finish_one
13 > fi
14 > return $?
15 > }
16
17 Minor note: the design of this (fork, then check) means that when a job
18 finishes, we won't have more work ready to submit. Given a fast job
19 identification step (main thread) and a slower job execution (what's
20 backgrounded), this implicitly means we'll never exceed #cores of
21 parallelism, but we won't reach that level either (meaning
22 potentially some idle cycles left on the floor).
23
24 Realistically, the main thread (what invokes post_fork) is *likely*
25 (assuming a sanely written consumer) to be doing minor work:
26 mostly just poking about figuring out what the next task/arguments
27 are to submit to the pool. That work isn't likely to be a full core's
28 worth of work; if it is, as I said, the consumer is doing it wrong.
29
30 The original form of this was designed around the assumption that the
31 main thread was light and the backgrounded jobs weren't. It thus
32 basically did the equivalent of make -j<cores>+1: #cores background
33 jobs ran while the main thread continued on and got the next job
34 ready. Once the next job was ready, the main thread would block
35 waiting for a slot to open, then immediately submit the job once it
36 had done a reclaim.
37
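A minimal sketch of that reclaim-then-submit ordering, for illustration only. The names pool_max, pool_running, pool_submit and pool_drain are hypothetical and are not part of multiprocessing.eclass, and the sketch assumes bash 4.3 or newer for `wait -n`, which the eclass itself may not be able to rely on. The point is only the ordering: the caller prepares the job first, and the slot check happens right before the fork rather than right after it.

```shell
#!/usr/bin/env bash
# Sketch only: pool_max, pool_running, pool_submit and pool_drain are
# hypothetical names, not the eclass API. Assumes bash >= 4.3 (`wait -n`).

pool_max=2        # number of background slots (think: #cores)
pool_running=0    # jobs we have forked and not yet reclaimed

# Fork "$@" in the background. If every slot is taken, first block
# until one child exits (the reclaim), then fork immediately.
pool_submit() {
	if (( pool_running >= pool_max )) ; then
		wait -n              # returns when any one child has exited
		(( pool_running-- ))
	fi
	"$@" &
	(( ++pool_running ))
}

# Wait for everything still outstanding.
pool_drain() {
	wait
	pool_running=0
}

# Usage: the loop body stands in for the light main-thread work.
for n in 1 2 3 4 5 ; do
	job=( sleep 0.1 )            # figure out the next job first...
	pool_submit "${job[@]}"      # ...then block for a slot and fire it
done
pool_drain
```

With this ordering, up to pool_max jobs keep running while the loop works out the next one, which is the -j<cores>+1 behaviour described above. The cost is that the caller no longer makes a separate call after forking, so the calling convention differs from multijob_post_fork.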
38 On the surface of it, it's a minor difference, but having the next
39 job immediately ready to fire makes it easier to saturate cores.
40
41 Unfortunately, that also changes your API a bit; your call.
42
43 ~harring
