Gentoo Archives: gentoo-dev

From: Zac Medico <zmedico@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash
Date: Sun, 03 Jun 2012 06:54:12
Message-Id: 4FCB09D2.6040904@gentoo.org
In Reply to: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash by Mike Frysinger
On 06/02/2012 10:05 PM, Mike Frysinger wrote:
> On Saturday 02 June 2012 19:59:02 Brian Harring wrote:
>> On Fri, Jun 01, 2012 at 06:41:22PM -0400, Mike Frysinger wrote:
>>> # @FUNCTION: multijob_post_fork
>>> # @DESCRIPTION:
>>> # You must call this in the parent process after forking a child process.
>>> # If the parallel limit has been hit, it will wait for one to finish and
>>> # return the child's exit status.
>>> multijob_post_fork() {
>>>
>>> [[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
>>>
>>> : $(( ++mj_num_jobs ))
>>>
>>> if [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; then
>>>
>>> multijob_finish_one
>>>
>>> fi
>>> return $?
>>>
>>> }
>>
>> Minor note; the design of this (fork then check), means when a job
>> finishes, we'll not be ready with more work. This implicitly means
>> that given a fast job identification step (main thread), and a slower
>> job execution (what's backgrounded), we'll not breach #core of
>> parallelism, nor will we achieve that level either (meaning
>> potentially some idle cycles left on the floor).
>>
>> Realistically, the main thread (what invokes post_fork) is *likely*,
>> (if the consumer isn't fricking retarded) to be doing minor work-
>> mostly just poking about figuring out what the next task/arguments
>> are to submit to the pool. That work isn't likely to be a full core
>> worth of work, else as I said, the consumer is being a retard.
>>
>> The original form of this was designed around the assumption that the
>> main thread was light, and the backgrounded jobs weren't, thus it
>> basically did the equivalent of make -j<cores>+1, allowing #cores
>> background jobs running, while allowing the main thread to continue on
>> and get the next job ready, once it had that ready, it would block
>> waiting for a slot to open, then immediately submit the job once it
>> had done a reclaim.
>
> the original code i designed this around had a heavier main thread because it
> had series of parallel sections followed by serial followed by parallel where
> the serial regions didn't depend on the parallel finishing right away. that
> and doing things post meant it was easier to pass up return values because i
> didn't have to save $? anywhere ;).
>
> thinking a bit more, i don't think the two methods are mutually exclusive.
> it's easy to have the code support both, but i'm not sure the extended
> documentation helps.

Can't you just add a multijob_pre_fork function and do your waiting in
there instead of in the multijob_post_fork function?
--
Thanks,
Zac
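
For illustration, the pre-fork idea suggested above might look roughly like the following. This is only a sketch under stated assumptions, not the eclass implementation: multijob_pre_fork and multijob_finish are hypothetical here, the limit of 2 jobs is arbitrary, and multijob_finish_one is stood in for by bash's `wait -n` (bash 4.3 or newer) rather than whatever job tracking the eclass really uses. The mj_num_jobs and mj_max_jobs names are taken from the quoted code.

```shell
#!/bin/bash
# Sketch only. Assumptions: bash >= 4.3 (for wait -n); multijob_pre_fork
# and multijob_finish are hypothetical helpers for this example.

mj_max_jobs=2   # arbitrary limit for the example
mj_num_jobs=0

# Reap one child and return its exit status (stand-in implementation).
multijob_finish_one() {
	local ret=0
	wait -n || ret=$?
	: $(( --mj_num_jobs ))
	return ${ret}
}

# Call in the parent BEFORE forking: block until a slot is free.
multijob_pre_fork() {
	local ret=0
	if [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; then
		multijob_finish_one || ret=$?
	fi
	return ${ret}
}

# Call in the parent AFTER forking: only account for the new child.
multijob_post_fork() {
	: $(( ++mj_num_jobs ))
}

# Wait for all remaining children; return the last non-zero status seen.
multijob_finish() {
	local ret=0
	while [[ ${mj_num_jobs} -gt 0 ]] ; do
		multijob_finish_one || ret=$?
	done
	return ${ret}
}

# Usage: the parent prepares the next job, waits for a slot, then forks.
for i in 1 2 3 4 ; do
	multijob_pre_fork
	( sleep 0.1 ) &
	multijob_post_fork
done
multijob_finish
```

With the wait moved in front of the fork, the parent can have the next job ready while all slots are busy and submit it as soon as one is reclaimed. The trade-off mentioned earlier in the thread still applies: the status returned by multijob_pre_fork belongs to an earlier child, so the caller has to save $? itself if it cares about it.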