List Archive: gentoo-dev
Headers:
To: gentoo-dev@g.o
From: Mike Frysinger <vapier@g.o>
Subject: Re: multiprocessing.eclass: doing parallel work in bash
Date: Sun, 3 Jun 2012 01:05:49 -0400
On Saturday 02 June 2012 19:59:02 Brian Harring wrote:
> On Fri, Jun 01, 2012 at 06:41:22PM -0400, Mike Frysinger wrote:
> > # @FUNCTION: multijob_post_fork
> > # @DESCRIPTION:
> > # You must call this in the parent process after forking a child process.
> > # If the parallel limit has been hit, it will wait for one to finish and
> > # return the child's exit status.
> > multijob_post_fork() {
> > 	[[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
> > 
> > 	: $(( ++mj_num_jobs ))
> > 
> > 	if [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; then
> > 		multijob_finish_one
> > 	fi
> > 	return $?
> > }
> 
> Minor note; the design of this (fork then check), means when a job
> finishes, we'll not be ready with more work.  This implicitly means
> that given a fast job identification step (main thread), and a slower
> job execution (what's backgrounded), we'll not breach #core of
> parallelism, nor will we achieve that level either (meaning
> potentially some idle cycles left on the floor).
> 
> Realistically, the main thread (what invokes post_fork) is *likely*,
> (if the consumer isn't fricking retarded) to be doing minor work-
> mostly just poking about figuring out what the next task/arguments
> are to submit to the pool.  That work isn't likely to be a full core
> worth of work, else as I said, the consumer is being a retard.
> 
> The original form of this was designed around the assumption that the
> main thread was light, and the backgrounded jobs weren't, thus it
> basically did the equivalent of make -j<cores>+1, allowing #cores
> background jobs running, while allowing the main thread to continue on
> and get the next job ready, once it had that ready, it would block
> waiting for a slot to open, then immediately submit the job once it
> had done a reclaim.
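
[A minimal sketch of the ordering described in the quoted text: get the next
job ready, block until a slot is free, then submit at once. It is not the
eclass code. The pool_* names, the FIFO used to signal completion, and the
use of fd 3 are all assumptions made for illustration. Opening a FIFO
read-write, as done here, works on Linux but is not guaranteed by POSIX.]

```bash
# Hypothetical sketch; the pool_* names are illustrative, not eclass API.
pool_dir=$(mktemp -d) || exit 1
mkfifo "${pool_dir}/done"
exec 3<>"${pool_dir}/done"   # read-write open so the open itself never blocks
rm -rf "${pool_dir}"         # fd 3 keeps the pipe alive after the unlink

pool_max=2       # stand-in for the number of cores
pool_running=0   # jobs started and not yet reclaimed
pool_peak=0      # highest value pool_running reached, for the demo only

# Block until a slot is free, then start "$@" in the background.
pool_submit() {
	while [ "${pool_running}" -ge "${pool_max}" ] ; do
		read -r pool_status <&3   # each finished job writes one line
		pool_running=$(( pool_running - 1 ))
	done
	( "$@" ; echo $? >&3 ) &
	pool_running=$(( pool_running + 1 ))
	if [ "${pool_running}" -gt "${pool_peak}" ] ; then
		pool_peak=${pool_running}
	fi
}

for i in 1 2 3 4 5 ; do
	# Working out job $i would happen here, while up to pool_max jobs run.
	pool_submit sleep 0.1
done

# Reclaim whatever is still running.
while [ "${pool_running}" -gt 0 ] ; do
	read -r pool_status <&3
	pool_running=$(( pool_running - 1 ))
done
wait
exec 3>&-
echo "peak concurrency: ${pool_peak}"
```

[Because the reclaim happens just before the fork rather than just after it,
the parent is free to prepare the next job while pool_max jobs are still
running in the background.]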

the original code i designed this around had a heavier main thread because it
had a series of parallel sections followed by serial followed by parallel where
the serial regions didn't depend on the parallel finishing right away.  that 
and doing things post meant it was easier to pass up return values because i 
didn't have to save $? anywhere ;).

thinking a bit more, i don't think the two methods are mutually exclusive.  
it's easy to have the code support both, but i'm not sure the extended 
documentation helps.
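
[One possible way to support both methods, sketched under assumptions: a
pre-fork hook that reclaims a slot when the pool is full, and a post-fork
hook that counts the new child and reclaims only when over the limit. The
demo_* names, the status FIFO on fd 4, and the -gt comparison in the
post-fork hook are illustrative choices, not the eclass implementation.
Both hooks return the reclaimed child's exit status, so the caller does not
have to save $? anywhere.]

```bash
# Hypothetical sketch; the demo_* names are illustrative, not eclass API.
demo_dir=$(mktemp -d) || exit 1
mkfifo "${demo_dir}/status"
exec 4<>"${demo_dir}/status"   # read-write open; works on Linux
rm -rf "${demo_dir}"

demo_max_jobs=2
demo_num_jobs=0

# Wait for one child to report and return that child's exit status.
demo_finish_one() {
	read -r demo_ret <&4
	demo_num_jobs=$(( demo_num_jobs - 1 ))
	return "${demo_ret}"
}

# Call before forking: reclaim a slot first if the pool is full.
demo_pre_fork() {
	if [ "${demo_num_jobs}" -ge "${demo_max_jobs}" ] ; then
		demo_finish_one
		return $?
	fi
	return 0
}

# Call after forking: count the new child, reclaim only if over the limit.
# When demo_pre_fork ran first, the limit is never exceeded here.
demo_post_fork() {
	demo_num_jobs=$(( demo_num_jobs + 1 ))
	if [ "${demo_num_jobs}" -gt "${demo_max_jobs}" ] ; then
		demo_finish_one
		return $?
	fi
	return 0
}

# Stand-in job: runs briefly, then exits with the requested status.
demo_job() { sleep 0.1 ; return "$1" ; }

failures=0
for code in 0 3 0 ; do
	demo_pre_fork || failures=$(( failures + 1 ))
	( demo_job "${code}" ; echo $? >&4 ) &
	demo_post_fork || failures=$(( failures + 1 ))
done

# Reclaim the jobs still running and collect their statuses too.
while [ "${demo_num_jobs}" -gt 0 ] ; do
	demo_finish_one || failures=$(( failures + 1 ))
done
wait
exec 4>&-
echo "failed jobs: ${failures}"
```

[A caller that only wants the post-fork style can skip demo_pre_fork; with
the -gt comparison that leaves demo_max_jobs children running while the
parent moves on to the next piece of work.]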
-mike
Replies:
Re: multiprocessing.eclass: doing parallel work in bash
-- Zac Medico
References:
Re: multiprocessing.eclass: doing parallel work in bash
-- Brian Harring
Updated Jun 29, 2012