Gentoo Archives: gentoo-dev

From: Mike Frysinger <vapier@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash
Date: Sat, 02 Jun 2012 04:58:52
Message-Id: 201206020057.40625.vapier@gentoo.org
In Reply to: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash by Brian Harring
On Saturday 02 June 2012 00:11:19 Brian Harring wrote:
> On Fri, Jun 01, 2012 at 06:41:22PM -0400, Mike Frysinger wrote:
> > and put it into a new multiprocessing.eclass. this way people can
> > generically utilize this in their own eclasses/ebuilds.
> >
> > it doesn't currently support nesting. not sure if i should fix that.
> >
> > i'll follow up with an example of parallelizing of eautoreconf. for
> > mail-filter/maildrop on my 4 core system, it cuts the time needed to run
> > from ~2.5 min to ~1 min.
>
> My main concern here is cleanup during uncontrolled shutdown; if the
> backgrounded job has hung itself for some reason, the job *will* just
> sit; I'm not aware of any of the PMs doing process tree killing, or
> cgroups containment; in my copious free time I'm planning on adding a
> 'cjobs' tool for others, and adding cgroups awareness into pkgcore;
> that said, none of 'em do this *now*, thus my concern.

i'm not sure there's much i can do here beyond adding traps

> > makeopts_jobs() {
>
> This function belongs in eutils, or somewhere similar- pretty sure
> we've got variants of this in multiple spots. I'd prefer a single
> point to change if/when we add a way to pass parallelism down into the
> env via EAPI.

it's already in eutils. but i'm moving it out of that and into this since it
makes more sense in this eclass imo, and keeps this eclass from having to
inherit eutils.
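
For readers who have not seen the function, the idea behind makeopts_jobs is to pull the -j value out of MAKEOPTS. The following is a minimal pure-bash sketch of that idea, not the eclass code: guess_jobs is a hypothetical name, it picks the first -j it finds, and it falls back to 1 when no numeric job count is present (including a bare -j).

```bash
#!/usr/bin/env bash
# Sketch only: guess_jobs is a made-up name, not the eclass function.

guess_jobs() {
	# With no arguments, fall back to the MAKEOPTS environment variable.
	[[ $# -eq 0 ]] && set -- ${MAKEOPTS}
	local opts=" $* " jobs=1
	# Accept -jN, -j N, --jobs=N and --jobs N; the padding spaces in
	# ${opts} let the pattern anchor on word boundaries.
	local re='[[:space:]](-j|--jobs[=[:space:]])[[:space:]]*([0-9]+)[[:space:]]'
	if [[ ${opts} =~ ${re} ]]; then
		jobs=${BASH_REMATCH[2]}
	fi
	echo "${jobs}"
}

guess_jobs -O2 -j4
MAKEOPTS="--jobs=3" guess_jobs
guess_jobs -l5
```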

> > multijob_child_init() {
> > 	[[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
> > 	trap 'echo ${BASHPID} $? >&'${mj_control_fd} EXIT
> > 	trap 'exit 1' INT TERM
> > }
>
> Kind of dislike this form since it means consuming code has to be
> aware of, and do the () & trick.
>
> A helper function, something like
> multijob_child_job() {
> 	(
> 	multijob_child_init
> 	"$@"
> 	) &
> 	multijob_post_fork || die "game over man, game over"
> }
>
> Doing so would convert your eautoreconf from:
> for x in $(autotools_check_macro_val AC_CONFIG_SUBDIRS) ; do
> 	if [[ -d ${x} ]] ; then
> 		pushd "${x}" >/dev/null
> 		(
> 		multijob_child_init
> 		AT_NOELIBTOOLIZE="yes" eautoreconf
> 		) &
> 		multijob_post_fork || die
> 		popd >/dev/null
> 	fi
> done
>
> To:
> for x in $(autotools_check_macro_val AC_CONFIG_SUBDIRS) ; do
> 	if [[ -d ${x} ]]; then
> 		pushd "${x}" > /dev/null
> 		AT_NOELIBTOOLIZE="yes" multijob_child_job eautoreconf
> 		popd
> 	fi
> done

it depends on the form of the code. i can see both being useful. should be
easy to support both though:
multijob_child_init() {
	if [[ $# -eq 0 ]] ; then
		trap 'echo ${BASHPID} $? >&'${mj_control_fd} EXIT
		trap 'exit 1' INT TERM
	else
		(
		multijob_child_init
		"$@"
		) &
		multijob_post_fork || die
	fi
}
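
To make the two calling forms concrete outside of an ebuild, here is a self-contained sketch. The mj_* functions are simplified stand-ins written for this example (no job throttling, no die), not the eclass implementation; only the trap lines and the shape of the child-init function are taken from the code above. It assumes bash 4.1+ for the {var}<> redirection.

```bash
#!/usr/bin/env bash
# Sketch only: mj_* are hypothetical, simplified stand-ins.

mj_init() {
	local dir
	dir=$(mktemp -d) || return 1
	mkfifo "${dir}/ctl" || return 1
	# Open read/write so neither side blocks, then drop the on-disk name.
	exec {mj_control_fd}<>"${dir}/ctl"
	rm -rf "${dir}"
	mj_num_jobs=0
}

# Stand-in for multijob_post_fork: just count the job (no throttling).
mj_post_fork() {
	(( ++mj_num_jobs ))
}

mj_child_init() {
	if [[ $# -eq 0 ]] ; then
		# Already inside "( ... ) &": report pid and status on exit.
		trap 'echo ${BASHPID} $? >&'${mj_control_fd} EXIT
		trap 'exit 1' INT TERM
	else
		# A command was passed: fork on the caller's behalf.
		(
		mj_child_init
		"$@"
		) &
		mj_post_fork
	fi
}

# Collect one "pid status" record per job; return the last non-zero status.
mj_finish() {
	local pid status ret=0
	while (( mj_num_jobs > 0 )) ; do
		read -r -u "${mj_control_fd}" pid status || return 1
		(( mj_num_jobs-- ))
		[[ ${status} -eq 0 ]] || ret=${status}
	done
	wait
	return ${ret}
}

mj_init || exit 1

# Form 1: caller does the subshell and the bookkeeping itself.
(
mj_child_init
sleep 1
) &
mj_post_fork

# Form 2: pass the command and let the helper fork.
mj_child_init true

mj_finish
echo "all jobs done, status $?"
```

Form 1 stays useful when the child needs several statements or its own environment tweaks; form 2 covers the common one-command case without the caller knowing about the () & trick.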

> Note, if we used an eval in multijob_child_job, the pushd/popd could
> be folded in. Debatable.

i'd lean towards not. keeps things simple and people don't have to get into
quoting hell.

> > # @FUNCTION: multijob_finish_one
> > # @DESCRIPTION:
> > # Wait for a single process to exit and return its exit code.
> > multijob_finish_one() {
> >
> > 	[[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
> >
> > 	local pid ret
> > 	read -r -u ${mj_control_fd} pid ret
>
> Mildly concerned about the failure case here- specifically if the read
> fails (fd was closed, take your pick).

read || die ? not sure what else could be done really.
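
A small sketch of the "read || die" idea, runnable outside Portage. finish_one_checked is a hypothetical stand-in for multijob_finish_one, and it returns an error instead of calling die since die is not available here. The demo feeds one record through a descriptor and then hits EOF to show the failure path; a closed descriptor fails the same way.

```bash
#!/usr/bin/env bash
# Sketch only: finish_one_checked is a hypothetical stand-in.

finish_one_checked() {
	local fd=$1 pid ret
	# read fails on EOF or on a closed/invalid descriptor; treat both as fatal.
	if ! read -r -u "${fd}" pid ret 2>/dev/null ; then
		echo "control pipe read failed (fd ${fd})" >&2
		return 2
	fi
	echo "${pid} ${ret}"
}

# Feed one "pid status" record through a descriptor, then run into EOF.
exec {demo_fd}< <(echo "1234 0")
finish_one_checked "${demo_fd}"
finish_one_checked "${demo_fd}" || echo "second read failed as expected"
exec {demo_fd}<&-
```

In an ebuild the failure branch would presumably just be a die with a message saying the control pipe went away.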

> > multijob_finish() {
> > 	[[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
>
> Tend to think this should do cleanup, then die if someone invoked the
> api incorrectly; I'd rather see the children reaped before this blows
> up.

sounds good. along those lines, i could add multijob_finish to
EBUILD_DEATH_HOOKS so other `die` points also wait by default ...
-mike
