Gentoo Archives: gentoo-python

From: Mike Gilbert <floppym@g.o>
To: gentoo-python@l.g.o
Cc: python@g.o
Subject: [gentoo-python] Re: distutils-r1: a bit of clean up + parallel builds
Date: Thu, 29 Nov 2012 16:29:45
Message-Id: CAJ0EP42fXdG+pq=eMfTg8QktynWVNhZa60oga_Jp5Rw1dzvTDA@mail.gmail.com
In Reply to: [gentoo-python] distutils-r1: a bit of clean up + parallel builds by "Michał Górny"
1 On Thu, Nov 29, 2012 at 6:31 AM, Michał Górny <mgorny@g.o> wrote:
2 > 1) setup.py installs files to an intermediate root (like python.eclass).
3 >
4 > This way anything we do on the installed files doesn't collide with
5 > other merges potentially running in parallel. This also means that we
6 > don't have to delay installing the wrapper till all setup.py invocations
7 > have completed.
8 >
9 > This is done directly in distutils-r1_python_install. The setup.py is
10 > given a different --root, the renaming is done on the intermediate
11 > image, and the image is then quickly merged into the destination.
12 >
13 > In order to perform the merge efficiently, I used:
14 >
15 > cp --archive --link --no-clobber
16 >
17 > so that the copy preserves everything and uses hard links whenever
18 > possible. --no-clobber is necessary to avoid errors on colliding files
19 > (cp refuses to overwrite when hard-linking).
20 >
21 >
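The merge step described in point 1 can be sketched in plain shell. This is only an illustration, assuming GNU coreutils cp; the directory names and sample files are hypothetical stand-ins and are not taken from the eclass.

```shell
#!/bin/sh
# Hypothetical stand-ins for the intermediate root and the real image.
work=$(mktemp -d)
intermediate="$work/_impl-root"
image="$work/image"
mkdir -p "$intermediate/usr/bin" "$image/usr/bin"

# Files "installed" by setup.py into the intermediate root.
echo 'new script' > "$intermediate/usr/bin/foo"
echo 'colliding file' > "$intermediate/usr/bin/bar"

# A file that is already present in the destination image.
echo 'already merged' > "$image/usr/bin/bar"

# Merge the intermediate root into the image: preserve attributes,
# hard-link instead of copying data, and never overwrite existing files.
# Some coreutils releases return a nonzero status when --no-clobber
# skips a file, so the status is only reported here, not treated as fatal.
cp --archive --link --no-clobber "$intermediate"/. "$image"/ \
    || echo "cp reported skipped or failed files" >&2
```

After the merge, usr/bin/foo in the image should be a hard link to the intermediate copy, while the pre-existing usr/bin/bar is left untouched. Both trees need to live on the same filesystem for hard-linking to work.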
22 > 2) the wrapper is installed in distutils-r1_python_install.
23 >
24 > Previously, distutils-r1_python_install only renamed the installed
25 > executables (because of distutils' no-clobber behavior),
26 > and distutils-r1_python_install_all installed the wrapper.
27 >
28 > Now we can install both in the same function, since distutils installs
29 > into intermediate images. Therefore, the wrapper being installed
30 > in another intermediate image or even the real image won't collide.
31 >
32 >
33 > 3) the sub-phases are run in parallel.
34 >
35 > Since distutils itself is unable to do parallel builds, building Python
36 > packages with C extensions for multiple Python implementations can get
37 > very slow. In order to circumvent that, we're using the multiprocessing
38 > eclass to run sub-phases in parallel.
39 >
40 > This means that with 4 implementations enabled and -j4, all four
41 > implementations will be built at the same time. And if they have C
42 > extensions, 4 source files will be built at the same time. This also
43 > makes it possible to use distcc.
44 >
45 > As stated in the last patch:
46 >
47 > dev-python/lxml-3.0.1 for py2.6+2.7+3.2+3.3:
48 >
49 > - non-parallel: 11 min 23 sec
50 > - parallel: 7 min 49 sec (with a bit of swapping)
51 > - parallel w/ distcc: 3 min 40 sec
52 >
53 > main machine: Core2 2x1.6 GHz and almost 2 GiB of RAM
54 > distcc host: Athlon64 2x2 GHz and 3 GiB of RAM
55 >
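The idea behind point 3, running one sub-phase per Python implementation concurrently, can be illustrated with ordinary shell background jobs. This is a rough sketch and not the multiprocessing eclass API: build_impl and builddir are hypothetical names, and a real sub-phase would invoke setup.py rather than write a status file.

```shell
#!/bin/sh
# Hypothetical per-implementation sub-phase; a real one would build
# the package for the given Python implementation.
build_impl() {
    impl=$1
    mkdir -p "$builddir/$impl"
    echo "built for $impl" > "$builddir/$impl/status"
}

builddir=$(mktemp -d)
pids=""

# Start one background job per implementation and remember its PID.
for impl in python2_6 python2_7 python3_2 python3_3; do
    build_impl "$impl" &
    pids="$pids $!"
done

# Wait for every job and record whether any of them failed.
failed=0
for pid in $pids; do
    wait "$pid" || failed=1
done

echo "sub-phases finished, failed=$failed"
```

Unlike this sketch, which starts all jobs at once, a job server would normally cap the number of concurrent jobs according to the -j setting.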
56
57 I was just thinking to myself last night that parallel builds/installs
58 would be nice. You must be psychic.
59
60 My thought was to add a flag/variable for python_foreach_impl, but
61 your method should work just as well.
62
63 Building the temporary image in sub-directories of ${D} feels a bit
64 strange, but I guess it works. I think distutils.eclass used ${T}?
65
66 Anyway, +1 from me.
