Gentoo Archives: gentoo-alt

From: heroxbd@g.o
To: gentoo-alt@l.g.o
Subject: Re: [gentoo-alt] idea: use portage to build a scientific Python installation
Date: Sun, 22 Dec 2013 02:16:53
Message-Id: 868uvdsk8g.fsf@moguhome00.in.awa.tohoku.ac.jp
In Reply to: [gentoo-alt] idea: use portage to build a scientific Python installation by Martin Luessi
Dear Martin,

Martin Luessi <mluessi@×××××.com> writes:

> First, let me explain the reason for why anyone would want to do so.
> For work, I use Python extensively for scientific computing. However,
> I do not have administrator rights on my workstation and the
> distribution we use (CentOS) does not have the latest Python packages
> that are needed for scientific computing. In addition, even if CentOS
> had the packages, it wouldn't be feasible to constantly ask the
> sysadmins to install/update packages. One solution is to use a
> scientific Python distribution from a commercial vendor, e.g., Canopy
> from Enthought or Anaconda from Continuum Analytics. While these
> distributions work quite well, they are expensive for non-academic
> users and they are not very flexible, i.e., it can be difficult to
> install packages that are not in the package repository provided by
> the vendor, especially if the packages need additional dependencies. I
> also have a gentoo-prefix setup on my workstation.

Same here: I use Gentoo Prefix for Python-centered scientific computing
on my institute's cluster.

> However, the whole prefix directory is very large as it makes minimal
> assumptions about the libraries provided by the host system. The size
> is a problem when using it over NFS e.g. on a cluster. Also, I have
> found that it is difficult to get X11 applications working as the
> gentoo-prefix will install its own X server etc.
>
> This made me wonder whether portage could be used to build a
> scientific Python installation. My idea is instead of making very
> minimal assumptions about the libraries provided by the host system
> (as done in a normal prefix install), one could generate a world file
> listing all the libraries provided by the host system and freeze their
> versions using package-mask. Like that, programs and libraries in the
> prefix would link to libraries on the host system whenever possible,
> which would make the prefix smaller. By having a gentoo based
> scientific Python installation, one could take advantage of all the
> packages provided by gentoo-science and it would make it easy to
> install Python packages that depend on non-Python libraries.
> What do you guys think, is this feasible?

Let me try to argue against it.

1. Disk space is extremely cheap now, $1/GB. A Prefix occupies at
   most 5GB, with an average of 2GB and a minimum of less than 1GB.

1a. NFS is not a good place to put the build directory.
    What I do is set PORTAGE_TMPDIR="/dev/shm" or some other
    tmpfs. Then you get a reasonable build speed.

2. We are actually going the other way round: isolating from the host
   libraries as much as possible. We have even reached an (experimental)
   stage where only the kernel of the host is used[a].

   Why? Because trying to be compatible with a wide range of library
   versions is not possible. Even the kernel version can break
   something[b], and even the present Prefix gets broken by host
   libraries that behave unexpectedly. Red Hat builds its product on
   ancient software for a reason: stability.

My thought is to ignore the space Prefix occupies and focus on
features and stability/maintainability instead.
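
For point 1a, a minimal sketch of the tmpfs setting as a make.conf
fragment. The file location is an assumption (recent Prefix installs
read ${EPREFIX}/etc/portage/make.conf, older ones etc/make.conf), and
it only works if /dev/shm is mounted without noexec and is large enough
for your biggest build, since tmpfs is backed by RAM and swap.

```shell
# ${EPREFIX}/etc/portage/make.conf   (assumed location, adjust to your Prefix)
# Build in a RAM-backed tmpfs instead of on the NFS-mounted prefix.
# Any other tmpfs mount point you can write to works as well.
PORTAGE_TMPDIR="/dev/shm"
```

Portage then does its work under /dev/shm/portage, so only the final
merge touches NFS. For a one-off build the variable can also be set in
the environment of a single emerge call instead of in make.conf.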

Benda

a. http://blogs.gentoo.org/news/2013/11/01/gentoo-monthly-newsletter-31-october-2013/#RAP
b. https://bugs.gentoo.org/show_bug.cgi?id=493074