Dear Martin,

Martin Luessi <mluessi@×××××.com> writes:

> First, let me explain the reason for why anyone would want to do so.
> For work, I use Python extensively for scientific computing. However,
> I do not have administrator rights on my workstation and the
> distribution we use (CentOS) does not have the latest Python packages
> that are needed for scientific computing. In addition, even if CentOS
> had the packages, it wouldn't be feasible to constantly ask the
> sysadmins to install/update packages. One solution is to use a
> scientific Python distribution from a commercial vendor, e.g., Canopy
> from Enthought or Anaconda from Continuum Analytics. While these
> distributions work quite well, they are expensive for non-academic
> users and they are not very flexible, i.e., it can be difficult to
> install packages that are not in the package repository provided by
> the vendor, especially if the packages need additional dependencies. I
> also have a gentoo-prefix setup on my workstation.

I also use Gentoo Prefix for Python-centered scientific computing, on
my institute's cluster.

> However, the whole prefix directory is very large as it makes minimal
> assumptions about the libraries provided by the host system. The size
> is a problem when using it over NFS e.g. on a cluster. Also, I have
> found that it is difficult to get X11 applications working as the
> gentoo-prefix will install its own X server etc.
>
> This made me wonder whether portage could be used to build a
> scientific Python installation. My idea is instead of making very
> minimal assumptions about the libraries provided by the host system
> (as done in a normal prefix install), one could generate a world file
> listing all the libraries provided by the host system and freeze their
> versions using package-mask. Like that, programs and libraries in the
> prefix would link to libraries on the host system whenever possible,
> which would make the prefix smaller. By having a gentoo based
> scientific Python installation, one could take advantage of all the
> packages provided by gentoo-science and it would make it easy to
> install Python packages that depend on non-Python libraries.
> What do you guys think, is this feasible?

Let me try to argue against it.

1. Disk space is extremely cheap now, $1/GB. A Prefix occupies at
most 5GB, 2GB on average, and less than 1GB at minimum.

1a. NFS is not a good place to put the build directory.
What I do is set PORTAGE_TMPDIR="/dev/shm" or any other
tmpfs. That gives you a reasonable build speed.

2. We are actually going the other way round: isolating the Prefix from
the host libraries as much as possible. We have even reached an
(experimental) stage where only the kernel of the host is used[a].

Why? Because staying compatible with a wide range of library versions
is not possible. Even the kernel version can break
something[b], and even the present Prefix gets broken by
unexpectedly behaving host libraries. Red Hat builds its product on
ancient software for a reason: stability.

My suggestion is to ignore the space Prefix occupies and focus on
features and stability/maintainability instead.
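
Regarding 1a, here is a minimal Python sketch of how one might pick a
local directory for PORTAGE_TMPDIR. The helper names, the candidate list
and the 2 GB threshold are my own assumptions for illustration, not
anything Portage provides. It only checks that a directory is writable
and has free space; it does not verify that the directory is really a
tmpfs or that it is off NFS.

```python
import os
import shutil


def pick_portage_tmpdir(candidates=("/dev/shm", "/tmp"), min_free_gb=2.0):
    """Return the first candidate that is a writable directory with at
    least min_free_gb of free space, or None if none qualifies.

    The candidate list and the threshold are illustrative assumptions.
    """
    for path in candidates:
        if not (os.path.isdir(path) and os.access(path, os.W_OK)):
            continue
        free_gb = shutil.disk_usage(path).free / 1024 ** 3
        if free_gb >= min_free_gb:
            return path
    return None


def make_conf_line(path):
    """Format the setting the way it is written in make.conf."""
    return 'PORTAGE_TMPDIR="{}"'.format(path)


if __name__ == "__main__":
    chosen = pick_portage_tmpdir()
    if chosen is None:
        print("No suitable local directory found; keep the current setting.")
    else:
        print(make_conf_line(chosen))
```

The printed line would go into the make.conf of the Prefix. Large
packages may need more scratch space than a small /dev/shm offers, so
the threshold is worth adjusting per machine.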

Benda

a. http://blogs.gentoo.org/news/2013/11/01/gentoo-monthly-newsletter-31-october-2013/#RAP
b. https://bugs.gentoo.org/show_bug.cgi?id=493074