Gentoo Archives: gentoo-cluster

From: Justin Bronder <jsbronder@g.o>
To: gentoo-cluster@l.g.o
Subject: [gentoo-cluster] Re: Installing and using multiple MPI implementations at the same time.
Date: Tue, 11 Mar 2008 01:10:20
Message-Id: 20080311010732.GA26618@mejis.cold-front
In Reply to: Re: [gentoo-cluster] Installing and using multiple MPI implementations at the same time. by Alexander Piavka
On 10/03/08 18:31 +0200, Alexander Piavka wrote:
>
> Hi Justin,
>
> I've started playing with your empi implementation.
>
> Some problems & suggestions:
>
> 1) 'eselect mpi set ...' does not check for the existence of the ~/.env.d
> directory and fails if it does not exist.

Fixed in eselect-mpi-0.0.2.
>
> It creates ~/.env.d/mpi which looks like this:
> ----------------------
> PATH="/usr/lib64/mpi/mpi-openmpi/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.2:/opt/blackdown-jdk-1.4.2.03/bin:/opt/blackdown-jdk-1.4.2.03/jre/bin"
> MANPATH="/usr/lib64/mpi/mpi-openmpi/usr/share/man:/etc/java-config-2/current-system-vm/man:/usr/local/share/man:/usr/share/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.18/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.2/man:/opt/blackdown-jdk-1.4.2.03/man:/etc/java-config/system-vm/man/"
> LD_LIBRARY_PATH="/usr/lib64/mpi/mpi-openmpi/usr/lib64:"
> ESELECT_MPI_IMP="mpi-openmpi"
> export LD_LIBRARY_PATH
> export PATH
> export MANPATH
> export ESELECT_MPI_IMP
> ----------------------
>
> while the following would be better:
> ----------------------
> PATH="/usr/lib64/mpi/mpi-openmpi/usr/bin:${PATH}"
> MANPATH="/usr/lib64/mpi/mpi-openmpi/usr/share/man:${MANPATH}"
> LD_LIBRARY_PATH="/usr/lib64/mpi/mpi-openmpi/usr/lib64:${LD_LIBRARY_PATH}"
> ESELECT_MPI_IMP="mpi-openmpi"
> export LD_LIBRARY_PATH
> export PATH
> export MANPATH
> export ESELECT_MPI_IMP
> ----------------------
>
> maybe even
> ----------------------
> if [ "X${PATH}" != "X" ]; then
>     export PATH="/usr/lib64/mpi/mpi-openmpi/usr/bin:${PATH}"
> else
>     export PATH="/usr/lib64/mpi/mpi-openmpi/usr/bin"
> fi
> if [ "X${MANPATH}" != "X" ]; then
>     export MANPATH="/usr/lib64/mpi/mpi-openmpi/usr/share/man:${MANPATH}"
> else
>     export MANPATH="/usr/lib64/mpi/mpi-openmpi/usr/share/man"
> fi
> if [ "X${LD_LIBRARY_PATH}" != "X" ]; then
>     export LD_LIBRARY_PATH="/usr/lib64/mpi/mpi-openmpi/usr/lib64:${LD_LIBRARY_PATH}"
> else
>     export LD_LIBRARY_PATH="/usr/lib64/mpi/mpi-openmpi/usr/lib64"
> fi
> ESELECT_MPI_IMP="mpi-openmpi"
> export ESELECT_MPI_IMP
> ----------------------

Yeah, you're probably right. However, I need a way to clean out the
environment when the user calls the unset action, or changes from one
implementation to the other. Using what you have above, if the user then
called 'eselect mpi set mpi-lam' and sourced ~/.env.d/mpi, they would end up
with the mpi-lam paths prepended in front of the stale mpi-openmpi ones,
leaving both implementations in their environment. See below for why this
scares me.
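
The stripping idea above can be sketched in plain POSIX sh. This is only an
illustration assuming a /usr/lib64/mpi/<implementation> layout; the
strip_mpi_paths helper is hypothetical and not part of empi or eselect-mpi:

```shell
#!/bin/sh
# Hypothetical sketch: before prepending the newly selected implementation,
# drop every PATH component living under MPI_BASE, so switching from
# mpi-openmpi to mpi-lam cannot leave stale entries behind.
MPI_BASE="/usr/lib64/mpi"   # assumed install root for all implementations
NEW_IMP="mpi-lam"           # implementation the user just eselected

strip_mpi_paths() {
    # Print $1 (a colon-separated path list) with all ${MPI_BASE}/* removed.
    old_ifs=$IFS; IFS=:
    out=
    for dir in $1; do
        case $dir in
            "${MPI_BASE}"/*) ;;             # stale mpi entry: drop it
            *) out=${out:+$out:}$dir ;;     # keep everything else
        esac
    done
    IFS=$old_ifs
    printf '%s' "$out"
}

PATH=$(strip_mpi_paths "$PATH")
PATH="${MPI_BASE}/${NEW_IMP}/usr/bin${PATH:+:$PATH}"
export PATH
```

The same treatment would apply to MANPATH and LD_LIBRARY_PATH; with it, the
prepend-style file suggested above would no longer accumulate entries across
repeated 'eselect mpi set' calls.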

>
> Also, besides /etc/env.d/mpi/mpi-openmpi, the /etc/env.d/XXmpi file
> should also be created with the default empi profile when
> 'eselect mpi set <mpi-implementation>' is run.

I'm willing to be told why I'm wrong, but I left the above out for what I
believe is a good reason. If you set, say, openmpi as the system-level
default implementation, and a user then eselects lam-mpi, the user will
still have mpif90 in their path. This is a big deal because lam-mpi does not
provide f90 bindings, so the user could quite quickly become confused as to
why their code using f90 and C is in shambles when they try to compile.

The above can still happen if openmpi is emerged normally. I have no clue
how to deal with that yet either.

If we keep the ugly ~/.env.d/mpi file, along with the environment-stripping
ability, there is no reason a global mpi profile couldn't be used. What do
you think?

>
> 2) Another problem is a failure to install the binpkg of openmpi on other
> identical systems; the error is:
>
> *
> * ERROR: mpi-openmpi/openmpi-1.2.5-r1 failed.
> * Call stack:
> *   ebuild.sh, line 1717:  Called dyn_setup
> *   ebuild.sh, line 768:   Called qa_call 'pkg_setup'
> *   ebuild.sh, line 44:    Called pkg_setup
> *   openmpi-1.2.5-r1.ebuild, line 23: Called mpi_pkg_setup
> *   mpi.eclass, line 306:  Called die
> * The specific snippet of code:
> *   [[ -f "${FILESDIR}"/${MPI_ESELECT_FILE} ]] \
> *     || die "MPI_ESELECT_FILE is not defined/found. ${MPI_ESELECT_FILE}"
> * The die message:
> *   MPI_ESELECT_FILE is not defined/found. eselect.mpi.openmpi
> *
> * If you need support, post the topmost build error, and the call stack if relevant.
> * A complete build log is located at '/var/tmp/portage/mpi-openmpi/openmpi-1.2.5-r1/temp/build.log'.
> *
>
> I think this is due to MPI_ESELECT_FILE being defined in pkg_setup() of the
> openmpi ebuild and not at the top of the ebuild (will check whether this
> helps later).

Foolish mistake on my part. MPI_ESELECT_FILE can be defined in pkg_setup,
as that always gets called (I believe). However, I can't check for that file
there, since emerging a binpkg doesn't give access to FILESDIR. I've
committed a fix to the overlay.
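
The committed fix isn't shown in this mail, but the shape of the needed
guard is roughly the following. This is a sketch, not the actual eclass
code: die() is stubbed so the snippet runs outside portage, and testing
EMERGE_FROM for "binary" is an assumption about how a binpkg merge is
detected:

```shell
#!/bin/sh
# Sketch of the guard: default MPI_ESELECT_FILE in pkg_setup (which always
# runs), but only test for the file when FILESDIR is actually usable,
# i.e. when not merging a binary package.
die() { echo "die: $*" >&2; exit 1; }   # stand-in for portage's die

mpi_pkg_setup() {
    # Default to eselect.mpi.<package name> when the ebuild set nothing.
    : "${MPI_ESELECT_FILE:=eselect.mpi.${PN}}"
    if [ "${EMERGE_FROM}" != binary ]; then
        [ -f "${FILESDIR}/${MPI_ESELECT_FILE}" ] \
            || die "MPI_ESELECT_FILE is not defined/found. ${MPI_ESELECT_FILE}"
    fi
}
```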

>
> 3) If I have PORTDIR_OVERLAY="/usr/local/overlays/csbgu /tmp/empi",
> 'empi --create --implementation mpi-openmpi =sys-cluster/openmpi-1.2.5-r1'
> would create the mpi-openmpi category tree under
> /usr/local/overlays/csbgu/mpi-openmpi, since that is the first overlay in
> PORTDIR_OVERLAY. It would be nice if it could ALWAYS be created under the
> empi overlay, i.e. /tmp/empi/mpi-openmpi. Of course I can put the empi
> overlay first in PORTDIR_OVERLAY instead, but I want to avoid manual
> tweaking as much as possible. With all mpi implementations residing in the
> same overlay tree as empi, it would be much more convenient, for me, to
> distribute a single overlay among the cluster hosts and avoid the possible
> need for commands like 'empi --create --implementation mpi-openmpi ...'.

Also fixed; I added an --overlaydir option to the command line arguments.
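
Assuming --overlaydir simply overrides the "first entry of PORTDIR_OVERLAY"
default described above, the selection logic amounts to something like this
(illustrative only; choose_overlay is a hypothetical helper, not a real empi
function):

```shell
#!/bin/sh
# Hypothetical sketch of the overlay choice: an explicit --overlaydir wins;
# otherwise fall back to the first entry of PORTDIR_OVERLAY, which is the
# default behaviour described in the mail above.
choose_overlay() {
    # $1 = value given to --overlaydir (may be empty)
    # $2 = PORTDIR_OVERLAY (whitespace-separated list of overlay paths)
    if [ -n "$1" ]; then
        printf '%s' "$1"
    else
        set -- $2               # split PORTDIR_OVERLAY on whitespace
        printf '%s' "$1"        # first overlay is the default target
    fi
}
```

With that behaviour, passing '--overlaydir /tmp/empi' would presumably pin
the mpi-openmpi tree to /tmp/empi regardless of overlay ordering.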

Thanks for trying this out, it makes me feel useful :)

--
Justin Bronder