Gentoo Archives: gentoo-dev

From: Jason Zaman <perfinion@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Bazel Build eclass
Date: Sun, 18 Nov 2018 07:38:01
Message-Id: 20181118073745.GA14571@baraddur.perfinion.com
In Reply to: Re: [gentoo-dev] Bazel Build eclass by "Michał Górny"
1 On Sat, Nov 17, 2018 at 11:54:24PM +0100, Michał Górny wrote:
2 > On Sun, 2018-11-18 at 03:37 +0800, Jason Zaman wrote:
3 > > Hey all,
4 > >
5 > > I've been using Bazel (https://bazel.build/) to build TensorFlow for a
6 > > while now. Here is a bazel.eclass I'd like to commit to make it easier
7 > > for packages that use it to build. It's basically bits that I've
8 > > refactored out of the TensorFlow ebuild that would be useful to other
9 > > packages as well. I have a bump to sci-libs/tensorflow-1.12.0 prepared
10 > > that uses this eclass and have tested a full install.
11 > >
12 > > -- Jason
13 > >
14 > > # Copyright 1999-2018 Jason Zaman
15 > > # Distributed under the terms of the GNU General Public License v2
16 > >
17 > > # @ECLASS: bazel.eclass
18 > > # @MAINTAINER:
19 > > # Jason Zaman <perfinion@g.o>
20 > > # @AUTHOR:
21 > > # Jason Zaman <perfinion@g.o>
22 > > # @BLURB: Utility functions for packages using Bazel Build
23 > > # @DESCRIPTION:
24 > > # A utility eclass providing functions to run the Bazel Build system.
25 > > #
26 > > # This eclass does not export any phase functions.
27 > >
28 > > case "${EAPI:-0}" in
29 > > 0|1|2|3|4|5|6)
30 > > die "Unsupported EAPI=${EAPI:-0} (too old) for ${ECLASS}"
31 > > ;;
32 > > 7)
33 > > ;;
34 > > *)
35 > > die "Unsupported EAPI=${EAPI} (unknown) for ${ECLASS}"
36 > > ;;
37 > > esac
38 > >
39 > > if [[ ! ${_BAZEL_ECLASS} ]]; then
40 > >
41 > > inherit multiprocessing toolchain-funcs
42 > >
43 > > BDEPEND=">=dev-util/bazel-0.19"
44 > >
45 > > # @FUNCTION: bazel_get_flags
46 > > # @DESCRIPTION:
47 > > # Obtain and print the bazel flags for target and host *FLAGS.
48 > > #
49 > > # To add more flags to this, append the flags to the
50 > > # appropriate variable before calling this function
51 > > bazel_get_flags() {
52 > > local i fs=()
53 > > for i in ${CFLAGS}; do
54 > > fs+=( "--conlyopt=${i}" )
55 > > done
56 > > for i in ${BUILD_CFLAGS}; do
57 > > fs+=( "--host_conlyopt=${i}" )
58 > > done
59 > > for i in ${CXXFLAGS}; do
60 > > fs+=( "--cxxopt=${i}" )
61 > > done
62 > > for i in ${BUILD_CXXFLAGS}; do
63 > > fs+=( "--host_cxxopt=${i}" )
64 > > done
65 > > for i in ${CPPFLAGS}; do
66 > > fs+=( "--conlyopt=${i}" "--cxxopt=${i}" )
67 > > done
68 > > for i in ${BUILD_CPPFLAGS}; do
69 > > fs+=( "--host_conlyopt=${i}" "--host_cxxopt=${i}" )
70 > > done
71 > > for i in ${LDFLAGS}; do
72 > > fs+=( "--linkopt=${i}" )
73 > > done
74 > > for i in ${BUILD_LDFLAGS}; do
75 > > fs+=( "--host_linkopt=${i}" )
76 > > done
77 > > echo "${fs[*]}"
78 > > }
79 > >
80 > > # @FUNCTION: bazel_setup_bazelrc
81 > > # @DESCRIPTION:
82 > > # Creates the bazelrc with common options that will be passed
83 > > # to bazel. This will be called by ebazel automatically so
84 > > # does not need to be called from the ebuild.
85 > > bazel_setup_bazelrc() {
86 > > if [[ -f "${T}/bazelrc" ]]; then
87 > > return
88 > > fi
89 > >
90 > > # F: fopen_wr
91 > > # P: /proc/self/setgroups
92 > > # Even with standalone enabled, the Bazel sandbox binary is run for feature test:
93 > > # https://github.com/bazelbuild/bazel/blob/7b091c1397a82258e26ab5336df6c8dae1d97384/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java#L61
94 > > # https://github.com/bazelbuild/bazel/blob/76555482873ffcf1d32fb40106f89231b37f850a/src/main/tools/linux-sandbox-pid1.cc#L113
95 > > addpredict /proc
96 > >
97 > > mkdir -p "${T}/bazel-cache" || die
98 > > mkdir -p "${T}/bazel-distdir" || die
99 > >
100 > > cat > "${T}/bazelrc" <<-EOF || die
101 > > startup --batch
102 >
103 > Maybe indent this stuff to make it stand out from ebuild code.
104 >
105 > >
106 > > # dont strip HOME, portage sets a temp per-package dir
107 > > build --action_env HOME
108 > >
109 > > # make bazel respect MAKEOPTS
110 > > build --jobs=$(makeopts_jobs)
111 > > build --compilation_mode=opt --host_compilation_mode=opt
112 > >
113 > > # FLAGS
114 > > build $(bazel_get_flags)
115 > >
116 > > # Use standalone strategy to deactivate the bazel sandbox, since it
117 > > # conflicts with FEATURES=sandbox.
118 > > build --spawn_strategy=standalone --genrule_strategy=standalone
119 > > test --spawn_strategy=standalone --genrule_strategy=standalone
120 > >
121 > > build --strip=never
122 > > build --verbose_failures --noshow_loading_progress
123 > > test --verbose_test_summary --verbose_failures --noshow_loading_progress
124 > >
125 > > # make bazel only fetch distfiles from the cache
126 > > fetch --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"
127 > > build --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"
128 > >
129 > > build --define=PREFIX=${EPREFIX%/}/usr
130 > > build --define=LIBDIR=\$(PREFIX)/$(get_libdir)
131 > >
132 > > EOF
133 > >
134 > > tc-is-cross-compiler || \
135 > > echo "build --nodistinct_host_configuration" >> "${T}/bazelrc" || die
136 >
137 > Don't do || chains, they are unreadable.
138 ok
139
140 > > }
141 > >
142 > > # @FUNCTION: ebazel
143 > > # @USAGE: [<args>...]
144 > > # @DESCRIPTION:
145 > > # Run bazel with the bazelrc and output_base.
146 > > #
147 > > # If $MULTIBUILD_VARIANT is set, this will make an output_base
148 > > # specific to that variant.
149 > > # bazel_setup_bazelrc will be called and the created bazelrc
150 > > # will be passed to bazel.
151 > > #
152 > > # Will automatically die if bazel does not exit cleanly.
153 > > ebazel() {
154 > > bazel_setup_bazelrc
155 > >
156 > > # Use different build folders for each multibuild variant.
157 > > local base_suffix="${MULTIBUILD_VARIANT+-}${MULTIBUILD_VARIANT}"
158 >
159 > Any reason not to use BUILD_DIR instead of reinventing it?
160
161 Isnt $BUILD_DIR $S by default? output_base is a bazel thing with a weird
162 structure and has nothing to do with the source tree. Doing it this way
163 meant python_foreach_impl was easy. but I might be confused and will
164 look into it again and see if it can be better. this way means with and
165 without multibuild just works tho.
166
167 > > local output_base="${WORKDIR}/bazel-base${base_suffix}"
168 > > mkdir -p "${output_base}" || die
169 > >
170 > > einfo Running: bazel --output_base="${output_base}" "$@"
171 > > bazel --bazelrc="${T}/bazelrc" --output_base="${output_base}" $@ || die "ebazel $@"
172 >
173 > The common practice is to echo >&2 it. Also, you output different
174 > arguments than you execute which is going to confuse the hell out of
175 > users who'll end up having to debug this. You can use a trick like
176 > the following to avoid typing args twice:
177 >
178 > set -- bazel --bazelrc...
179 > echo "${*}" >&2
180 > "${@}" || die ...
181
182 Oh, forgot that trick. yeah I'll do that.
183
184 > > }
185 > >
186 > > # @FUNCTION: bazel_load_distfiles
187 > > # @USAGE: <distfiles>...
188 > > # @DESCRIPTION:
189 > > # Populate the bazel distdir to fetch from since it cannot use
190 > > # the network. Bazel looks in distdir but will only look for the
191 > > # original filename, not the possibly renamed one that portage
192 > > # downloaded. If the line has -> we to rename it back. This also
193 > > # handles use-conditionals that SRC_URI does.
194 >
195 > Why oh why do you have to implement custom parser for the ebuild syntax?
196 > That's just asking for horrible failures.
197
198 Yeah ... so thats the problem with bazel and the main reason for the
199 eclass. Bazel has this idea that downloading and unpacking random
200 tarballs during build is okay. Now bazel has an option when it needs to
201 fetch something it will look in the dir --distdir= points to instead and
202 read the file instead of trying to fetch from the (sandboxed thus
203 non-existent) internet. But bazel only knows the original filenames not
204 the names that portage renamed things to so I need to rename them back
205 to the original names.
206
207 The other option would be to have ebuilds maintain a second list of urls
208 on top of the SRC_URI one which seems way more of a maintenance burden,
209 especially when use-flags are involved. So all I really do here is strip
210 the use? and ()'s and take the list of urls and put them in the
211 bazel-distdir with the original filenames.
212
213 If you can think of a better way to accomplish that I'd love to hear it
214 tho. I'll add more comments about what it does but bazel is weird and
215 just utterly fails if it cant read the tarballs. And unpacking things
216 myself doesnt work because of the weird file structure bazel uses so
217 knowing where to put them is non-trivial.
218
219 > > #
220 > > # Example:
221 > > # @CODE
222 > > # bazel_external_uris="http://a/file-2.0.tgz
223 > > # python? ( http://b/1.0.tgz -> foo-1.0.tgz )"
224 > > # SRC_URI="http://c/${PV}.tgz
225 > > # ${bazel_external_uris}"
226 > > #
227 > > # src_unpack() {
228 > > # unpack ${PV}.tgz
229 > > # bazel_load_distfiles "${bazel_external_uris}"
230 > > # }
231 > > # @CODE
232 > > bazel_load_distfiles() {
233 > > local src dst uri rename
234 > >
235 > > [[ "$@" ]] || die "Missing args"
236 > > mkdir -p "${T}/bazel-distdir" || die
237 > >
238 > > while read uri rename dst; do
239 > > src="${uri##*/}"
240 > > [[ -z $src ]] && continue
241 >
242 > Please use ${foo} syntax in ebuilds, consistently.
243 ok
244
245 > > if [[ "$rename" != "->" ]]; then
246 > > dst="${src}"
247 > > fi
248 > >
249 > > [[ ${A} =~ ${dst} ]] || continue
250 >
251 > Why are you doing regex match here? Last I checked, we didn't use
252 > regular expressions in SRC_URI.
253
254 It was to skip this one if its not in $A (eg disabled with a use-flag)
255 I'll change it to == *${dst}* instead then.
256
257 > >
258 > > if [[ "$dst" == "$src" ]]; then
259 > > einfo "Copying $dst to bazel distdir ..."
260 > > else
261 > > einfo "Copying $dst to bazel distdir $src ..."
262 >
263 > Are you using src and dst to mean the opposite?
264
265 No, this is right. its SRC_URI="src -> dst" so I need to rename dst back
266 so its named src.
267
268 > > fi
269 > > dst="$(readlink -f "${DISTDIR}/${dst}")"
270 >
271 > Looks like you are hardcoding hacks for implementation details which
272 > indicates whatever you're doing is a very bad idea, and is going to fail
273 > whenever the implementation is subtly different than what you've worked
274 > around so far.
275
276 I can skip this line then no problem. It will just end up a link to a
277 link but thats fine.
278
279 > > ln -s "${dst}" "${T}/bazel-distdir/${src}" || die
280 > > done <<< "$(sed -re 's/!?[A-Za-z]+\?\s+\(\s*//g; s/\s+\)//g' <<< "$@")"
281 >
282 > Please don't use horribly unreadable sed expressions. This just means
283 > that whoever will have to touch this eclass in the future will wish you
284 > were never recruited.
285
286 I will split it up and add more comments. It just removes "use? (",
287 "!use? (", and " )". If I don't remove the useflag bits first then figuring out
288 if its a rename or not becomes a lot more complicated. If you have
289 other ideas im all ears.
290
291
292 >
293 > > }
294 > >
295 > > _BAZEL_ECLASS=1
296 > > fi
297 > >
298 > >
299 > >
300 >
301 > --
302 > Best regards,
303 > Michał Górny

Replies

Subject Author
Re: [gentoo-dev] Bazel Build eclass "Michał Górny" <mgorny@g.o>