Gentoo Archives: gentoo-dev

From: Jason Zaman <perfinion@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Bazel Build eclass
Date: Wed, 19 Dec 2018 15:31:55
Message-Id: 20181219153142.GA56005@baraddur.perfinion.com
In Reply to: Re: [gentoo-dev] Bazel Build eclass by "Michał Górny"

On Sun, Nov 18, 2018 at 09:01:00AM +0100, Michał Górny wrote:
> On Sun, 2018-11-18 at 15:37 +0800, Jason Zaman wrote:
> > On Sat, Nov 17, 2018 at 11:54:24PM +0100, Michał Górny wrote:
> > > On Sun, 2018-11-18 at 03:37 +0800, Jason Zaman wrote:
> > > > Hey all,
> > > >
> > > > I've been using Bazel (https://bazel.build/) to build TensorFlow for a
> > > > while now. Here is a bazel.eclass I'd like to commit to make it easier
> > > > for packages that use it to build. It's basically bits that I've
> > > > refactored out of the TensorFlow ebuild that would be useful to other
> > > > packages as well. I have a bump to sci-libs/tensorflow-1.12.0 prepared
> > > > that uses this eclass and have tested a full install.
> > > >
> > > > -- Jason

Here is a v2 of the eclass for review:

Major changes:
- removed MULTIBUILD_VARIANT in favour of BUILD_DIR
- changed bazel_load_distfiles() to tokenize and skip items instead of a
  giant sed command.
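
To illustrate the tokenizing approach outside of Portage, here is a minimal standalone sketch. The function name list_distfile_pairs is hypothetical and not part of the eclass; it only prints "<original name> <distfile name>" pairs and does not consult ${A} or create any symlinks.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the eclass): walk a SRC_URI-style string
# token by token and print "<original name> <distfile name>" pairs.
list_distfile_pairs() {
	local tokens=() word file="" rename=0

	# Split on whitespace without glob expansion; read returns non-zero
	# at end of input, which is expected here.
	read -r -d '' -a tokens <<< "${1}" || true

	for word in "${tokens[@]}"; do
		if [[ ${word} == "->" ]]; then
			# next token is the name Portage saved the file under
			rename=1
		elif [[ ${word} == "(" || ${word} == ")" ]]; then
			# conditional block delimiters carry no filename
			continue
		elif [[ ${word} == ?(\!)[A-Za-z0-9]*([A-Za-z0-9+_@-])\? ]]; then
			# use-conditional such as "python?" or "!python?"
			continue
		elif [[ ${rename} -eq 1 ]]; then
			printf '%s %s\n' "${file}" "${word}"
			rename=0
			file=""
		else
			# a new URL: flush the previous one, which was not renamed
			if [[ -n ${file} ]]; then
				printf '%s %s\n' "${file}" "${file}"
			fi
			file="${word##*/}"
		fi
	done

	# flush the last URL if it was not renamed
	if [[ -n ${file} ]]; then
		printf '%s %s\n' "${file}" "${file}"
	fi
}

list_distfile_pairs "http://a/file-2.0.tgz
	python? ( http://b/1.0.tgz -> foo-1.0.tgz )"
```

The state is just two variables: the last URL basename seen and a flag saying whether the next token is a rename target.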

# Copyright 1999-2018 Jason Zaman
# Distributed under the terms of the GNU General Public License v2

# @ECLASS: bazel.eclass
# @MAINTAINER:
# Jason Zaman <perfinion@g.o>
# @AUTHOR:
# Jason Zaman <perfinion@g.o>
# @BLURB: Utility functions for packages using Bazel Build
# @DESCRIPTION:
# A utility eclass providing functions to run the Bazel Build system.
#
# This eclass does not export any phase functions.

case "${EAPI:-0}" in
	0|1|2|3|4|5|6)
		die "Unsupported EAPI=${EAPI:-0} (too old) for ${ECLASS}"
		;;
	7)
		;;
	*)
		die "Unsupported EAPI=${EAPI} (unknown) for ${ECLASS}"
		;;
esac

if [[ ! ${_BAZEL_ECLASS} ]]; then

inherit multiprocessing toolchain-funcs

BDEPEND=">=dev-util/bazel-0.19"

# @FUNCTION: bazel_get_flags
# @DESCRIPTION:
# Obtain and print the bazel flags for target and host *FLAGS.
#
# To add more flags to this, append the flags to the
# appropriate variable before calling this function
bazel_get_flags() {
	local i fs=()
	for i in ${CFLAGS}; do
		fs+=( "--conlyopt=${i}" )
	done
	for i in ${BUILD_CFLAGS}; do
		fs+=( "--host_conlyopt=${i}" )
	done
	for i in ${CXXFLAGS}; do
		fs+=( "--cxxopt=${i}" )
	done
	for i in ${BUILD_CXXFLAGS}; do
		fs+=( "--host_cxxopt=${i}" )
	done
	for i in ${CPPFLAGS}; do
		fs+=( "--conlyopt=${i}" "--cxxopt=${i}" )
	done
	for i in ${BUILD_CPPFLAGS}; do
		fs+=( "--host_conlyopt=${i}" "--host_cxxopt=${i}" )
	done
	for i in ${LDFLAGS}; do
		fs+=( "--linkopt=${i}" )
	done
	for i in ${BUILD_LDFLAGS}; do
		fs+=( "--host_linkopt=${i}" )
	done
	echo "${fs[*]}"
}

# @FUNCTION: bazel_setup_bazelrc
# @DESCRIPTION:
# Creates the bazelrc with common options that will be passed
# to bazel. This will be called by ebazel automatically so
# does not need to be called from the ebuild.
bazel_setup_bazelrc() {
	if [[ -f "${T}/bazelrc" ]]; then
		return
	fi

	# F: fopen_wr
	# P: /proc/self/setgroups
	# Even with standalone enabled, the Bazel sandbox binary is run for feature test:
	# https://github.com/bazelbuild/bazel/blob/7b091c1397a82258e26ab5336df6c8dae1d97384/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java#L61
	# https://github.com/bazelbuild/bazel/blob/76555482873ffcf1d32fb40106f89231b37f850a/src/main/tools/linux-sandbox-pid1.cc#L113
	addpredict /proc

	mkdir -p "${T}/bazel-cache" || die
	mkdir -p "${T}/bazel-distdir" || die

	cat > "${T}/bazelrc" <<-EOF || die
		startup --batch

		# don't strip HOME, portage sets a temp per-package dir
		build --action_env HOME

		# make bazel respect MAKEOPTS
		build --jobs=$(makeopts_jobs)
		build --compilation_mode=opt --host_compilation_mode=opt

		# FLAGS
		build $(bazel_get_flags)

		# Use standalone strategy to deactivate the bazel sandbox, since it
		# conflicts with FEATURES=sandbox.
		build --spawn_strategy=standalone --genrule_strategy=standalone
		test --spawn_strategy=standalone --genrule_strategy=standalone

		build --strip=never
		build --verbose_failures --noshow_loading_progress
		test --verbose_test_summary --verbose_failures --noshow_loading_progress

		# make bazel only fetch distfiles from the cache
		fetch --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"
		build --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"

		build --define=PREFIX=${EPREFIX%/}/usr
		build --define=LIBDIR=\$(PREFIX)/$(get_libdir)
		EOF

	# Host and target configurations are only the same when not
	# cross-compiling (this keeps the v1 behaviour of the || chain).
	if ! tc-is-cross-compiler; then
		echo "build --nodistinct_host_configuration" >> "${T}/bazelrc" || die
	fi
}

# @FUNCTION: ebazel
# @USAGE: [<args>...]
# @DESCRIPTION:
# Run bazel with the bazelrc and output_base.
#
# output_base will be specific to $BUILD_DIR (if unset, $S).
# bazel_setup_bazelrc will be called and the created bazelrc
# will be passed to bazel.
#
# Will automatically die if bazel does not exit cleanly.
ebazel() {
	bazel_setup_bazelrc

	# Use different build folders for each multibuild variant.
	local output_base="${BUILD_DIR:-${S}}"
	output_base="${output_base%/}-bazel-base"
	mkdir -p "${output_base}" || die

	# Quote "${@}" so arguments containing whitespace are passed through intact.
	set -- bazel --bazelrc="${T}/bazelrc" --output_base="${output_base}" "${@}"
	echo "${*}" >&2
	"${@}" || die "ebazel failed"
}

# @FUNCTION: bazel_load_distfiles
# @USAGE: <distfiles>...
# @DESCRIPTION:
# Populate the bazel distdir to fetch from since it cannot use
# the network. Bazel looks in distdir but will only look for the
# original filename, not the possibly renamed one that portage
# downloaded. If the line has -> we need to rename it back. This
# also handles use-conditionals that SRC_URI does.
#
# Example:
# @CODE
# bazel_external_uris="http://a/file-2.0.tgz
#	python? ( http://b/1.0.tgz -> foo-1.0.tgz )"
# SRC_URI="http://c/${PV}.tgz
#	${bazel_external_uris}"
#
# src_unpack() {
#	unpack ${PV}.tgz
#	bazel_load_distfiles "${bazel_external_uris}"
# }
# @CODE
bazel_load_distfiles() {
	local file=""
	local rename=0
	local word

	[[ "${@}" ]] || die "Missing args"
	mkdir -p "${T}/bazel-distdir" || die

	# ${@} is deliberately unquoted so the URI list is split into tokens.
	for word in ${@}
	do
		if [[ "${word}" == "->" ]]; then
			# next word is a dest filename
			rename=1
		elif [[ "${word}" == ")" ]]; then
			# close conditional block
			continue
		elif [[ "${word}" == "(" ]]; then
			# open conditional block
			continue
		elif [[ "${word}" == ?(\!)[A-Za-z0-9]*([A-Za-z0-9+_@-])\? ]]; then
			# use-conditional block
			# USE-flags can contain [A-Za-z0-9+_@-], and start with alphanum
			# https://dev.gentoo.org/~ulm/pms/head/pms.html#x1-200003.1.4
			# ?(\!) matches zero-or-one !'s
			# *(...) zero-or-more characters
			# ends with a ?
			continue
		elif [[ ${rename} -eq 1 ]]; then
			# Make sure the distfile is used
			if [[ "${A}" == *"${word}"* ]]; then
				# ${word} is the name in DISTDIR, ${file} the original name
				echo "Copying ${word} to bazel distdir as ${file}"
				ln -s "${DISTDIR}/${word}" "${T}/bazel-distdir/${file}" || die
			fi
			rename=0
			file=""
		else
			# another URL, current one may or may not be a rename
			# if there was a previous one, it's not renamed so copy it now
			if [[ -n "${file}" && "${A}" == *"${file}"* ]]; then
				echo "Copying ${file} to bazel distdir"
				ln -s "${DISTDIR}/${file}" "${T}/bazel-distdir/${file}" || die
			fi
			# save the current URL, later we will find out if it's a rename or not.
			file="${word##*/}"
		fi
	done

	# handle last file, again only if the distfile is used
	if [[ -n "${file}" && "${A}" == *"${file}"* ]]; then
		echo "Copying ${file} to bazel distdir"
		ln -s "${DISTDIR}/${file}" "${T}/bazel-distdir/${file}" || die
	fi
}

_BAZEL_ECLASS=1
fi
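
For anyone who wants to poke at the ebazel command handling without Bazel or Portage installed, here is a rough standalone sketch. The bazel function below is a stub, run_bazel is a hypothetical stand-in for ebazel, and the path in S is made up; it leaves out die, mkdir and the bazelrc entirely.

```shell
#!/usr/bin/env bash
# Stub standing in for the real bazel binary so this runs anywhere;
# it only reports how many arguments it received.
bazel() {
	printf 'bazel got %d args\n' "$#"
}

# Hypothetical wrapper with the same shape as ebazel, minus Portage helpers.
run_bazel() {
	# Fall back to ${S} when no eclass has set BUILD_DIR.
	local output_base="${BUILD_DIR:-${S}}"
	output_base="${output_base%/}-bazel-base"

	# Build the command line once, log it to stderr, then run exactly
	# what was logged.
	set -- bazel --output_base="${output_base}" "${@}"
	echo "${*}" >&2
	"${@}"
}

# Made-up values for the demonstration.
unset BUILD_DIR
S=/tmp/work/foo-1.0/

run_bazel build "//some:target with space"
```

Because "${@}" is quoted when the command is assembled, the target containing spaces stays a single argument, and the logged line always matches what is executed.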
> > > >
> > > > # Copyright 1999-2018 Jason Zaman
> > > > # Distributed under the terms of the GNU General Public License v2
> > > >
> > > > # @ECLASS: bazel.eclass
> > > > # @MAINTAINER:
> > > > # Jason Zaman <perfinion@g.o>
> > > > # @AUTHOR:
> > > > # Jason Zaman <perfinion@g.o>
> > > > # @BLURB: Utility functions for packages using Bazel Build
> > > > # @DESCRIPTION:
> > > > # A utility eclass providing functions to run the Bazel Build system.
> > > > #
> > > > # This eclass does not export any phase functions.
> > > >
> > > > case "${EAPI:-0}" in
> > > > 	0|1|2|3|4|5|6)
> > > > 		die "Unsupported EAPI=${EAPI:-0} (too old) for ${ECLASS}"
> > > > 		;;
> > > > 	7)
> > > > 		;;
> > > > 	*)
> > > > 		die "Unsupported EAPI=${EAPI} (unknown) for ${ECLASS}"
> > > > 		;;
> > > > esac
> > > >
> > > > if [[ ! ${_BAZEL_ECLASS} ]]; then
> > > >
> > > > inherit multiprocessing toolchain-funcs
> > > >
> > > > BDEPEND=">=dev-util/bazel-0.19"
> > > >
> > > > # @FUNCTION: bazel_get_flags
> > > > # @DESCRIPTION:
> > > > # Obtain and print the bazel flags for target and host *FLAGS.
> > > > #
> > > > # To add more flags to this, append the flags to the
> > > > # appropriate variable before calling this function
> > > > bazel_get_flags() {
> > > > 	local i fs=()
> > > > 	for i in ${CFLAGS}; do
> > > > 		fs+=( "--conlyopt=${i}" )
> > > > 	done
> > > > 	for i in ${BUILD_CFLAGS}; do
> > > > 		fs+=( "--host_conlyopt=${i}" )
> > > > 	done
> > > > 	for i in ${CXXFLAGS}; do
> > > > 		fs+=( "--cxxopt=${i}" )
> > > > 	done
> > > > 	for i in ${BUILD_CXXFLAGS}; do
> > > > 		fs+=( "--host_cxxopt=${i}" )
> > > > 	done
> > > > 	for i in ${CPPFLAGS}; do
> > > > 		fs+=( "--conlyopt=${i}" "--cxxopt=${i}" )
> > > > 	done
> > > > 	for i in ${BUILD_CPPFLAGS}; do
> > > > 		fs+=( "--host_conlyopt=${i}" "--host_cxxopt=${i}" )
> > > > 	done
> > > > 	for i in ${LDFLAGS}; do
> > > > 		fs+=( "--linkopt=${i}" )
> > > > 	done
> > > > 	for i in ${BUILD_LDFLAGS}; do
> > > > 		fs+=( "--host_linkopt=${i}" )
> > > > 	done
> > > > 	echo "${fs[*]}"
> > > > }
> > > >
> > > > # @FUNCTION: bazel_setup_bazelrc
> > > > # @DESCRIPTION:
> > > > # Creates the bazelrc with common options that will be passed
> > > > # to bazel. This will be called by ebazel automatically so
> > > > # does not need to be called from the ebuild.
> > > > bazel_setup_bazelrc() {
> > > > 	if [[ -f "${T}/bazelrc" ]]; then
> > > > 		return
> > > > 	fi
> > > >
> > > > 	# F: fopen_wr
> > > > 	# P: /proc/self/setgroups
> > > > 	# Even with standalone enabled, the Bazel sandbox binary is run for feature test:
> > > > 	# https://github.com/bazelbuild/bazel/blob/7b091c1397a82258e26ab5336df6c8dae1d97384/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java#L61
> > > > 	# https://github.com/bazelbuild/bazel/blob/76555482873ffcf1d32fb40106f89231b37f850a/src/main/tools/linux-sandbox-pid1.cc#L113
> > > > 	addpredict /proc
> > > >
> > > > 	mkdir -p "${T}/bazel-cache" || die
> > > > 	mkdir -p "${T}/bazel-distdir" || die
> > > >
> > > > 	cat > "${T}/bazelrc" <<-EOF || die
> > > > 	startup --batch
> > >
> > > Maybe indent this stuff to make it stand out from ebuild code.
> > >
> > > >
> > > > 	# dont strip HOME, portage sets a temp per-package dir
> > > > 	build --action_env HOME
> > > >
> > > > 	# make bazel respect MAKEOPTS
> > > > 	build --jobs=$(makeopts_jobs)
> > > > 	build --compilation_mode=opt --host_compilation_mode=opt
> > > >
> > > > 	# FLAGS
> > > > 	build $(bazel_get_flags)
> > > >
> > > > 	# Use standalone strategy to deactivate the bazel sandbox, since it
> > > > 	# conflicts with FEATURES=sandbox.
> > > > 	build --spawn_strategy=standalone --genrule_strategy=standalone
> > > > 	test --spawn_strategy=standalone --genrule_strategy=standalone
> > > >
> > > > 	build --strip=never
> > > > 	build --verbose_failures --noshow_loading_progress
> > > > 	test --verbose_test_summary --verbose_failures --noshow_loading_progress
> > > >
> > > > 	# make bazel only fetch distfiles from the cache
> > > > 	fetch --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"
> > > > 	build --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"
> > > >
> > > > 	build --define=PREFIX=${EPREFIX%/}/usr
> > > > 	build --define=LIBDIR=\$(PREFIX)/$(get_libdir)
> > > >
> > > > 	EOF
> > > >
> > > > 	tc-is-cross-compiler || \
> > > > 		echo "build --nodistinct_host_configuration" >> "${T}/bazelrc" || die
> > >
> > > Don't do || chains, they are unreadable.
> >
> > ok
> >
> > > > }
> > > >
> > > > # @FUNCTION: ebazel
> > > > # @USAGE: [<args>...]
> > > > # @DESCRIPTION:
> > > > # Run bazel with the bazelrc and output_base.
> > > > #
> > > > # If $MULTIBUILD_VARIANT is set, this will make an output_base
> > > > # specific to that variant.
> > > > # bazel_setup_bazelrc will be called and the created bazelrc
> > > > # will be passed to bazel.
> > > > #
> > > > # Will automatically die if bazel does not exit cleanly.
> > > > ebazel() {
> > > > 	bazel_setup_bazelrc
> > > >
> > > > 	# Use different build folders for each multibuild variant.
> > > > 	local base_suffix="${MULTIBUILD_VARIANT+-}${MULTIBUILD_VARIANT}"
> > >
> > > Any reason not to use BUILD_DIR instead of reinventing it?
> >
> > Isn't $BUILD_DIR $S by default? output_base is a bazel thing with a weird
> > structure and has nothing to do with the source tree. Doing it this way
> > meant python_foreach_impl was easy, but I might be confused and will
> > look into it again and see if it can be better. This way it just works
> > both with and without multibuild, though.
>
> There's no default. It's defined by a few eclasses such as multibuild
> and cmake-utils. You just need to provide a default if other eclasses
> don't set it.
>
> >
> > > > 	local output_base="${WORKDIR}/bazel-base${base_suffix}"
> > > > 	mkdir -p "${output_base}" || die
> > > >
> > > > 	einfo Running: bazel --output_base="${output_base}" "$@"
> > > > 	bazel --bazelrc="${T}/bazelrc" --output_base="${output_base}" $@ || die "ebazel $@"
> > >
> > > The common practice is to echo >&2 it. Also, you output different
> > > arguments than you execute which is going to confuse the hell out of
> > > users who'll end up having to debug this. You can use a trick like
> > > the following to avoid typing args twice:
> > >
> > > set -- bazel --bazelrc...
> > > echo "${*}" >&2
> > > "${@}" || die ...
> >
> > Oh, forgot that trick. Yeah, I'll do that.
> >
> > > > }
> > > >
> > > > # @FUNCTION: bazel_load_distfiles
> > > > # @USAGE: <distfiles>...
> > > > # @DESCRIPTION:
> > > > # Populate the bazel distdir to fetch from since it cannot use
> > > > # the network. Bazel looks in distdir but will only look for the
> > > > # original filename, not the possibly renamed one that portage
> > > > # downloaded. If the line has -> we need to rename it back. This
> > > > # also handles use-conditionals that SRC_URI does.
> > >
> > > Why oh why do you have to implement custom parser for the ebuild syntax?
> > > That's just asking for horrible failures.
> >
> > Yeah ... so that's the problem with bazel and the main reason for the
> > eclass. Bazel has this idea that downloading and unpacking random
> > tarballs during build is okay. Bazel does have an option so that, when
> > it needs to fetch something, it looks in the dir --distdir= points to
> > and reads the file from there instead of trying to fetch from the
> > (sandboxed, thus non-existent) internet. But bazel only knows the
> > original filenames, not the names that portage renamed things to, so I
> > need to rename them back to the original names.
> >
> > The other option would be to have ebuilds maintain a second list of urls
> > on top of the SRC_URI one, which seems way more of a maintenance burden,
> > especially when use-flags are involved. So all I really do here is strip
> > the use? and ()'s and take the list of urls and put them in the
> > bazel-distdir with the original filenames.
> >
> > If you can think of a better way to accomplish that I'd love to hear it
> > tho. I'll add more comments about what it does, but bazel is weird and
> > just utterly fails if it can't read the tarballs. And unpacking things
> > myself doesn't work because of the weird file structure bazel uses, so
> > knowing where to put them is non-trivial.
>
> What if the original URL goes down and somebody replaces 'foo -> bar'
> with 'mirror://gentoo/bar'?
>
> >
> > > > #
> > > > # Example:
> > > > # @CODE
> > > > # bazel_external_uris="http://a/file-2.0.tgz
> > > > #	python? ( http://b/1.0.tgz -> foo-1.0.tgz )"
> > > > # SRC_URI="http://c/${PV}.tgz
> > > > #	${bazel_external_uris}"
> > > > #
> > > > # src_unpack() {
> > > > #	unpack ${PV}.tgz
> > > > #	bazel_load_distfiles "${bazel_external_uris}"
> > > > # }
> > > > # @CODE
> > > > bazel_load_distfiles() {
> > > > 	local src dst uri rename
> > > >
> > > > 	[[ "$@" ]] || die "Missing args"
> > > > 	mkdir -p "${T}/bazel-distdir" || die
> > > >
> > > > 	while read uri rename dst; do
> > > > 		src="${uri##*/}"
> > > > 		[[ -z $src ]] && continue
> > >
> > > Please use ${foo} syntax in ebuilds, consistently.
> >
> > ok
> >
> > > > 		if [[ "$rename" != "->" ]]; then
> > > > 			dst="${src}"
> > > > 		fi
> > > >
> > > > 		[[ ${A} =~ ${dst} ]] || continue
> > >
> > > Why are you doing regex match here? Last I checked, we didn't use
> > > regular expressions in SRC_URI.
> >
> > It was to skip this one if it's not in $A (e.g. disabled with a use-flag).
> > I'll change it to == *${dst}* instead then.
> >
> > > >
> > > > 		if [[ "$dst" == "$src" ]]; then
> > > > 			einfo "Copying $dst to bazel distdir ..."
> > > > 		else
> > > > 			einfo "Copying $dst to bazel distdir $src ..."
> > >
> > > Are you using src and dst to mean the opposite?
> >
> > No, this is right. It's SRC_URI="src -> dst", so I need to rename dst
> > back so it's named src.
> >
> > > > 		fi
> > > > 		dst="$(readlink -f "${DISTDIR}/${dst}")"
> > >
> > > Looks like you are hardcoding hacks for implementation details which
> > > indicates whatever you're doing is a very bad idea, and is going to fail
> > > whenever the implementation is subtly different than what you've worked
> > > around so far.
> >
> > I can skip this line then, no problem. It will just end up a link to a
> > link but that's fine.
> >
> > > > 		ln -s "${dst}" "${T}/bazel-distdir/${src}" || die
> > > > 	done <<< "$(sed -re 's/!?[A-Za-z]+\?\s+\(\s*//g; s/\s+\)//g' <<< "$@")"
> > >
> > > Please don't use horribly unreadable sed expressions. This just means
> > > that whoever will have to touch this eclass in the future will wish you
> > > were never recruited.
> >
> > I will split it up and add more comments. It just removes "use? (",
> > "!use? (", and " )". If I don't remove the useflag bits first then
> > figuring out if it's a rename or not becomes a lot more complicated. If
> > you have other ideas I'm all ears.
>
> The syntax is token-oriented, so just tokenize it and use a state
> machine to parse it if you must.
>
> >
> >
> > >
> > > > }
> > > >
> > > > _BAZEL_ECLASS=1
> > > > fi
> > > >
> > > >
> > > >
> > >
> > > --
> > > Best regards,
> > > Michał Górny
> >
> >
> >
>
> --
> Best regards,
> Michał Górny