* Re: [gentoo-dev] [RFC] Future changes to LLVM eclasses (or how do you use LLVM?)
2024-12-03 15:32 [gentoo-dev] [RFC] Future changes to LLVM eclasses (or how do you use LLVM?) Michał Górny
@ 2024-12-03 16:29 ` Gerion Entrup
2024-12-04 1:13 ` Matt Jolly
2024-12-04 13:12 ` James Le Cuirot
2 siblings, 0 replies; 4+ messages in thread
From: Gerion Entrup @ 2024-12-03 16:29 UTC
To: gentoo-dev
Hi,
this is not from a Gentoo packaging perspective but from the perspective
of a developer who needs to compile with and against LLVM on several
distributions. For building a package against LLVM, LLVM offers two
possibilities: llvm-config and its CMake files. I only have experience
with the first one. Furthermore, LLVM does not offer a consistent way to
install its binaries (including llvm-config) in different versions, so
most Linux distributions do that in a different way.
Therefore, we needed to find a way to tell the build system itself how
to manage this. As a concrete example, we are doing this for ARA/PARROT
here [1] (see the native-<distribution>.ini files). Meson is made aware
of the path to llvm-config through a distribution-specific config file.
Meson then discovers the other binaries with the help of llvm-config [2, 3].
Overall, this works well but needs work per distribution.
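
As a minimal sketch of the per-distribution config file idea, the
following writes a Meson native file whose [binaries] section points at
one specific llvm-config. The helper name write_llvm_native_file, the
output file name, and the slot number are assumptions for illustration,
not taken from the ARA/PARROT repositories.

```shell
# Hypothetical helper: write a Meson native file that pins llvm-config.
# $1 = absolute path to the llvm-config to use, $2 = output file.
write_llvm_native_file() {
    printf "[binaries]\nllvm-config = '%s'\n" "$1" > "$2"
}

# Example (path follows Gentoo's slotted layout; the slot is an assumption):
#   write_llvm_native_file /usr/lib/llvm/19/bin/llvm-config native-gentoo.ini
#   meson setup build --native-file native-gentoo.ini
```

Each distribution then only needs its own one-line path, while the
meson.build itself stays unchanged.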
Here is a Meson bug that I created to classify the solutions of
different Linux distributions [4]. I also created a Gentoo bug for it
some time ago (where you recommended an upstream fix) [5]. Here is the
old LLVM bug for the same problem (I do not know if they transferred
it to GitHub) [6].
I have no clear solution to the problem. I wish that LLVM itself would
create versioned symlinks to all of its binary tools that distributions
could install in /usr/bin and that build systems could use to find specific
versions of LLVM libraries.
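
To illustrate what build systems have to do today without such
symlinks, here is a rough sketch of a lookup over a few install
layouts. The function name find_llvm_config is hypothetical, and the
candidate list is only illustrative (a Debian-style versioned name and
directory, plus Gentoo's slotted directory); it is not a complete
survey of distributions.

```shell
# Hypothetical helper: print the first executable llvm-config found for
# a given major version. $1 = major version, $2 = optional root prefix.
find_llvm_config() {
    local ver="$1" root="${2:-}" candidate
    for candidate in \
        "${root}/usr/bin/llvm-config-${ver}" \
        "${root}/usr/lib/llvm-${ver}/bin/llvm-config" \
        "${root}/usr/lib/llvm/${ver}/bin/llvm-config"
    do
        if [ -x "${candidate}" ]; then
            printf '%s\n' "${candidate}"
            return 0
        fi
    done
    return 1
}

# Example: llvm_config=$(find_llvm_config 19) || echo "LLVM 19 not found" >&2
```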
Kind regards
Gerion
[1] https://github.com/luhsra/PARROT
[2] https://github.com/luhsra/ara-toolchains/blob/9a3570017a8a61cf078ed5142d272ed279f8d112/meson.build#L10
[3] https://mesonbuild.com/Dependencies.html#llvm
[4] https://github.com/mesonbuild/meson/issues/5370
[5] https://bugs.gentoo.org/677504
[6] https://bugs.llvm.org/show_bug.cgi?id=41794
On Tuesday, 3 December 2024, 16:32:45 CET, Michał Górny wrote:
> Hello,
>
> TL;DR: the way llvm/llvm-r1 eclasses currently mangle PATH is broken,
> and I'd like to replace that with something better (possibly in llvm-
> r2.eclass, given how fragile this thing is). So I'd like to discuss
> potential "better" solutions -- and particularly ask you what your LLVM-
> using packages need.
>
>
> Background
> ==========
>
> The current logic goes way back to llvm.eclass, and EAPIs that did not
> have native cross-build support. Back then, prepending the slotted LLVM
> bindir to PATH was the obvious way of getting software to find the right
> LLVM version.
>
> When I added EAPI 7 support, I went for prepending the following thing
> to PATH:
>
> ${ESYSROOT}/usr/lib/llvm/.../bin
>
> People doing cross will clearly notice the mistake here -- it's using
> binaries from ESYSROOT rather than BROOT! Except it's not a mistake,
> but an ugly hack. What we're doing here is:
>
> 1. Relying on a fancy CMake behavior of locating CMake files relative to
> PATH, and
>
> 2. Relying on the package either not caring about LLVM executables or
> the system not being able to execute the executables in ESYSROOT
> and gracefully falling back to other locations in PATH.
>
> So what we're really doing is implicitly telling CMake to use:
>
> ${ESYSROOT}/usr/lib/llvm/.../lib*/cmake
>
> Yes, it's awful. And yes, it already did backfire in the past, so I've
> ended up adding quite complex logic to prevent these path
> manipulations from overriding the toolchain set by user. For example,
> if the user has CC=clang, that normally evaluates to clang-19, we now
> adjust CC so that it doesn't suddenly switch to clang-17 because
> the package uses libLLVM-17. Meh.
>
> When working on llvm-r1, I've focused on the more immediate problem of
> horribly complex and broken package dependencies, and forgot about this.
> I've only recalled the problem during the initial rust.eclass reviews,
> since it happened to copy that incorrect logic.
>
>
> Future options
> ==============
>
> Some of the options that already popped up during discussions include:
>
> 1. No longer exporting pkg_setup() at all, and expecting people to
> explicitly pass the LLVM path to the build system, e.g. something like:
>
> -DLLVM_CMAKE_PATH="$(get_llvm_prefix -d)"
>
> 2. Setting specific environment variables (such as LLVM_ROOT, CLANG_ROOT
> and so on for CMake, or perhaps CMAKE_PREFIX_PATH).
>
> 3. Creating a minimal llvm-config wrapper in ${T}, and adding it to
> ${PATH} instead of the whole LLVM tree. Note that we'd need to write
> our own since llvm-config is an executable, so we can't run the one from
> ESYSROOT, and we can't rely on BROOT having a match (or don't want to
> force a second copy of LLVM unnecessarily).
>
> Any other ideas? How does your package select LLVM version, and which
> of these options would work best for you?
>
>
>
* Re: [gentoo-dev] [RFC] Future changes to LLVM eclasses (or how do you use LLVM?)
From: Matt Jolly @ 2024-12-04 1:13 UTC
To: gentoo-dev
Hi Michał,
I use llvm-r1 in a few packages, and for the intended purpose of
consistently selecting and depending on a specific LLVM I've had
no major issues. Overall things work well, and the addition of
LLVM_SLOT USE_EXPAND for -r1 has made influencing the selection
as an end-user (and developer) so much more straightforward.
I don't think that the Rust eclass could work properly without
llvm-r1 given how tightly coupled dev-lang/rust is to its vendored
LLVM version and the issues that we've encountered mixing those.
I'm not opposed to any of the options you've presented; they seem
reasonable and an improvement over the current situation.
At a high level:
- option 1: seems to put a lot of the burden on package maintainers to
ensure that their build system is set up to support this and may
require upstream changes.
- option 2: Seems "fine" for CMake based projects, but I have concerns
about how other build systems will be catered to; is this something
you could elaborate on - how might a non-CMake build system consume
more generic variables? Are these widely used/supported and I'm just
unaware of it?
- option 3: Seems quite straightforward, and I can see this being quite
flexible in terms of being called within an ebuild if necessary
(though consuming LLVM_SLOT might get ebuilds most of the way there?).
Overall perhaps some combination of options 2 and 3 might be the easiest
thing for eclass consumers to use flexibly at the cost of additional
eclass complexity. I'm interested in how others feel about this.
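
For reference, a minimal sketch of what option 2 could look like for
CMake consumers. The function name llvm_export_cmake_env is
hypothetical. LLVM_ROOT relies on CMake's <PackageName>_ROOT lookup for
find_package(LLVM), and CMAKE_PREFIX_PATH is read from the environment
as a colon-separated list on Unix. Other build systems are not covered
by this sketch.

```shell
# Hypothetical helper: point CMake at one slotted LLVM prefix through
# the environment instead of PATH.
# $1 = LLVM slot, $2 = optional sysroot prefix (e.g. ESYSROOT).
llvm_export_cmake_env() {
    local prefix="${2:-}/usr/lib/llvm/$1"
    export LLVM_ROOT="${prefix}"
    # Prepend, keeping any prefixes that were already configured.
    export CMAKE_PREFIX_PATH="${prefix}${CMAKE_PREFIX_PATH:+:${CMAKE_PREFIX_PATH}}"
}

# Example: llvm_export_cmake_env "${LLVM_SLOT}" "${ESYSROOT}"
```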
I wonder if there's some space for catering to those packages which
(ab)use LLVM_COMPAT as a proxy for 'Only these Clang versions are
supported' - usually to get `llvm_gen_dep` for appropriate toolchain
components.
For www-client/chromium, where we force `CC=clang` because it's the only
supported path upstream (and I simply don't have it in me to maintain
GCC patches for three channels a week), I have been stung a few
times by PATH manipulation. For example, on an ~arch system with
multiple LLVM slots installed and LTO enabled:
1. `CC=clang` is set, then `llvm-r1_pkg_setup` is called.
2. llvm-r1 first fixes CC=clang to CC=clang-19 because that's the latest
in PATH.
3. llvm-r1 uses LLVM_SLOT from the profile and does PATH manipulation.
4. Compilation proceeds normally; however, at link time `lld` is called
from the prefixed `/usr/lib/llvm/18/bin`, resulting in an error like:
'... (Producer: 'LLVM19.1.4' Reader: 'LLVM 18.1.8')'
I suspect that this may come up on other systems where `CC=clang` is set
via make.conf and LTO is enabled (which is a good argument for avoiding
PATH manipulation by default).
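
A rough sketch of a sanity check that would catch this kind of
mismatch before link time, by comparing the major versions that the
compiler and linker report. Both function names are hypothetical, and
the parsing simply takes the first X.Y.Z found in the first line of
`--version` output, which is an assumption about the output format.

```shell
# Print the major component of the first X.Y.Z version in a string.
version_major() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1 | cut -d. -f1
}

# Return 0 only if both tools report the same, non-empty major version.
# $1 = compiler (e.g. clang-19), $2 = linker (e.g. ld.lld).
same_llvm_major() {
    local cc_major ld_major
    cc_major=$(version_major "$("$1" --version 2>/dev/null | head -n1)")
    ld_major=$(version_major "$("$2" --version 2>/dev/null | head -n1)")
    [ -n "${cc_major}" ] && [ "${cc_major}" = "${ld_major}" ]
}

# Example:
#   same_llvm_major "${CC}" ld.lld || echo "clang/lld version mismatch" >&2
```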
I've worked around this in Chromium where we now call
`llvm-r1_pkg_setup` _then_ set CC and friends to include `LLVM_SLOT`
to enable consistent selection of tooling via `llvm_slot_x` USE. I see
some value in providing eclass consumers with a mechanism to select
appropriate Clang toolchain components consistently, be it an additional
variable or some manually-called `clang_setup` function that follows
much of the existing LLVM path prefix logic.
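
A minimal sketch of what such a manually called `clang_setup` could
look like. The function body is hypothetical and is not what the
Chromium ebuild or the eclass does today. It assumes the slotted
/usr/lib/llvm/<slot>/bin layout mentioned in this thread and selects
every tool by absolute path so that PATH order no longer matters.

```shell
# Hypothetical helper: select a consistent Clang/LLVM toolchain for one
# slot. $1 = LLVM slot, $2 = optional root prefix (e.g. BROOT).
clang_setup() {
    local bindir="${2:-}/usr/lib/llvm/$1/bin"
    export CC="${bindir}/clang"
    export CXX="${bindir}/clang++"
    export AR="${bindir}/llvm-ar"
    export NM="${bindir}/llvm-nm"
    export RANLIB="${bindir}/llvm-ranlib"
}

# Example: clang_setup "${LLVM_SLOT}" "${BROOT}"
```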
To play devil's advocate, I admit that Chromium (and maybe Firefox) are
probably the only packages to have a _need_ to force a Clang toolchain
(due to overheads and the need to get security updates for web browsers
to users quickly), and both can continue to do this outside the eclass -
it's the "LLVM eclass" not "Clang eclass" after all.
I don't really have strong opinions for packages that I maintain; I
actually need to go prod an upstream because they still only support
LLVM >14, so thanks for the reminder! I'm interested in seeing how
others use LLVM in packages and their opinions.
Hopefully some of this was useful!
Cheers,
Matt
* Re: [gentoo-dev] [RFC] Future changes to LLVM eclasses (or how do you use LLVM?)
From: James Le Cuirot @ 2024-12-04 13:12 UTC
To: gentoo-dev
On Tue, 2024-12-03 at 16:32 +0100, Michał Górny wrote:
> 3. Creating a minimal llvm-config wrapper in ${T}, and adding it to
> ${PATH} instead of the whole LLVM tree. Note that we'd need to write
> our own since llvm-config is an executable, so we can't run the one from
> ESYSROOT, and we can't rely on BROOT having a match (or don't want to
> force a second copy of LLVM unnecessarily).
>
> Any other ideas? How does your package select LLVM version, and which
> of these options would work best for you?
I did come up with something similar to #3 back in 2019, but you were so dead
against it that I threw that work away. It wrapped around BROOT's llvm-config,
which you don't want to do, but that didn't seem to be the part you had a
problem with. Doing it that way would be easier, and maybe not such a big deal
since you need BROOT to have a matching slot to build LLVM in the first
place. Writing something based around pkg-config would be nicer though. I'd be
happy with either. I might even be able to help out. :)
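
For illustration, a minimal sketch of the "wrap BROOT's llvm-config"
variant: a tiny script is generated in a private directory (such as
${T}) and only that directory is added to PATH. The function name
make_llvm_config_wrapper is hypothetical and this is not the 2019 code.
It only forwards arguments and does not rewrite the reported paths for
ESYSROOT, which a real cross-aware wrapper would have to do.

```shell
# Hypothetical helper: create <dir>/llvm-config that forwards to one
# specific llvm-config. $1 = real llvm-config path, $2 = wrapper directory.
# Assumes the real path contains no double quotes or dollar signs.
make_llvm_config_wrapper() {
    mkdir -p "$2" || return 1
    printf '#!/bin/sh\nexec "%s" "$@"\n' "$1" > "$2/llvm-config" || return 1
    chmod +x "$2/llvm-config"
}

# Example (slot and directory are assumptions):
#   make_llvm_config_wrapper "${BROOT}/usr/lib/llvm/19/bin/llvm-config" "${T}/llvm-bin"
#   PATH="${T}/llvm-bin:${PATH}"
```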