public inbox for gentoo-dev@lists.gentoo.org
From: Matt Jolly <kangie@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] [RFC] Future changes to LLVM eclasses (or how do you use LLVM?)
Date: Wed, 4 Dec 2024 11:13:36 +1000
Message-ID: <9baf649f-1368-48a2-bb22-bdc7e2d391b6@gentoo.org>
In-Reply-To: <d5489fa24ef3d1129540879e628120addb3af8ce.camel@gentoo.org>

Hi Michał,

I use llvm-r1 in a few packages, and for the intended purpose of 
consistently selecting and depending on a specific LLVM I've had
no major issues. Overall things work well, and the addition of
LLVM_SLOT USE_EXPAND for -r1 has made influencing the selection
as an end-user (and developer) so much more straightforward.

I don't think that the Rust eclass could work properly without
llvm-r1, given how tightly coupled dev-lang/rust is to its vendored
LLVM version and the issues that we've encountered mixing those.

I'm not opposed to any of the options you've presented; they seem
reasonable and an improvement over the current situation.

At a high level:

- option 1: seems to put a lot of the burden on package maintainers to
   ensure that their build system is set up to support this and may
   require upstream changes.
- option 2: seems "fine" for CMake-based projects, but I have concerns
   about how other build systems will be catered to. Is this something
   you could elaborate on: how might a non-CMake build system consume
   more generic variables? Are these widely used/supported and I'm just
   unaware of it?
- option 3: seems quite straightforward, and I can see this being quite
   flexible in terms of being called within an ebuild if necessary
   (though consuming LLVM_SLOT might get ebuilds most of the way there?).

Overall perhaps some combination of options 2 and 3 might be the easiest
thing for eclass consumers to use flexibly at the cost of additional
eclass complexity. I'm interested in how others feel about this.
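
To make option 3 concrete, a wrapper along the following lines might be
enough. This is only a sketch under assumptions: `write_llvm_config_wrapper`
is a hypothetical helper (nothing like it exists in the eclass today), the
prefix, version and libdir values are example inputs, and only a handful of
real `llvm-config` options are answered.

```bash
# Hypothetical helper: writes a tiny llvm-config stand-in that answers
# queries from fixed values instead of executing a target binary.
# Arguments: destination dir, LLVM prefix, LLVM version, libdir name.
write_llvm_config_wrapper() {
	dest=$1 prefix=$2 version=$3 libdir=$4
	cat > "${dest}/llvm-config" <<EOF
#!/bin/sh
for arg in "\$@"; do
	case \${arg} in
		--version) echo "${version}" ;;
		--prefix) echo "${prefix}" ;;
		--bindir) echo "${prefix}/bin" ;;
		--includedir) echo "${prefix}/include" ;;
		--libdir) echo "${prefix}/${libdir}" ;;
		--cmakedir) echo "${prefix}/${libdir}/cmake/llvm" ;;
		*) echo "llvm-config wrapper: unsupported option \${arg}" >&2; exit 1 ;;
	esac
done
EOF
	chmod +x "${dest}/llvm-config"
}

# Example use outside Portage; in an ebuild the directory would be ${T}
# and the prefix would be derived from ESYSROOT and LLVM_SLOT.
wrapper_dir=$(mktemp -d)
write_llvm_config_wrapper "${wrapper_dir}" /usr/lib/llvm/18 18.1.8 lib64
PATH="${wrapper_dir}:${PATH}"
llvm-config --version --cmakedir
```

Only the wrapper directory is added to PATH, so no compiler or linker from
the slotted tree is pulled in as a side effect.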

I wonder if there's some space for catering to those packages which
(ab)use LLVM_COMPAT as a proxy for 'Only these Clang versions are
supported', usually to get `llvm_gen_dep` for appropriate toolchain
components.

For www-client/chromium, where we force `CC=clang` because it's the only
supported path upstream (and I simply don't have it in me to maintain
GCC patches for three channels a week), I have been stung a few
times by PATH manipulation. For example, on an ~arch system with
multiple LLVM slots installed and LTO enabled:

1. `CC=clang` is set, then `llvm-r1_pkg_setup` is called.
2. llvm-r1 first rewrites CC=clang to CC=clang-19 because that's the latest
    in PATH.
3. llvm-r1 then takes LLVM_SLOT from the profile and does its PATH
    manipulation.
4. Compilation proceeds normally, but at link time `lld` is called
    from the prepended `/usr/lib/llvm/18/bin`, resulting in an error like:
    `... (Producer: 'LLVM19.1.4' Reader: 'LLVM 18.1.8')`

I suspect that this may come up on other systems where `CC=clang` is set
via make.conf and LTO is enabled (which is a good argument for avoiding
PATH manipulation by default).
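
The four steps above can be reproduced in miniature without any real LLVM
installed. The following is a toy sketch: the directory layout and the stub
`clang-N` / `ld.lld` scripts are stand-ins that only print a slot number,
and the variable names are chosen for illustration.

```bash
# Toy reproduction with stub scripts; no real compiler or linker is run.
root=$(mktemp -d)

# Fake layout: one bindir per slot holding an unversioned ld.lld, plus a
# generic bindir holding versioned compiler names.
mkdir -p "${root}/llvm/18/bin" "${root}/llvm/19/bin" "${root}/bin"
for slot in 18 19; do
	printf '#!/bin/sh\necho %s\n' "${slot}" > "${root}/llvm/${slot}/bin/ld.lld"
	printf '#!/bin/sh\necho %s\n' "${slot}" > "${root}/bin/clang-${slot}"
	chmod +x "${root}/llvm/${slot}/bin/ld.lld" "${root}/bin/clang-${slot}"
done

# Step 2: CC has already been pinned to the newest compiler.
CC=clang-19
# Step 3: the slot from the profile is prepended to PATH afterwards.
LLVM_SLOT=18
PATH="${root}/llvm/${LLVM_SLOT}/bin:${root}/bin:${PATH}"

# Step 4: the versioned compiler ignores PATH order, the unversioned
# linker does not, so the two end up on different slots.
compiler_slot=$("${CC}")
linker_slot=$(ld.lld)
if [ "${compiler_slot}" != "${linker_slot}" ]; then
	echo "mismatch: compiler slot ${compiler_slot}, linker slot ${linker_slot}"
fi

rm -rf "${root}"
```

The versioned name `clang-19` resolves the same way regardless of PATH
order, while the unversioned `ld.lld` follows whichever bindir comes first.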

I've worked around this in Chromium where we now call
`llvm-r1_pkg_setup` _then_ set CC and friends to include `LLVM_SLOT`
to enable consistent selection of tooling via `llvm_slot_x` USE. I see
some value in providing eclass consumers with a mechanism to select
appropriate Clang toolchain components consistently, be it an additional
variable or some manually-called `clang_setup` function that follows 
much of the existing LLVM path prefix logic.
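
Roughly, the ordering in that workaround looks like the sketch below. It is
not the actual Chromium ebuild code: `pin_clang_to_slot` is a hypothetical
helper, the `clang-N` / `clang++-N` names are assumed to be available in
PATH, and the slot value is hard-coded so the snippet runs outside Portage.

```bash
# pin_clang_to_slot is a hypothetical helper, not part of llvm-r1.eclass.
pin_clang_to_slot() {
	slot=$1
	# Versioned names keep pointing at the same slot however PATH is
	# ordered, so the compiler matches the tools found via the slot bindir.
	export CC="clang-${slot}"
	export CXX="clang++-${slot}"
}

# In an ebuild the call order would be as follows (shown as a comment,
# since the eclass function only exists inside Portage):
#   pkg_setup() {
#       llvm-r1_pkg_setup                  # PATH is adjusted for the slot
#       pin_clang_to_slot "${LLVM_SLOT}"   # then the compiler is pinned
#   }

LLVM_SLOT=18   # example value; in an ebuild this follows llvm_slot_x USE
pin_clang_to_slot "${LLVM_SLOT}"
echo "CC=${CC} CXX=${CXX}"
```

The point of the ordering is that CC is derived from LLVM_SLOT after the
eclass has run, rather than being rewritten by the eclass beforehand.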

To play devil's advocate, I admit that Chromium (and maybe Firefox) are
probably the only packages to have a _need_ to force a Clang toolchain 
(due to overheads and the need to get security updates for web browsers 
to users quickly), and both can continue to do this outside the eclass -
it's the "LLVM eclass" not "Clang eclass" after all.

I don't really have strong opinions for packages that I maintain; I
actually need to go prod an upstream because they still only support
LLVM >14, so thanks for the reminder! I'm interested in seeing how
others use LLVM in packages and their opinions.

Hopefully some of this was useful!

Cheers,

Matt


On 4/12/24 01:32, Michał Górny wrote:
> Hello,
> 
> TL;DR: the way llvm/llvm-r1 eclasses currently mangle PATH is broken,
> and I'd like to replace that with something better (possibly in
> llvm-r2.eclass, given how fragile this thing is).  So I'd like to discuss
> potential "better" solutions -- and particularly ask you what your
> LLVM-using packages need.
> 
> 
> Background
> ==========
> 
> The current logic goes way back to llvm.eclass, and EAPIs that did not
> have native cross-build support.  Back then, prepending the slotted LLVM
> bindir to PATH was the obvious way of getting software to find the right
> LLVM version.
> 
> When I added EAPI 7 support, I went for prepending the following thing
> to PATH:
> 
>    ${ESYSROOT}/usr/lib/llvm/.../bin
> 
> People doing cross will clearly notice the mistake here -- it's using
> binaries from ESYSROOT rather than BROOT!  Except it's not a mistake,
> but an ugly hack.  What we're doing here is:
> 
> 1. Relying on a fancy CMake behavior of locating CMake files relative to
> PATH, and
> 
> 2. Relying on the package either not caring about LLVM executables or
> the system not being able to execute the executables in ESYSROOT
> and gracefully falling back to other locations in PATH.
> 
> So what we're really doing is implicitly telling CMake to use:
> 
>    ${ESYSROOT}/usr/lib/llvm/.../lib*/cmake
> 
> Yes, it's awful.  And yes, it already did backfire in the past, so I've
> ended up adding quite complex logic to prevent these path
> manipulations from overriding the toolchain set by the user.  For example,
> if the user has CC=clang, which normally evaluates to clang-19, we now
> adjust CC so that it doesn't suddenly switch to clang-17 because
> the package uses libLLVM-17.  Meh.
> 
> When working on llvm-r1, I've focused on the more immediate problem of
> horribly complex and broken package dependencies, and forgot about this.
> I've only recalled the problem during the initial rust.eclass reviews,
> since it happened to copy that incorrect logic.
> 
> 
> Future options
> ==============
> 
> Some of the options that already popped up during discussions include:
> 
> 1. No longer exporting pkg_setup() at all, and expecting people to
> explicitly pass the LLVM path to the build system, e.g. something like:
> 
>    -DLLVM_CMAKE_PATH="$(get_llvm_prefix -d)"
> 
> 2. Setting specific environment variables (such as LLVM_ROOT, CLANG_ROOT
> and so on for CMake, or perhaps CMAKE_PREFIX_PATH).
> 
> 3. Creating a minimal llvm-config wrapper in ${T}, and adding it to
> ${PATH} instead of the whole LLVM tree.  Note that we'd need to write
> our own since llvm-config is an executable, so we can't run the one from
> ESYSROOT, and we can't rely on BROOT having a match (or don't want to
> force a second copy of LLVM unnecessarily).
> 
> Any other ideas?  How does your package select LLVM version, and which
> of these options would work best for you?
> 
> 


