Gentoo Archives: gentoo-project

From: David Seifert <soap@g.o>
To: gentoo-project@l.g.o
Subject: Re: [gentoo-project] call for agenda items -- council meeting 2021-05-09
Date: Sun, 09 May 2021 15:22:38
Message-Id: 0dd707925e0947632b430de6bd30e9ad700fc84c.camel@gentoo.org
In Reply to: Re: [gentoo-project] call for agenda items -- council meeting 2021-05-09 by Sam James
1 On Wed, 2021-04-28 at 01:56 +0100, Sam James wrote:
2 >
3 >
4 > > On 27 Apr 2021, at 19:56, Andreas K. Huettel <dilfridge@g.o>
5 > > wrote:
6 > >
7 > > I'd like to kick off a discussion whether LTO should be considered
8 > > "supported". With that I essentially mean that bugs involving LTO
9 > > should be considered valid, and fixes (be it only stripping -flto
10 > > from flags, or similar solutions) should be committed to the tree.
11 > >
12 >
13 > Forgive me for giving a tiny bit of unstructured opinion about
14 > USE=lto, before I dive into the actual proposal:
15 >
16 > 1) I’m really happy to use LTO whenever it is supported upstream (just
17 > like -O3, etc) but I don't use it out of thin air.
18 >
19 > 2) For that reason, I personally like it when USE=lto exists even when
20 > no specific build system hacks are required (because it tells me
21 > “upstream will help with bugs” and so on) but I completely understand
22 > this is a bit at odds with what we usually do,
23 > and therefore is something I
24 > just need to get used to not having.
25 >
26 > Of course, this problem goes away if we’re going to generally
27 > encourage tinderboxes and general LTO usage, just like we did with
28 > as-needed.
29 >
30 > > I would like to clarify this before possibly suggesting an
31 > > initiative to make the Gentoo repository LTO-safe (similar to what
32 > > we did years ago with --as-needed).
33 >
34 > I’d be interested in if slyfox or Soap had any input on heuristics to
35 > help determine if something is likely to be unsafe. LTO is really good
36 > at “provoking” undefined behaviour. Build completing means very little
37 > in terms of success here.
38 >
39 > I don’t really want to go around running UBSAN on everything to know
40 > it’s safe to use it. Polynomial-C mentioned data corruption at runtime
41 > with some packages in #gentoo-dev the other day too (this kind of
42 > experience is very real and we need to mitigate it).
43 >
44 > Obviously that would be a good candidate for stripping out LTO, but
45 > how are we supposed to notice this stuff if it only happens under
46 > certain circumstances?
47 >
48 > My rough plan would be:
49 > - Coordinate via e.g. wiki pages (and IRC as usual)
50 > - *Strong* focus on packages with test suites so that we can get some
51 > idea of whether it’s working correctly with LTO. Let’s ignore those
52 > without tests in the first round(s).
53 > - Provide some rough documentation for developers on how to build with
54 > UBSAN which we can use at least for critical applications
55 > - For codebases which are known to be “rough” (and we would include
56 > feedback from the LTO overlay [0] here), we’d possibly filter LTO
57 > flags proactively (at least if they’re critical packages).
58 >
59 > >
60 > > Background is, just about every binary distribution out there builds
61 > > with LTO by default now. It's not so great if we then keep telling
62 > > people "LTO is dangerous".
63 >
64 > Right. Fedora are doing this and Clear Linux has been doing this for a
65 > very long time too. What I find interesting is that I’ve never
66 > actually come across any patches in either to fix LTO issues,
67 > which either means I’m (un)lucky or they’re not hitting issues
68 > so often?
69 >
70 > Obviously, we end up hitting more than other people because of often
71 > exotic configurations on the user side, but it is what it is.
72 >
73 > This is one of those situations where reaching out to some folks we
74 > know in other distros for some (unstructured) thoughts might not be a
75 > bad idea - just to find out some e.g. heads up on problematic
76 > codebases.
77 >
78 > Best,
79 > sam
80 >
81 > [0] https://github.com/InBetweenNames/gentooLTO
82 >
83 > >
84 > > Cheers,
85 > > Andreas
86 > >
87 > > (Yes I'm aware of the LTO overlay. It may be a great source.)
88 > >
89 > > --
90 > > Andreas K. Hüttel
91 > > dilfridge@g.o
92 > > Gentoo Linux developer
93 > > (council, toolchain, base-system, perl, libreoffice)
94 >
95
96 My corollary from this discussion is: if we decide to support LTO, why
97 not also -O3?
98
99 Fundamentally, the difference between LTO and as-needed is that LTO
100 can trigger failures at runtime, whereas as-needed breakage always
101 shows up at link time. Fixing LTO issues in general is much more
102 involved: it requires more knowledge of C/C++ and of sanitizer
103 instrumentation. Finally, LTO might expose a number of insidious
104 security vulnerabilities.
105
106 The activation energy for getting this working in most of the tree will
107 be an order of magnitude greater than for as-needed, and people should
108 keep this in mind.
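
For reference, the fallback mentioned earlier in the thread (stripping -flto from the flags) amounts to very little code; the hard part is knowing which packages need it. Below is a minimal Python sketch, assuming that matching on the `-flto*` pattern is enough to catch the relevant flags. The function name and the pattern list are hypothetical and only for illustration; in an actual ebuild this would go through flag-o-matic's filter-flags rather than anything like this.

```python
import fnmatch
import shlex

# Assumption: every LTO-related flag we care about starts with "-flto"
# (e.g. -flto, -flto=auto, -flto-partition=none). This list is illustrative,
# not an authoritative inventory of LTO flags.
LTO_PATTERNS = ("-flto*",)


def strip_lto_flags(flags: str) -> str:
    """Return the flag string with every LTO-related flag removed."""
    kept = [
        flag
        for flag in shlex.split(flags)
        # Drop the flag if it matches any of the LTO patterns.
        if not any(fnmatch.fnmatchcase(flag, pat) for pat in LTO_PATTERNS)
    ]
    return " ".join(kept)


if __name__ == "__main__":
    print(strip_lto_flags("-O2 -pipe -flto=auto -march=native"))
```

Note that this only removes the flags. It says nothing about whether a package actually misbehaves under LTO, which is the part that needs test suites and sanitizers.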
109
110 David