Gentoo Archives: gentoo-dev

From: Sam James <sam@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] You currently cannot smoothly upgrade a 4 months old Gentoo system
Date: Wed, 03 Nov 2021 23:19:45
Message-Id: E9EF988F-B8B0-4CF4-8171-5D3D60BA9043@gentoo.org
In Reply to: [gentoo-dev] You currently cannot smoothly upgrade a 4 months old Gentoo system by Thomas Deutschmann
1 > On 3 Nov 2021, at 15:03, Thomas Deutschmann <whissi@g.o> wrote:
2 >
3 > Hi,
4 >
5 > it is currently not possible to smoothly run a world upgrade on a 4 months old system which doesn't even have a complicated package list:
6 > [snip]
7 >
8 > This is not about finding solution to upgrade the system (in this case it was enough to force PYTHON_TARGETS=python3_8 for portage). This is about raising awareness that Gentoo is a rolling distribution and that we guarantee users to be able to upgrade their system when they do world upgrades just once a year (remember: in my case the last world upgrade is just 4 months old!). If they cannot upgrade their system without manual intervention, we failed to do our job.
9 >
10 > Situations like this will disqualify Gentoo for any professional environment because they will break automatic upgrades, and you cannot roll individual fixes for each possible situation via CFM tools like Salt, Ansible, Puppet or Chef.
11 >
12 > It would be very appreciated if everyone will pay more attention to this in future. We can do better. In most cases we can avoid problems like this by keeping older ebuilds around much longer for certain key packages to help with upgrades.
13
14
15 I agree wholeheartedly with this and thank you for raising it.
16
17 ## Remark on some previous discussion
18
19 First, let me just mention that I think it's been on some of our minds but we need to go a bit further with formalising matters. It was brought up at the end of the September 2021 council meeting as a footnote:
20 ```
21 [21:16:56] <@sam_> I'd like to consider "upgrade lifcycles" at some point but I don't have notes ready for now. Mainly just about formalising efforts to support upgrades for X period and to try document a procedure for e.g. new EAPI versions and bootstrap packages not having new EAPIs for a while, and such.
22 [21:17:09] <@sam_> So, no, not right now, but I'd welcome any thoughts post-meeting while I consider it more
23 [21:17:33] <@sam_> The gist is to have a checklist so that we don't "get excited" like with EAPI 8 and end up making upgrades hard for people
24 [21:17:43] <@sam_> I think the GLEP we recently approved helps with that
25 ```
26
27 I've also started some notes on possible improvements: https://wiki.gentoo.org/wiki/User:Sam/TODO#Improving_upgrades. (I wanted to mention all of this here because
28 it's easy to lose track of e.g. council meeting references on a topic, and now they can be found in this thread.)
29
30 ## Summary of the two common cases
31
32 Now, in terms of the common issues regarding upgrades, I think we have two (to be clear, not trying to "fix your problem" -- just bring to bear some of the
33 support experience I've had from #gentoo and so on):
34
35 1) World upgrades which can't complete due to new EAPIs (one's Portage lacks support for e.g. EAPI 8 and hence cannot read ebuilds)
36
37 I'm open to more broad measures about usage of new EAPIs in ~arch / stable (say, e.g. the first Portage supporting EAPI N should sit in
38 ~arch for 4/6/??? months before any ebuilds should use it?), but I think this is a drastic measure we might be able to avoid. Let's keep it
39 in mind in case we do need it though.
40
41 My general thinking on this is that it doesn't matter _too much_(?) as long as one can upgrade Portage without hassle. A lot of our
42 users seem to know to try upgrading Portage if they can't upgrade their system due to new EAPIs, but they then fall down due to
43 cryptic errors (see my next point). We could also improve the "unknown EAPI" error if necessary to make this clearer.
44
45 TL;DR: We might be able to leverage a more drastic option, but my hope is we can avoid any direct action in handling 1) if we deal
46 with the next point I'm about to make (2)).
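
To make the "clearer unknown EAPI error" idea concrete, here is a minimal sketch in plain Python. It does not use Portage's real code or API: `SUPPORTED_EAPIS`, `parse_eapi` and `explain_unsupported` are hypothetical names, the regex is my recollection of the EAPI-line pattern from PMS, and treating a missing EAPI line as EAPI 0 is likewise taken from PMS as I remember it.

```python
import re

# EAPI assignment line as described in PMS (written from memory, so an assumption).
EAPI_LINE = re.compile(
    r"""^[ \t]*EAPI=(['"]?)([A-Za-z0-9+_.-]*)\1[ \t]*([ \t]#.*)?$"""
)

# Hypothetical: the EAPIs an older, pre-EAPI-8 Portage would understand.
SUPPORTED_EAPIS = {"0", "1", "2", "3", "4", "5", "6", "7"}


def parse_eapi(ebuild_text):
    """Return the EAPI declared on the first non-blank, non-comment line."""
    for line in ebuild_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = EAPI_LINE.match(line)
        # No EAPI assignment on that line is treated as EAPI 0 here.
        return match.group(2) if match else "0"
    return "0"


def explain_unsupported(cpv, ebuild_text, supported=SUPPORTED_EAPIS):
    """Return a user-facing hint, or None if the ebuild is readable."""
    eapi = parse_eapi(ebuild_text)
    if eapi in supported:
        return None
    return (
        f"{cpv} uses EAPI {eapi}, which this version of Portage cannot read. "
        "Upgrade sys-apps/portage first, then retry the world upgrade."
    )
```

The point is only that the message names the fix (upgrade Portage first) rather than just reporting an unknown EAPI.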
47
48 2) Portage often can't upgrade itself when there's "pending global PYTHON_TARGETS changes" (e.g. when we change the default value of
49 PYTHON_TARGETS in the profiles (like from Python 3.8 to Python 3.9))
50
51 This one is far trickier. I've started documenting common hacks/methods at https://wiki.gentoo.org/wiki/User:Sam/Portage_help/Upgrading_Portage#Solution
52 which has been rather useful in #gentoo and on the forums (it's been nice to see links to that and other similar pages pop up on /r/gentoo).
53
54 Portage is written in Python and has dependencies in Python. A lot of them are optional (which is why in the wiki page
55 I linked to, I suggest emerge --syncing and then turning off USE=rsync-verify temporarily to reduce dependencies), but
56 I don't think this is particularly comforting to a user who just wants to upgrade Portage. They don't necessarily realise
57 they need to toggle one or *several* flags on Portage to make it work.
58
59 dilfridge has been advocating for some time that we look at some form of a "static Portage" copy (possibly
60 vendoring/bundling all Python dependencies) to completely decouple the Portage ebuilds from the Python
61 eclasses other than needing a (modern) Python 3 interpreter.
62
63 [I've filed a bug for this here: https://bugs.gentoo.org/821511].
64
65 I really feel like this is one of the big things we need to tackle. Upgrading Portage unlocks newer
66 EAPIs and allows us to even discuss world upgrades.
67
68 (Using an older Portage to try to upgrade world with any non-trivial @world set (chosen, user-specified packages)
69 is likely to be a fool's errand -- folks have already said that if _anything_ is using a new EAPI, it's going to affect
70 some users and result in confusing errors.)
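
As a toy illustration of why a profile-wide PYTHON_TARGETS flip gets in the way of upgrading Portage itself, the sketch below models installed packages as sets of enabled targets. This is not how Portage's resolver works; the package names `dev-python/example-dep-a` and `dev-python/example-dep-b` and the function `lacking_target` are hypothetical and only serve to show the shape of the problem.

```python
def lacking_target(new_target, installed_targets):
    """List installed packages that were not built for new_target."""
    return sorted(
        name
        for name, targets in installed_targets.items()
        if new_target not in targets
    )


# Hypothetical installed state on a system last upgraded before the flip.
installed = {
    "sys-apps/portage": {"python3_8"},
    "dev-python/example-dep-a": {"python3_8"},
    "dev-python/example-dep-b": {"python3_8", "python3_9"},
}

# After the profile default moves to python3_9, everything listed here has
# to be rebuilt together before the new Portage can be merged.
needs_rebuild = lacking_target("python3_9", installed)
```

In this model, every optional dependency that is switched off (such as the ones pulled in by USE=rsync-verify) is one fewer entry that can end up in `needs_rebuild`, which is why toggling flags helps in practice.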
71
72 ## Solutions
73
74 * News item when a new EAPI is released explaining how to upgrade Portage in case of emergency / inability
75 to upgrade Portage.
76
77 We can describe the steps at https://wiki.gentoo.org/wiki/Project:Portage/Fixing_broken_portage.
78
79 This would also flag to users that they should upgrade Portage sooner-rather-than-later even if they aren't
80 currently willing/able to fully upgrade the rest of their system.
81
82 * We may want to include a 'rescue-portage' script on the system which downloads the latest Portage (would need
83 to use a symlink or something to reliably get the latest version).
84
85 * Investigate reducing Portage's dependencies.
86
87 * Mitigate PYTHON_TARGETS profile change impact:
88 ** I don't love this idea but one possible measure is that we always have two PYTHON_TARGETS set
89 at all times (this would double build times for a fair amount of packages).
90 ** Or we do this just for Portage and its dependencies.
91 ** Or we have a new portage-minimal ebuild (to simplify matters) which always has some/all targets enabled,
92 which will have few/no Python dependencies.
93
94 [Note that in the past, we weren't consistent about putting out news items for this change. We're doing
95 that now at least.
96
97 The matter has got a bit worse because of Python upstream's release cycle changing.]
98
99 * Implement at least a 4-6 month(?) delay on using new EAPIs after a new version of Portage
100 supports it (the timer resetting once it hits stable too).
101
102 I wasn't sure about this at first, but actually, the PYTHON_TARGETS stuff _should_ be
103 fine for the most part as long as we make sure the tree is mostly/entirely ready before
104 flipping the switch.
105
106 [This could actually help with a fair amount of the problems (other than "general upgrade
107 issues" like conflicts) except when a new EAPI comes along with a targets change,
108 and if we're looking to support upgrades over a year or two years, that's... probably
109 going to coincide.]
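
The delay rule from the last bullet can be sketched as simple date arithmetic. This is only one reading of "the timer resetting once it hits stable": the clock starts when the first Portage supporting the EAPI lands in ~arch and restarts from the stabilisation date if there is one. The function names, the default of 6 months, and any dates you feed in are hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Add calendar months, clamping the day to the end of the target month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def earliest_eapi_use(testing_date, stable_date=None, delay_months=6):
    """Earliest date ebuilds could start using the new EAPI.

    testing_date: when the first Portage supporting the EAPI hit ~arch.
    stable_date:  when that Portage was stabilised, if it has been.
    """
    timer_start = stable_date if stable_date is not None else testing_date
    return add_months(timer_start, delay_months)
```

For example, `earliest_eapi_use(date(2021, 6, 1))` gives 2021-12-01, while passing a later `stable_date` pushes the result out from that date instead.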
110
111 ## TL;DR
112
113 I don't think we can avoid thinking about Portage's entanglement / relationship
114 with PYTHON_TARGETS. Banning use of new EAPIs immediately will not magically
115 make it easy to upgrade Portage itself.
116
117 But the combination of a new EAPI + PYTHON_TARGETS changes in profiles
118 is pretty lethal.
119
120 I've got a few ideas above and I hope we can discuss some of them, or even better,
121 someone has other proposals.
122
123 best,
124 sam

Attachments

File name MIME type
signature.asc application/pgp-signature