Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] rfc: usrmerge script
Date: Wed, 24 Mar 2021 07:48:50
Message-Id: ff811abd4e7bdfae50a9cd3ec770447906dd0a7b.camel@gentoo.org
In Reply to: Re: [gentoo-dev] rfc: usrmerge script by William Hubbs
1 On Wed, 2021-03-24 at 00:20 -0500, William Hubbs wrote:
2 > On Tue, Mar 23, 2021 at 10:23:11AM +0100, Michał Górny wrote:
3 > > On Sun, 2021-03-21 at 12:39 -0500, William Hubbs wrote:
4 > > > All,
5 > > >
6 > > > the following is a script which will migrate a Gentoo system to the usr
7 > > > merge layout. This is similar to the unsymlink-lib tool used to migrate
8 > > > a system from the 17.0 to the 17.1 profiles.
9 > > >
10 > > > I'm attaching it here to get some comments before I package it, so
11 > > > please let me know if I have missed something.
12 > >
13 > > To be honest, I don't think critical system modifications should be done
14 > > in shell script, and especially not via a fringe non-standards complaint
15 > > shell implementation in busybox. Even if you can assume you make no
16 > > mistakes, shell is unreliable by design. For example, your script may
17 > > start behaving in unexpected ways if you run out of space.
18 >
19 > I went with busybox shell in this case because busybox is a static
20 > binary and I figured with everything being moved around on the system it
21 > would be a good choice.
22 >
23 > I could pretty easily go with straight sh/bash, I just was focusing on
24 > bb since the commands are built in.
25 >
26 > You are right about shell behaving in weird ways if you run out of
27 > space, but that's true for anything that happens to be running when a
28 > system runs out of space.
29
30 This is not true. Most of the programming languages can behave sanely
31 in out-of-space conditions. Sure, I/O fails one way or another but if
32 it's just a matter of proper error handling. Shells on the other hand,
33 tend to use temporary files under the hood and can misbehave
34 in unexpected ways.
35
36 > That is why I put the rollback directory in /var by
37 > default figuring there is more space there than in /, especially on
38 > systems with separate /var.
39
40 What really can help is reflinking on filesystems supporting that.
41
42 > The run_command wrapper in the script causes an immediate exit when the
43 > command it runs fails so things don't go any
44 > further than the failing command, so I'm not sure what else I can do
45 > with this situation. One option is to always echo the command we are
46 > about to run then just not run it if -d is specified. That means you
47 > would always see the commands the script is running.
48
49 Did you see the verbose explanatory messages unsymlink-lib produces
50 on every error condition? Compared to this, your script is printing
51 Windows-level 'error figure-it-out-yourself'.
52
53 > > You don't seem to be handling file collisions at all. Even today we
54 > > have files like /bin/bzip2 and /usr/bin/bzip2, not to mention shared
55 > > libraries. Silently ignoring the problem or requiring the users to
56 > > manually ensure their system is clean is not going to solve it.
57
58 > I have added -i to the cp commands in my latest version of this so I'll
59 > see when we have duplicate names, and yes that is an issue, even without
60 > the /usr merge.
61
62 I don't think that's the best UI for this possible. I doubt that
63 a random user running the script will figure out why it is asking about
64 file replacements.
65
66 > I'm curious why we are duplicating all of these names in the first
67 > place honestly.
68 >
69 > One example of this is in coreutils, and we even say in the ebuild
70 > comment that we need to figure out why we are doing it.
71 >
72 > There are a couple of ways forward for this.
73 >
74 > 1) attempt to open bugs on the packages and remove the duplicate names
75 > by dropping symlinks and putting the files in their cannonical
76 > locations. this could lead to breakages if one of the alternate names is
77 > expected by something until the /usr merge is done.
78 >
79 > 2) try to guess at resolving the duplicate names as part of the
80 >    usr merge process.
81 >
82 > a. symmlinks in /bin or /sbin that point to the same name in /usr/bin or
83 > /usr/sbin should be removed.
84 > b. symlinks in /usr/bin or /usr/sbin that point to the same name in /
85 > should be removed.
86 > c. Other symlinks that I can think of should be preserved.
87 >
88 > Which path for handling this is best? Do you have any thoughts?
89
90 This is a thing that should have been researched before writing
91 a script.
92
93 One case I'm aware of is using symlinks to replace tools with
94 alternatives, e.g. /usr/bin/bzip2 -> lbzip2 (/bin/bzip2 remains
95 as the original).
96
97 I don't know what's the best way forward.
98
99 >
100 > > You don't seem to provide any helpful messages. When things fail, user
101 > > will be left in the blue with an error message from some system tool (or
102 > > rather, cheap-ass busybox rewrite, I guess).
103 >
104 > If the rollback directory is populated, you can use -r to recover,
105 > and the script will not let you perform the /usr merge if the rollback
106 > directory is not populated.
107 >
108 > > Also, have you verified that busybox's cp(1) actually preserves all file
109 > > properties (including xattrs, ACLs, caps...)?
110 >
111 > The documentation for busybox states that -p preserves file attributes
112 > if possible, and by doing ls -l in the chroot I can see that
113 > ownership/permissions are preserved. So, logically I would guess it
114 > preserves the others.
115
116 I don't see how you would come to this conclusion. Just because some
117 tool is doing the absolute minimum it doesn't mean it does anything
118 else, and especially operations that require additional syscalls.
119
120 > If switching back to non-busybox is fine for this operation, I have no
121 > problem doing that either.
122 >
123 > > Please don't forget to include tests with it. Docker's good for testing
124 > > stuff like this.
125 >
126 > Docker can't be used during src_test that I'm aware of, so I'm not sure
127 > how Docker could be used to test this.
128
129 You can test it outside of ebuild. Some automated testing is better
130 than no testing at all.
131
132 --
133 Best regards,
134 Michał Górny

Replies

Subject Author
Re: [gentoo-dev] rfc: usrmerge script William Hubbs <williamh@g.o>