1 |
On Wed, 2021-03-24 at 00:20 -0500, William Hubbs wrote: |
2 |
> On Tue, Mar 23, 2021 at 10:23:11AM +0100, Michał Górny wrote: |
3 |
> > On Sun, 2021-03-21 at 12:39 -0500, William Hubbs wrote: |
4 |
> > > All, |
5 |
> > > |
6 |
> > > the following is a script which will migrate a Gentoo system to the usr |
7 |
> > > merge layout. This is similar to the unsymlink-lib tool used to migrate |
8 |
> > > a system from the 17.0 to the 17.1 profiles. |
9 |
> > > |
10 |
> > > I'm attaching it here to get some comments before I package it, so |
11 |
> > > please let me know if I have missed something. |
12 |
> > |
13 |
> > To be honest, I don't think critical system modifications should be done |
14 |
> > in shell script, and especially not via a fringe non-standards complaint |
15 |
> > shell implementation in busybox. Even if you can assume you make no |
16 |
> > mistakes, shell is unreliable by design. For example, your script may |
17 |
> > start behaving in unexpected ways if you run out of space. |
18 |
> |
19 |
> I went with busybox shell in this case because busybox is a static |
20 |
> binary and I figured with everything being moved around on the system it |
21 |
> would be a good choice. |
22 |
> |
23 |
> I could pretty easily go with straight sh/bash, I just was focusing on |
24 |
> bb since the commands are built in. |
25 |
> |
26 |
> You are right about shell behaving in weird ways if you run out of |
27 |
> space, but that's true for anything that happens to be running when a |
28 |
> system runs out of space. |
29 |
|
30 |
This is not true. Most of the programming languages can behave sanely |
31 |
in out-of-space conditions. Sure, I/O fails one way or another but if |
32 |
it's just a matter of proper error handling. Shells on the other hand, |
33 |
tend to use temporary files under the hood and can misbehave |
34 |
in unexpected ways. |
35 |
|
36 |
> That is why I put the rollback directory in /var by |
37 |
> default figuring there is more space there than in /, especially on |
38 |
> systems with separate /var. |
39 |
|
40 |
What really can help is reflinking on filesystems supporting that. |
41 |
|
42 |
> The run_command wrapper in the script causes an immediate exit when the |
43 |
> command it runs fails so things don't go any |
44 |
> further than the failing command, so I'm not sure what else I can do |
45 |
> with this situation. One option is to always echo the command we are |
46 |
> about to run then just not run it if -d is specified. That means you |
47 |
> would always see the commands the script is running. |
48 |
|
49 |
Did you see the verbose explanatory messages unsymlink-lib produces |
50 |
on every error condition? Compared to this, your script is printing |
51 |
Windows-level 'error figure-it-out-yourself'. |
52 |
|
53 |
> > You don't seem to be handling file collisions at all. Even today we |
54 |
> > have files like /bin/bzip2 and /usr/bin/bzip2, not to mention shared |
55 |
> > libraries. Silently ignoring the problem or requiring the users to |
56 |
> > manually ensure their system is clean is not going to solve it. |
57 |
|
58 |
> I have added -i to the cp commands in my latest version of this so I'll |
59 |
> see when we have duplicate names, and yes that is an issue, even without |
60 |
> the /usr merge. |
61 |
|
62 |
I don't think that's the best UI for this possible. I doubt that |
63 |
a random user running the script will figure out why it is asking about |
64 |
file replacements. |
65 |
|
66 |
> I'm curious why we are duplicating all of these names in the first |
67 |
> place honestly. |
68 |
> |
69 |
> One example of this is in coreutils, and we even say in the ebuild |
70 |
> comment that we need to figure out why we are doing it. |
71 |
> |
72 |
> There are a couple of ways forward for this. |
73 |
> |
74 |
> 1) attempt to open bugs on the packages and remove the duplicate names |
75 |
> by dropping symlinks and putting the files in their cannonical |
76 |
> locations. this could lead to breakages if one of the alternate names is |
77 |
> expected by something until the /usr merge is done. |
78 |
> |
79 |
> 2) try to guess at resolving the duplicate names as part of the |
80 |
> usr merge process. |
81 |
> |
82 |
> a. symmlinks in /bin or /sbin that point to the same name in /usr/bin or |
83 |
> /usr/sbin should be removed. |
84 |
> b. symlinks in /usr/bin or /usr/sbin that point to the same name in / |
85 |
> should be removed. |
86 |
> c. Other symlinks that I can think of should be preserved. |
87 |
> |
88 |
> Which path for handling this is best? Do you have any thoughts? |
89 |
|
90 |
This is a thing that should have been researched before writing |
91 |
a script. |
92 |
|
93 |
One case I'm aware of is using symlinks to replace tools with |
94 |
alternatives, e.g. /usr/bin/bzip2 -> lbzip2 (/bin/bzip2 remains |
95 |
as the original). |
96 |
|
97 |
I don't know what's the best way forward. |
98 |
|
99 |
> |
100 |
> > You don't seem to provide any helpful messages. When things fail, user |
101 |
> > will be left in the blue with an error message from some system tool (or |
102 |
> > rather, cheap-ass busybox rewrite, I guess). |
103 |
> |
104 |
> If the rollback directory is populated, you can use -r to recover, |
105 |
> and the script will not let you perform the /usr merge if the rollback |
106 |
> directory is not populated. |
107 |
> |
108 |
> > Also, have you verified that busybox's cp(1) actually preserves all file |
109 |
> > properties (including xattrs, ACLs, caps...)? |
110 |
> |
111 |
> The documentation for busybox states that -p preserves file attributes |
112 |
> if possible, and by doing ls -l in the chroot I can see that |
113 |
> ownership/permissions are preserved. So, logically I would guess it |
114 |
> preserves the others. |
115 |
|
116 |
I don't see how you would come to this conclusion. Just because some |
117 |
tool is doing the absolute minimum it doesn't mean it does anything |
118 |
else, and especially operations that require additional syscalls. |
119 |
|
120 |
> If switching back to non-busybox is fine for this operation, I have no |
121 |
> problem doing that either. |
122 |
> |
123 |
> > Please don't forget to include tests with it. Docker's good for testing |
124 |
> > stuff like this. |
125 |
> |
126 |
> Docker can't be used during src_test that I'm aware of, so I'm not sure |
127 |
> how Docker could be used to test this. |
128 |
|
129 |
You can test it outside of ebuild. Some automated testing is better |
130 |
than no testing at all. |
131 |
|
132 |
-- |
133 |
Best regards, |
134 |
Michał Górny |