August 17, 2018 1:09 AM, "Andrew Udvare" <audvare@×××××.com> wrote:

> The whitelist is the biggest work in progress right now. Most of what it lists from /etc for me is
> /etc/config-archive which AFAIK is not managed by Portage at all although Portage will place old
> files there? I don't use the feature because my /etc is controlled by Git. The stuff listed in
> /var/ is pretty accurate as there's a lot of old website cruft and this computer does not serve
> anything like that anymore.

Well, for example, I use eselect-repository, which puts repos in /var/dbr/repos. I put the Gentoo
tree in there as well, and the whole tree is suggested for deletion.
A solution would be to read the /etc/portage/repos.conf file(s) for repo locations during the
runtime detection, or to use the portageq interface.
Or tell people to manually whitelist their repo locations once the config file is available ;)
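
To illustrate the repos.conf route, here is a minimal Python sketch (not gcrud code). It assumes
repos.conf is either a single INI file or a directory of *.conf fragments, and that each repository
section carries a `location` key. The function names `repo_locations` and `is_inside_repo` are
made up for this example.

```python
import configparser
from pathlib import Path


def repo_locations(repos_conf):
    """Return the sorted `location` values found in a repos.conf file or directory."""
    root = Path(repos_conf)
    # Assumption: a directory holds *.conf fragments; otherwise treat it as one file.
    files = sorted(root.glob("*.conf")) if root.is_dir() else [root]
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(files)  # missing or unreadable files are silently skipped
    locations = set()
    for name in parser.sections():  # sections() does not include [DEFAULT]
        location = parser.get(name, "location", fallback=None)
        if location:
            locations.add(location)
    return sorted(locations)


def is_inside_repo(path, locations):
    """True if `path` is a repository root or lies somewhere below one."""
    p = Path(path)
    return any(p == Path(loc) or Path(loc) in p.parents for loc in locations)
```

Anything for which `is_inside_repo()` returns true could then be skipped before it reaches the
removal suggestions.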

You could whitelist directories containing a .keep file, although I'm not sure how to specify
that.
The same goes for git repositories: I’d rather delete a whole git repo or nothing at all inside
it, so consider adding a rule that can be read as "pick the parent dir of a .git dir to suggest
for deletion, ignore all children of said parent".
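
A rough Python sketch of both rules, just to make the idea concrete. It assumes absolute paths and
checks the live filesystem; `collapse_git_repos` and `has_keep_file` are hypothetical names, and
nested repositories (submodules) resolve to the innermost work tree.

```python
import os


def collapse_git_repos(paths):
    """Replace every path inside a git work tree with the work tree's root."""
    result = []
    seen = set()
    for path in paths:
        candidate = path
        # Walk upwards looking for a directory that has a .git entry.
        current = path if os.path.isdir(path) else os.path.dirname(path)
        while True:
            if os.path.exists(os.path.join(current, ".git")):
                candidate = current
                break
            parent = os.path.dirname(current)
            if parent == current:  # reached the filesystem root
                break
            current = parent
        if candidate not in seen:  # report each repo (or stray file) once
            seen.add(candidate)
            result.append(candidate)
    return result


def has_keep_file(directory):
    """True if the directory directly contains a file named .keep."""
    return os.path.isfile(os.path.join(directory, ".keep"))
```

With this, thousands of entries under one checkout shrink to a single line in the report, and a
directory with a .keep file can be dropped from the suggestions entirely.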

> The idea is to move to everything in the whitelist.c file to a declarative (no code unless you
> count RE) configuration file. I have not decided on a format but I am leaning towards INI-style
> because GLib2 has a parser for that built-in. The config file will specify exact paths, RE, and
> globs. There will be a default dynamic list generated at runtime based on what packages you have
> installed (as gcruft had this feature).

That will be nice, waiting for it ;) Something basic might be enough for running batches of tests
before settling on a definitive format.
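
As a strawman for "something basic", here is a Python sketch of an INI-style whitelist with exact
paths, globs and regular expressions. The section names ([exact], [glob], [regex]), the key names
and the sample entries are all invented for illustration; nothing here reflects a format gcrud
has actually chosen.

```python
import configparser
import fnmatch
import re

# Hypothetical whitelist; keys are free-form labels, values are the patterns.
SAMPLE = """\
[exact]
config-archive = /etc/config-archive

[glob]
repos = /var/db/repos/*

[regex]
modules = ^/lib(32|64)?/modules/
"""


def load_whitelist(text):
    """Parse the INI text into exact paths, glob patterns and compiled regexes."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written
    parser.read_string(text)

    def values(section):
        return list(parser[section].values()) if parser.has_section(section) else []

    return {
        "exact": set(values("exact")),
        "glob": values("glob"),
        "regex": [re.compile(v) for v in values("regex")],
    }


def is_whitelisted(path, whitelist):
    """True if the path matches an exact entry, a glob or a regex."""
    if path in whitelist["exact"]:
        return True
    # fnmatch's "*" also matches "/", so one glob covers a whole subtree.
    if any(fnmatch.fnmatchcase(path, g) for g in whitelist["glob"]):
        return True
    return any(r.search(path) for r in whitelist["regex"])
```

Something this small would already be enough to test whitelist batches against a real out.log.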

>> I also caught some wrongly listed files because of the multilib system with /lib symlink.
>> For example, dhcpcd declared /lib/dhcpcd/dhcpcd-hooks, thus the realpath /lib64/dhcpcd/dhcpcd-hooks
>> was listed in the removal suggestion. This should be fixed with profile 17.1
>
> The /lib vs /lib64 issue will be resolved in a later version. I think I need to use lstat()
> everywhere instead of stat(), or I can call realpath() prior to storing values in the set. This
> file should be whitelisted, but only if you have dhcpcd installed (I've long since moved to dhcpd).

I’m in favor of the realpath suggestion; it will be useful for any path accessed through a symlink.
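
To show what I mean, a small Python sketch of the normalization step (the real thing would be
realpath() in C). It resolves symlinks in the directory part only, so a leaf that is itself a
symlink stays a distinct entry; that choice and the names `canonical` and `unowned` are my own
assumptions.

```python
import os


def canonical(path):
    """Resolve symlinks in the directory part, keep the last component as-is."""
    directory, leaf = os.path.split(path)
    return os.path.join(os.path.realpath(directory), leaf)


def unowned(found, owned):
    """Paths found on disk that no package claims, compared in canonical form."""
    owned_set = {canonical(p) for p in owned}
    return sorted({canonical(p) for p in found} - owned_set)
```

With /lib pointing to /lib64, a package that declares /lib/dhcpcd/dhcpcd-hooks then matches the
/lib64/dhcpcd/dhcpcd-hooks entry found on disk, and it no longer shows up as cruft.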

>> The log is so huge at the moment it is useless for me :/
>>
>> % wc -l out.log
>> 461575 out.log
>
> Any thoughts on how to simplify analysis?

A few, but I’m not sure I have many that are /universal/ across Gentoo systems.
Do you plan to integrate the sorting part into gcrud directly?
If so, I’d suggest showing /usr/* entries first, because unowned files there should be the
exception.
The same goes for /lib, but things like kernel modules should be treated carefully: we can either
whitelist the whole of /lib{,32,64}/modules, or try to be smart and select old kernel modules only.
This might be tricky given the number of ways someone can manage them.
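
A quick Python sketch of the ordering I have in mind: /usr first, then the lib directories, then
everything else alphabetically. The priority list and the name `sort_for_review` are only an
illustration and would need tuning.

```python
# Hypothetical review order: top-level directories listed here come first.
PRIORITY = ["usr", "lib", "lib32", "lib64"]


def top_level(path):
    """Return the first path component, e.g. 'usr' for '/usr/bin/foo'."""
    parts = path.split("/")
    return parts[1] if len(parts) > 1 else ""


def sort_for_review(paths, priority=PRIORITY):
    """Sort paths by top-level directory priority, then alphabetically."""
    rank = {name: index for index, name in enumerate(priority)}
    return sorted(paths, key=lambda p: (rank.get(top_level(p), len(rank)), p))
```

Kernel modules are deliberately left out of this sketch, since picking "old" ones depends on how
each system manages its kernels.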

Also, here is a small analysis of the file locations reported by gcrud.

% cut -d/ -f2 out.log|uniq -c
    295 etc
   3309 lib64
   1178 lib
     13 opt
  39586 usr
 417194 var

/var contains my various repos, so it's logical that it has the most occurrences.
Next comes /usr, which has another lib{,32,64} scheme with /usr/lib pointing to /usr/lib64, and
Go packages installed (in /usr/lib64/go).
Given this information, I suppose most entries will disappear once realpath is used or I switch
to the 17.1 profile.
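
If the sorting ends up scripted, the per-directory tally above is easy to reproduce in Python.
This is only a sketch with a made-up function name; unlike `uniq -c`, which merges adjacent lines
only, it does not depend on the log being grouped by directory.

```python
from collections import Counter


def tally_top_level(lines):
    """Count log lines per top-level directory, like `cut -d/ -f2 | sort | uniq -c`."""
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("/")
        if len(parts) > 1:  # skip lines that contain no "/" at all
            counts[parts[1]] += 1
    return counts
```

Passing an open out.log to `tally_top_level()` and calling `most_common()` on the result lists
the directories from largest to smallest.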

Thanks for your work, this will probably be an excellent tool in a few commits ;)

Regards,
Corentin “Nado” Pazdera