Gentoo Archives: gentoo-user

From: "Corentin “Nado” Pazdera" <nado@××××××××××.be>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Replacement for gcruft: gcrud
Date: Fri, 17 Aug 2018 12:16:10
Message-Id: f99a251397749798caff3237d9bd4be7@troglodyte.be
In Reply to: Re: [gentoo-user] Replacement for gcruft: gcrud by Andrew Udvare
August 17, 2018 1:09 AM, "Andrew Udvare" <audvare@×××××.com> wrote:

> The whitelist is the biggest work in progress right now. Most of what it lists from /etc for me is
> /etc/config-archive which AFAIK is not managed by Portage at all although Portage will place old
> files there? I don't use the feature because my /etc is controlled by Git. The stuff listed in
> /var/ is pretty accurate as there's a lot of old website cruft and this computer does not serve
> anything like that anymore.

Well, for example I use eselect-repository, which puts repos in /var/dbr/repos. I put the Gentoo tree
in there as well, and the whole tree is suggested for deletion.
A solution would be to read the /etc/portage/repos.conf file(s) for repo locations during the runtime
detection, or to use the portageq interface.
Or tell people to whitelist their repo locations manually once the config file is available ;)

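To illustrate the repos.conf idea, here is a minimal sketch in Python (not gcrud code). It assumes
repos.conf is INI-like, with one section per repository and a `location` key in each; the function
name `repo_locations` is made up for the example. Going through portageq instead would avoid parsing
the file by hand.

```python
import configparser
from pathlib import Path


def repo_locations(repos_conf):
    """Map repository name -> location from a repos.conf file or directory."""
    path = Path(repos_conf)
    # repos.conf may be a single file or a directory of files.
    if path.is_dir():
        files = sorted(p for p in path.iterdir() if p.is_file())
    else:
        files = [path]
    # Disable interpolation so a literal '%' in a value is not special.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(files)
    # sections() leaves out [DEFAULT]; skip sections without a location.
    return {
        name: parser[name]["location"]
        for name in parser.sections()
        if "location" in parser[name]
    }
```

The values of `repo_locations("/etc/portage/repos.conf")` could then be added to the whitelist
before the scan starts.
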
You could add directories containing a .keep file to the whitelist, although I'm not sure how to
specify it.
The same goes for git repositories: I’d rather delete a whole git repo or nothing at all inside it,
so a rule along the lines of "pick the parent dir of a .git dir to suggest for deletion, ignore all
children of said parent" would help.

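Both rules could look roughly like the Python sketch below. It is only an illustration under some
assumptions: "contains a .keep file" is read as "has an entry whose name starts with `.keep`", a
`.git` entry marks a repository root, and the names `git_root`, `has_keep_file` and `collapse` are
hypothetical.

```python
from pathlib import Path


def git_root(path, top=Path("/")):
    """Return the nearest ancestor of path that holds a .git entry, or None."""
    for parent in path.parents:
        if (parent / ".git").exists():
            return parent
        if parent == top:
            break  # do not search above the scan root
    return None


def has_keep_file(directory):
    """True if directory has an entry whose name starts with '.keep'."""
    return directory.is_dir() and any(
        entry.name.startswith(".keep") for entry in directory.iterdir()
    )


def collapse(candidates, top=Path("/")):
    """Reduce raw candidate paths to the entries worth showing."""
    shown, seen = [], set()
    for raw in candidates:
        path = Path(raw)
        root = git_root(path, top)
        if root is not None:
            target = root  # suggest the whole repository, not its children
        elif has_keep_file(path.parent):
            continue  # directory is marked as kept, skip its files
        else:
            target = path
        if target not in seen:
            seen.add(target)
            shown.append(target)
    return shown
```

With this, thousands of lines from one checked-out repository shrink to a single entry for its
top directory.
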
> The idea is to move to everything in the whitelist.c file to a declarative (no code unless you
> count RE) configuration file. I have not decided on a format but I am leaning towards INI-style
> because GLib2 has a parser for that built-in. The config file will specify exact paths, RE, and
> globs. There will be a default dynamic list generated at runtime based on what packages you have
> installed (as gcruft had this feature).

That will be nice, waiting for it ;) Something basic might be enough for running batches of tests
before choosing a definitive format.

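As an idea of what "something basic" could be, here is a small Python sketch of an INI-style
whitelist with exact paths, globs and regular expressions. The section name, the key names, the
`;` separator and the example entries are all invented for illustration; the real format is still
undecided.

```python
import configparser
import fnmatch
import re

# Hypothetical file content: one section, ';'-separated lists.
EXAMPLE = """\
[whitelist]
paths = /etc/example.conf
globs = /etc/config-archive/*
regex = ^/lib(32|64)?/modules/
"""


def load_whitelist(text):
    """Parse the whitelist text into exact paths, globs and compiled regexes."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["whitelist"]

    def items(key):
        return [v.strip() for v in section.get(key, "").split(";") if v.strip()]

    return {
        "paths": set(items("paths")),
        "globs": items("globs"),
        "regex": [re.compile(r) for r in items("regex")],
    }


def is_whitelisted(path, wl):
    """True if path matches an exact entry, a glob or a regex."""
    if path in wl["paths"]:
        return True
    # Note: fnmatch's '*' also matches '/', unlike a shell glob.
    if any(fnmatch.fnmatchcase(path, g) for g in wl["globs"]):
        return True
    return any(r.search(path) for r in wl["regex"])
```

A file like this would be enough to try different rule sets against the same out.log.
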
>> I also caught some wrongly listed files because of the multilib system with /lib symlink.
>> For example, dhcpcd declared /lib/dhcpcd/dhcpcd-hooks, thus the realpath /lib64/dhcpcd/dhcpcd-hooks
>> was listed in the removal suggestion. This should be fixed with profile 17.1
>
> The /lib vs /lib64 issue will be resolved in a later version. I think I need to use lstat()
> everywhere instead of stat(), or I can call realpath() prior to storing values in the set. This
> file should be whitelisted, but only if you have dhcpcd installed (I've long since moved to dhcpd).

I’m in favor of the realpath suggestion; it will be useful for any path accessed through a symlink.

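To show the effect on the dhcpcd case, here is a minimal Python sketch (gcrud would do the same
with realpath() in C). One choice in it is my own assumption: only the directory part is resolved,
so a file that is itself a symlink keeps its own name. `canonical` and `unowned` are hypothetical
names.

```python
import os


def canonical(path):
    """Resolve symlinks in the directory part only, keeping the final name."""
    directory, name = os.path.split(path)
    return os.path.join(os.path.realpath(directory), name)


def unowned(found, owned):
    """Return the found paths that no owned path refers to."""
    # Canonicalize before storing, so /lib/... and /lib64/... compare equal
    # when /lib is a symlink to lib64.
    owned_set = {canonical(p) for p in owned}
    return [p for p in found if canonical(p) not in owned_set]
```

With /lib pointing to lib64, an owned /lib/dhcpcd/dhcpcd-hooks and a found
/lib64/dhcpcd/dhcpcd-hooks end up as the same entry and the file is no longer reported.
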
>> The log is so huge at the moment it is useless for me :/
>>
>> % wc -l out.log
>> 461575 out.log
>
> Any thoughts on how to simplify analysis?

A few, but I’m not sure many of them are /universal/ across Gentoo systems.
Do you plan to integrate the sorting part in gcrud directly?
If so, I’d suggest showing /usr/* stuff first, because unowned files there should be
exceptions.
The same goes for /lib, but stuff like kernel modules should be treated carefully: we can either
whitelist the whole /lib{,32,64}/modules, or try being smart and select old kernel modules only.
This might be tricky given the number of ways someone can manage them.

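The ordering part could be as simple as the Python sketch below. The priority list is just my
suggestion from above written down, and `PRIORITY`, `sort_key` and `ordered` are made-up names.

```python
# Top-level directories to show first, in this order; everything else follows.
PRIORITY = ["usr", "lib", "lib32", "lib64"]


def sort_key(path):
    """Rank a path by its top-level directory, then alphabetically."""
    top = path.split("/", 2)[1] if path.startswith("/") else ""
    rank = PRIORITY.index(top) if top in PRIORITY else len(PRIORITY)
    return (rank, path)


def ordered(paths):
    """Return paths with the priority directories first."""
    return sorted(paths, key=sort_key)
```

Anything under /lib{,32,64}/modules would still need its own rule before being shown.
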
Also, here is a small analysis of file locations reported by gcrud.

% cut -d/ -f2 out.log|uniq -c
    295 etc
   3309 lib64
   1178 lib
     13 opt
  39586 usr
 417194 var

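Note that uniq -c only counts adjacent duplicates, so the pipeline above relies on out.log being
grouped by top-level directory already. If the counting moves into a script, something like this
Python sketch does not depend on the order (the function name is hypothetical):

```python
from collections import Counter


def count_by_top_level(lines):
    """Count paths per top-level directory, like `cut -d/ -f2 | sort | uniq -c`."""
    counts = Counter()
    for line in lines:
        parts = line.strip().split("/")
        # parts[0] is the empty string before the leading '/'.
        if len(parts) > 1 and parts[1]:
            counts[parts[1]] += 1
    return counts


# Possible usage:
#   with open("out.log") as log:
#       for name, n in count_by_top_level(log).most_common():
#           print(f"{n:7d} {name}")
```
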
Since /var contains my various repos, it's logical that it has the most occurrences.
Next comes /usr, which contains another lib{,32,64} scheme with /usr/lib pointing to /usr/lib64, and
has Go packages installed (in /usr/lib64/go).
With this information, I suppose most entries will disappear when using realpath or switching to the
17.1 profile.

Thanks for your work, this will probably be an excellent tool in a few commits ;)

Regards,
Corentin “Nado” Pazdera