Gentoo Archives: gentoo-dev

From: Angelo Arrifano <miknix@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC][NEW] Utility to find orphaned files
Date: Sun, 25 Apr 2010 17:10:38
Message-Id: 4BD47770.8050308@gentoo.org
In Reply to: Re: [gentoo-dev] [RFC][NEW] Utility to find orphaned files by Yuri Vasilevski
1 On 25-04-2010 17:34, Yuri Vasilevski wrote:
2 > Hello,
3 >
4 > On Sun, 25 Apr 2010 13:18:25 +0200
5 > Angelo Arrifano <miknix@g.o> wrote:
6 >
7 >> Hello developers developers and developers,
8 >>
9 >> Ever wondered how much crap is left in your X-years old Gentoo box?
10 >>
11 >> I just developed a python utility to efficiently find orphaned files
12 >> in the system. By orphaned files I mean the files that are present on
13 >> system directories and don't belong to any installed package.
14 >>
15 >> The package builds a virtual filesystem (cache) in RAM using
16 >> python hash tables. Then it uses the cache to find the ownership of
17 >> files inside user-specified dirs.
18 >>
19 >> Building the cache takes less than 10 seconds here in a system with
20 >> 1366 installed packages.
21 >>
22 >> This is not intended to be a finished program yet; I'm looking forward
23 >> to your constructive comments.
24 >
25 > There is a tool that does that, qfile from app-portage/portage-utils.
26 > Check the "-o, --orphans * List orphan files" option.
27 >
28 > It's not as straightforward as it could be, as it checks only for
29 > files specified as arguments or read from file.
30 >
31 > But you can trivially use it like:
32 > # find /dir/you/want/to/check/for/orphans | qfile -o -f -
33 >
34 > Best,
35 > Yuri.
36 >
37
38 Based on the comments so far, I'll try to make my PoC a better tool.
39 My primary objective is to make this some kind of disk cleanup utility
40 for Gentoo boxens. I don't expect Gentoo systems to be *that* polluted
41 but sometimes we all have to do ugly things to fix broken systems real
42 fast, if you know what I mean.
43
44 Other things came to mind as well, like using the stored hashes to
45 check the integrity of system files (as in security).
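
A rough sketch of the integrity idea, not code from the utility itself. It assumes the stored hashes are the MD5 sums Portage records in each package's CONTENTS file, in lines of the form "obj <path> <md5> <mtime>". The helper names (parse_obj_line, file_md5, modified_files) and the root parameter are made up for illustration.

```python
import hashlib
from pathlib import Path


def parse_obj_line(line):
    """Parse 'obj <path> <md5> <mtime>'; return (path, md5), or None for other entries."""
    if not line.startswith("obj "):
        return None
    # Split from the right so that paths containing spaces stay intact.
    path, md5, _mtime = line[4:].rstrip("\n").rsplit(" ", 2)
    return path, md5


def file_md5(path, chunk_size=65536):
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def modified_files(contents_lines, root="/"):
    """Yield recorded paths that are missing or whose MD5 no longer matches."""
    for line in contents_lines:
        parsed = parse_obj_line(line)
        if parsed is None:
            continue
        path, recorded = parsed
        target = Path(root) / path.lstrip("/")
        if not target.is_file() or file_md5(target) != recorded:
            yield path
```

Note that MD5 catches accidental changes but is weak against deliberate tampering, and files under CONFIG_PROTECT will legitimately differ, so the result is only a list of candidates to look at.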
46
47 My next steps for this utility will be:
48 * Follow harring's suggestion and use the available PM API.
49 * Make the application handle symlinks so we start getting more
50 informative output.
51 * Store the generated cache on disk and regenerate it only when needed.
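
To make the last two items concrete, here is a minimal sketch under several assumptions. It parses CONTENTS-style lines directly instead of going through a PM API, so it does not cover the first item. It treats symlinked directories (such as /lib pointing to /lib64) by resolving only the parent directory of each path. It uses the newest mtime of the package database directory and its category subdirectories as a crude "has anything changed" signal. All function names and the JSON cache layout are hypothetical.

```python
import json
import os


def owned_paths(contents_lines):
    """Collect every path named by 'dir', 'obj' and 'sym' entries into a set."""
    owned = set()
    for line in contents_lines:
        kind, _, rest = line.rstrip("\n").partition(" ")
        if kind == "dir":
            owned.add(rest)
        elif kind == "obj":      # obj <path> <md5> <mtime>
            owned.add(rest.rsplit(" ", 2)[0])
        elif kind == "sym":      # sym <path> -> <target> <mtime>
            owned.add(rest.split(" -> ", 1)[0])
    return owned


def canonical(path):
    """Resolve symlinks in the parent directory only, keeping the final name."""
    parent, name = os.path.split(path)
    return os.path.join(os.path.realpath(parent), name)


def find_orphans(top, owned):
    """Yield files (not directories) under top that no package claims."""
    resolved = {canonical(p) for p in owned}
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if canonical(path) not in resolved:
                yield path


def vdb_stamp(vdb_dir):
    """Newest mtime among vdb_dir and its immediate subdirectories (a heuristic)."""
    stamps = [os.stat(vdb_dir).st_mtime_ns]
    with os.scandir(vdb_dir) as entries:
        stamps += [e.stat().st_mtime_ns for e in entries if e.is_dir()]
    return max(stamps)


def load_owned(cache_file, vdb_dir, build):
    """Return the owned set, calling build() only when the cache is stale or unreadable."""
    stamp = vdb_stamp(vdb_dir)
    try:
        with open(cache_file) as fh:
            cached = json.load(fh)
        if cached["stamp"] == stamp:
            return set(cached["owned"])
    except (OSError, ValueError, KeyError, TypeError):
        pass  # no usable cache: fall through and rebuild
    owned = build()
    with open(cache_file, "w") as fh:
        json.dump({"stamp": stamp, "owned": sorted(owned)}, fh)
    return owned
```

Here build would be whatever function walks the installed-package database and returns the set of owned paths. The set lookup is what keeps each ownership check cheap, in the same spirit as the hash-table cache described earlier in the thread.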
52
53 Regards,
54 - Angelo