Gentoo Archives: gentoo-user

From: "Boyd Stephen Smith Jr." <bss03@××××××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] I have 146,000 files in lost+found. How do I sort them?
Date: Tue, 26 Sep 2006 13:25:53
Message-Id: 200609260820.28705.bss03@volumehost.net
1 On Monday 25 September 2006 22:55, Robert Persson <ireneshusband@×××××.com>
2 wrote about '[gentoo-user] I have 146,000 files in lost+found. How do I
3 sort them?':
4 > Am I likely to find many usable files in that /lost+found directory?
5
6 Maybe. I tried to recover a corrupted ext3 /boot recently and was unable to
7 pull anything useful out of lost+found that was larger than a
8 symlink. :( If a number of files NOT in lost+found were corrupt, it's
9 likely most of the files in lost+found are corrupt as well.
10
11 That said, /boot data is generally easy to replace, so I put no effort into
12 recovering files that were corrupted. If the data was valuable, it might
13 be worth it to spend some time sorting those out.
14
15 > If I can, how can I best sift through them?
16
17 Carefully. :)
18
19 > Is there a utility, or
20 > something I could drop into a simple bash script, that would look at the
21 > first few bytes of the file and, say, identify it as a jpeg or an xml
22 > file, so that it could be given an appropriate file extension, deleted
23 > or moved?
24
25 As the other poster mentioned, the file utility is useful for identifying
26 the type of file. Keep in mind, though, that it only looks at the first few
27 bytes of the file; if there's corruption later on, file won't notice.
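A minimal sketch of the kind of script asked about, assuming a file(1) that supports -b and --mime-type. The function name sort_by_mime and the destination layout are made up for illustration. It copies rather than moves, so nothing in lost+found is changed, and the destination must live outside the source directory.

```shell
# Hypothetical helper: copy every regular file under $1 into $2/<mime-type>/.
# $2 must not be inside $1, or find may walk into the copies.
sort_by_mime() {
    find "$1" -type f -exec sh -c '
        dest=$1; shift
        for f in "$@"; do
            # -b prints only the type, e.g. image/jpeg
            mime=$(file -b --mime-type -- "$f")
            mkdir -p -- "$dest/$mime" && cp -p -- "$f" "$dest/$mime/"
        done
    ' sh "$2" {} +
}
```

It could be called as sort_by_mime /lost+found /mnt/scratch/sorted (both paths are placeholders). Files with the same name in different subdirectories would overwrite each other in the destination.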
28
29 > Or is there one that could distinguish a text file from a
30 > binary?
31
32 Of course, file does this to some extent. A MIME type of text/* is
33 generally text, while anything else is binary. But, file's output (by
34 default) isn't a simple "binary" or "text" string.
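A rough sketch of reducing file's answer to a plain "text" or "binary" string, assuming file(1) supports -b and --mime-type. The function name text_or_binary is hypothetical. Some textual formats are reported as application/* rather than text/*, so treat the result as a first pass only.

```shell
# Hypothetical helper: print "text" if file reports a text/* MIME type,
# otherwise print "binary".
text_or_binary() {
    case $(file -b --mime-type -- "$1") in
        text/*) echo text ;;
        *)      echo binary ;;
    esac
}
```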
35
36 Some of the GNU utilities that are meant for text files will complain
37 before operating on a binary file, so you could use those for this task,
38 possibly. (I'm thinking of less and grep.) In particular,
39 grep '[^[:print:]]' should return true when run against a file that
40 contains non-printable characters (like control characters or NUL, and,
41 depending on locale, non-7-bit-clean characters).
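The grep test could be wrapped up roughly like this. The function name has_nonprintable is made up, -q only suppresses output so the exit status can be tested, and forcing LC_ALL=C is an assumption that makes any non-ASCII byte count as non-printable. Note that a tab is not in [:print:] either, so text files containing tabs will be flagged too.

```shell
# Hypothetical helper: succeeds (exit 0) if the file has at least one
# character outside [:print:], fails (exit 1) if every line is printable.
has_nonprintable() {
    LC_ALL=C grep -q '[^[:print:]]' -- "$1"
}
```

For example, has_nonprintable somefile && echo "probably binary".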
42
43 > Are there any other strategies I could use to sift through these files
44 > (assuming it would be worth doing)?
45
46 Well, before you write some sort of bash script around file to rename
47 stuff, you'll probably want to remove anything that is clearly trash, like
48 device nodes or 0-length files. Something like:
49 find lost+found \( \! \( -type f -o -type d -o -type l \) -o -empty \) -delete
50 should work if you are using GNU find.
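Before deleting anything, it may be worth listing what would go. This is a sketch assuming GNU find (-mindepth and -empty are GNU extensions); the function name list_trash is made up. The outer parentheses matter: without them the action would bind only to -empty, because -o has lower precedence than the implicit "and".

```shell
# Hypothetical dry run: print everything under $1 that is neither a regular
# file, directory, nor symlink, plus any empty files or empty directories.
# -mindepth 1 keeps the starting directory itself off the list.
list_trash() {
    find "$1" -mindepth 1 \
        \( \! \( -type f -o -type d -o -type l \) -o -empty \) -print
}
```

Once the list looks right, the same expression with -delete in place of -print removes those entries.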
51
52 --
53 "If there's one thing we've established over the years,
54 it's that the vast majority of our users don't have the slightest
55 clue what's best for them in terms of package stability."
56 -- Gentoo Developer Ciaran McCreesh
