Gentoo Archives: gentoo-user

From: Florian Philipp <lists@×××××××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Fast file system for cache directory with lot's of files
Date: Tue, 14 Aug 2012 15:12:15
Message-Id: 502A6A39.4000800@binarywings.net
In Reply to: Re: [gentoo-user] Fast file system for cache directory with lot's of files by Daniel Troeder
On 14.08.2012 15:54, Daniel Troeder wrote:
> On 14.08.2012 11:46, Neil Bothwick wrote:
>> On Tue, 14 Aug 2012 10:21:54 +0200, Daniel Troeder wrote:
>>
>>> There is also the possibility to write a really small daemon (less than
>>> 50 lines of C) that registers with inotify for the entire fs and
>>> journals the file activity to a sqlite-db.
>>
>> sys-process/incron ?
> Uh... didn't know that one! ... very interesting :)
>
> Have you used it?
> How does it perform if there are lots of modifications going on?
> Does it have a throttle against fork bombing?
> must-read-myself-a-little.....
>
> An incron line
> # sqlite3 /file.sql 'INSERT INTO table (filename, date) VALUES (...)'
> would be inefficient, because it spawns lots of processes, but it would
> be very nice to simply test out the idea. Then a
> # sqlite3 /file.sql 'SELECT filename FROM table WHERE date < date-30days'
> or something to get the files older than 30 days, and voilà :)
>
>

Maybe inotifywait is better for this kind of batch job.

Collecting events:
inotifywait -rm -e CREATE,DELETE --timefmt '%s' --format \
    "$(printf '%%T\t%%e\t%%w%%f')" /tmp > events.tbl
# The printf is there because inotifywait's format does not
# recognize common escapes like \t.
# Output format:
# Seconds since epoch \t CREATE/DELETE \t file name \n
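
To see what the printf trick actually hands to --format, here is a small sketch that builds the same format string and makes the tabs visible. The variable names fmt and visible are only for illustration; the pipe character stands in for a tab.

```shell
# Build the format string the same way as in the inotifywait call.
# printf turns \t into a real tab and %% into a literal percent sign.
fmt="$(printf '%%T\t%%e\t%%w%%f')"

# Make the tabs visible by replacing them with a pipe character.
visible="$(printf '%s' "$fmt" | tr '\t' '|')"
printf '%s\n' "$visible"   # prints: %T|%e|%w%f
```

So inotifywait receives %T, %e and %w%f separated by real tab characters, which is what makes events.tbl tab-separated.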

Filtering events:
sort --stable -k3 events.tbl |
awk '
function update() {
    line=$0; exists= $2=="DELETE" ? 0 : 1; file=$3
}
NR==1{ update(); next }
{ if($3!=file && exists==1){ print line } update() }
END{ if(NR>0 && exists==1){ print line } }'
# Sorts by file name while preserving temporal order.
# Uses awk to suppress output of files that have been deleted.
# The END rule prints the last file in the sorted input, which the
# main rule alone would never reach.
# Output: Last CREATE event for each existing file
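
As a worked example of the filtering step, here is a sketch that runs it on a handful of made-up events. The sample data, the temporary file and the variable names are assumptions for illustration. The sketch uses sort -s (the short form of --stable) and includes an END rule so that the last file in the sorted input is reported as well.

```shell
# Hypothetical sample data in the collector's format:
# epoch seconds <TAB> event <TAB> file name
events="$(mktemp)"
printf '%s\t%s\t%s\n' \
    100 CREATE /tmp/a \
    110 CREATE /tmp/b \
    120 DELETE /tmp/a \
    130 CREATE /tmp/c \
    140 DELETE /tmp/c \
    150 CREATE /tmp/c > "$events"

# Group events by file name, keep temporal order within each file, and
# print the last event of every file whose last event is not a DELETE.
survivors="$(LC_ALL=C sort -s -k3 "$events" | awk '
    function update() {
        line = $0; exists = ($2 == "DELETE") ? 0 : 1; file = $3
    }
    NR == 1 { update(); next }
    {
        if ($3 != file && exists == 1) { print line }
        update()
    }
    # Without this rule the last file in the input is never reported.
    END { if (NR > 0 && exists == 1) { print line } }')"

printf '%s\n' "$survivors"
rm -f "$events"
```

With this input, /tmp/a is dropped because its last event is a DELETE, /tmp/b is kept, and /tmp/c is kept with its second CREATE (timestamp 150).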
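
Retrieving files created 30+ days ago (reads the filtered output on stdin):
awk -v newest="$(date -d '-30 days' +%s)" '
$1>newest{ next }
{ print $3 }'
# The filtered output is sorted by file name, not by time, so newer
# entries are skipped one at a time with next instead of nextfile.
# For a quick test, use a shorter interval such as -d '-5 seconds'.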
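
To illustrate the age cutoff, here is a sketch with made-up filtered output and a fixed cutoff value so that the result is reproducible. The sample data and the variable names are assumptions. Since the input is ordered by file name rather than by time, the sketch skips newer entries one by one with next.

```shell
# Hypothetical filtered output, sorted by file name (not by time):
# epoch seconds <TAB> CREATE <TAB> file name
filtered="$(printf '%s\t%s\t%s\n' \
    500 CREATE /tmp/a \
    2000 CREATE /tmp/b \
    900 CREATE /tmp/c)"

# Fixed cutoff for the example. In real use this could be
# cutoff="$(date -d '-30 days' +%s)" with GNU date.
cutoff=1000

old_files="$(printf '%s\n' "$filtered" | awk -v cutoff="$cutoff" '
    $1 > cutoff { next }   # too new: skip this line, keep reading
    { print $3 }')"

printf '%s\n' "$old_files"
```

Here /tmp/a and /tmp/c are printed. Stopping at the first newer entry (/tmp/b) would have missed /tmp/c.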

Remarks:

The awk scripts split fields on any whitespace, so they need some
improvement before they can handle whitespace in file names. Because the
input format is tab-separated, they should then be able to work with
everything except newlines.

Inotifywait itself is utterly useless when dealing with newlines in file
names unless you want to put some serious effort into sanitizing the output.

Regards,
Florian Philipp
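
One possible improvement, sketched here with a made-up event line, is to tell awk to split on tabs only. The file name and the variable names are assumptions for illustration.

```shell
# Hypothetical event line whose file name contains a space.
line="$(printf '100\tCREATE\t/tmp/my file')"

# Default field splitting breaks the name at the space.
default_name="$(printf '%s\n' "$line" | awk '{ print $3 }')"

# Splitting on tabs only keeps the whole name in field 3.
tab_name="$(printf '%s\n' "$line" | awk -F '\t' '{ print $3 }')"

printf '%s\n' "$default_name"   # prints: /tmp/my
printf '%s\n' "$tab_name"       # prints: /tmp/my file
```

The sort call can be given the same separator with -t "$(printf '\t')", so that -k3 starts exactly at the file name.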
