Gentoo Archives: gentoo-user

From: Alan McKinnon <alan.mckinnon@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] slocate masked
Date: Wed, 17 Nov 2010 21:57:17
Message-Id: 201011172355.29238.alan.mckinnon@gmail.com
In Reply to: Re: [gentoo-user] slocate masked by Paul Hartman
1 Apparently, though unproven, at 23:00 on Wednesday 17 November 2010, Paul
2 Hartman did opine thusly:
3
4 > On Wed, Nov 17, 2010 at 2:35 PM, Mick <michaelkintzios@×××××.com> wrote:
5
6 > > Why is the second time so much faster? The size of the derived db was
7 > > the same on both occasions.
8 >
9 > I guess caching like Volker said too. What happens if you do something
10 > like this twice:
11 >
12 > sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
13
14 Now I'm intrigued. I did some quick and nasty tests.
15
16 First, mlocate's updatedb. No measures taken to invalidate caches etc:
17
18 # time updatedb
19 real 0m39.265s
20 user 0m2.245s
21 sys 0m0.228s
22
23
24 Then unmerge mlocate, emerge slocate, delete all dbs, run slocate's updatedb
25 twice:
26
27 # rm /var/lib/[ms]locate/*db
28 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
29 real 1m35.365s
30 user 0m5.941s
31 sys 0m0.383s
32 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
33 real 1m34.929s
34 user 0m5.925s
35 sys 0m0.377s
36
37 slocate seems quicker than mlocate was in the few tests I'd already done and,
38 since both runs took the same time, it evidently has no optimizations to re-use
39 existing correct data in the db. Now unmerge slocate, merge mlocate, do not
39 delete dbs and run mlocate's updatedb twice:
40
41 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
42 real 3m50.574s
43 user 0m7.277s
44 sys 0m0.361s
45 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
46 real 1m5.830s
47 user 0m2.088s
48 sys 0m0.173s
49
50 Second run definitely quicker as it only has to read the fs, not write the
51 entire index as well. But that initial run ... The old slocate db was still
52 around, possibly affecting the first run, so delete both dbs and run
53 mlocate's updatedb twice:
54
55 # rm /var/lib/[ms]locate/*db
56 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
57 real 3m51.592s
58 user 0m7.249s
59 sys 0m0.350s
60 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
61 real 1m7.662s
62 user 0m1.997s
63 sys 0m0.159s
64
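The cold-cache procedure repeated in these runs (sync, drop caches, time the
command) can be wrapped in a small helper. This is only a minimal sketch, not
what was used for the tests above: the function name cold_time and the
DROP_CACHES_FILE override are made up for illustration, and the whole-second
resolution of date +%s is much coarser than what time reports.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times, dropping caches before each run.
# Must run as root for the write to drop_caches to succeed.

# Assumption: DROP_CACHES_FILE is an illustrative override so the helper can be
# tried out without touching the real kernel knob.
DROP_CACHES_FILE=${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}

cold_time() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        # Flush dirty pages first so the drop can free as much as possible.
        sync
        # 3 = drop the page cache plus dentries and inodes.
        { echo 3 > "$DROP_CACHES_FILE"; } 2>/dev/null ||
            echo "warning: could not write $DROP_CACHES_FILE" >&2
        start=$(date +%s)
        "$@"
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        i=$((i + 1))
    done
}
```

Run as root, something like "cold_time 2 updatedb" would reproduce the
two-run pattern used above.
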
65 Almost identical to the prior test, so the presence of slocate's db has no
66 effect on mlocate. Then I realized I hadn't measured how long they took to
67 reindex a largely cached fs so I tried that with both, deleting the dbs at
68 each test:
69
70 slocate:
71 # rm /var/lib/[ms]locate/*db
72 rm: cannot remove `/var/lib/[ms]locate/*db': No such file or directory
73 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
74 real 1m34.341s
75 user 0m5.929s
76 sys 0m0.397s
77 # time updatedb
78 real 0m2.454s
79 user 0m0.855s
80 sys 0m1.569s
81
82 mlocate:
83 # rm /var/lib/[ms]locate/*db
84 # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
85 real 3m54.792s
86 user 0m7.215s
87 sys 0m0.350s
88 # time updatedb
89 real 0m0.538s
90 user 0m0.302s
91 sys 0m0.232s
92
93 0.5 seconds vs 2.5 seconds. Wow.
94
95 Conclusions:
96
97 1. mlocate is slow at building its db from scratch - about 250% as long as
98 slocate on the same task.
99 2. mlocate is faster at reindexing a largely-unchanged fs - it does it in
100 about 66% of the time slocate took.
101 3. mlocate is insanely quick at reindexing a db that is in cache.
102
103 #1 is rare - most systems will only do it once
104 #3 is silly and does not represent anything close to reality
105 #2 is pretty realistic and a 33% performance boost is significant
106
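The percentages above are rounded. They can be rechecked from the "real"
times with a small sketch like the one below (the helper names to_seconds and
percent_of are made up here). Using the second mlocate-only test and the first
slocate run, it gives roughly 243% for the from-scratch build and 71% for the
reindex, close to the figures quoted:

```shell
#!/bin/sh
# Convert a time(1)-style value such as "3m51.592s" to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# percent_of A B: print A as a percentage of B, rounded to a whole number.
percent_of() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.0f%%\n", 100 * a / b }'
}

# mlocate from scratch vs slocate from scratch
percent_of 3m51.592s 1m35.365s    # prints 243%
# mlocate reindex (second run) vs slocate
percent_of 1m7.662s 1m35.365s     # prints 71%
```
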
107 I have no idea where the speed increase in #3 comes from. This is an ext4 fs -
108 does ext4 keep an in-memory hash of inodes it reads? It seems to me that would
109 be a very clever and very useful thing for an fs to do.
110
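On that last question: as far as I know this caching is not specific to ext4.
The kernel's VFS layer keeps dentry and inode caches in memory for all
filesystems, and writing 3 to drop_caches discards them along with the page
cache. A rough sketch for watching those caches between runs, reading
/proc/sys/fs/inode-nr and /proc/sys/fs/dentry-state (the function name
vfs_cache_counts and its optional file arguments are made up for
illustration):

```shell
#!/bin/sh
# Hypothetical helper: print how many inodes and dentries the kernel
# currently has allocated. The two optional arguments exist only so the
# function can be pointed at sample files instead of /proc.
vfs_cache_counts() {
    inode_file=${1:-/proc/sys/fs/inode-nr}
    dentry_file=${2:-/proc/sys/fs/dentry-state}
    # inode-nr: allocated inodes, then free inodes
    read -r inodes free_inodes < "$inode_file"
    # dentry-state: allocated dentries, unused dentries, then other fields
    read -r dentries unused_dentries rest < "$dentry_file"
    echo "inodes=$inodes dentries=$dentries"
}
```

Calling vfs_cache_counts right after dropping caches, and again after a cold
updatedb, should show both counts growing as the tree is walked, which would
fit the very fast warm runs.
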
111
112 --
113 alan dot mckinnon at gmail dot com
