On Wednesday 17 November 2010 21:55:28 Alan McKinnon wrote:
> Apparently, though unproven, at 23:00 on Wednesday 17 November 2010, Paul
> Hartman did opine thusly:
> > On Wed, Nov 17, 2010 at 2:35 PM, Mick <michaelkintzios@×××××.com> wrote:
> > > Why is the second time so much faster? The size of the derived db was
> > > the same on both occasions.
> >
> > I guess caching like Volker said too. What happens if you do something
> > like this twice:
> >
> > sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
>
> Now I'm intrigued. I did some quick and nasty tests.
>
> First, mlocate's updatedb. No measures taken to invalidate caches etc:
>
> # time updatedb
> real 0m39.265s
> user 0m2.245s
> sys 0m0.228s
>
> Then unmerge mlocate, emerge slocate, delete all dbs, run slocate's
> updatedb twice:
>
> # rm /var/lib/[ms]locate/*db
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m35.365s
> user 0m5.941s
> sys 0m0.383s
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m34.929s
> user 0m5.925s
> sys 0m0.377s
>
> slocate seems quicker than the few tests I'd already done with mlocate and
> has no optimizations to re-use existing correct data in the db. Now
> unmerge slocate, merge mlocate, do not delete dbs and run mlocate's
> updatedb twice:
>
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 3m50.574s
> user 0m7.277s
> sys 0m0.361s
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m5.830s
> user 0m2.088s
> sys 0m0.173s
>
> Second run definitely quicker as it only has to read the fs, not write the
> entire index as well. But that initial run ... The old slocate db was still
> around, possibly affecting the first run, so delete both dbs and run
> mlocate's updatedb twice:
>
> # rm /var/lib/[ms]locate/*db
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 3m51.592s
> user 0m7.249s
> sys 0m0.350s
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m7.662s
> user 0m1.997s
> sys 0m0.159s
>
> Almost identical to the prior test, so the presence of slocate's db has no
> effect on mlocate. Then I realized I hadn't measured how long they took to
> reindex a largely cached fs, so I tried that with both, deleting the dbs
> at each test:
>
> slocate:
> # rm /var/lib/[ms]locate/*db
> rm: cannot remove `/var/lib/[ms]locate/*db': No such file or directory
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m34.341s
> user 0m5.929s
> sys 0m0.397s
> # time updatedb
> real 0m2.454s
> user 0m0.855s
> sys 0m1.569s
>
> mlocate:
> # rm /var/lib/[ms]locate/*db
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 3m54.792s
> user 0m7.215s
> sys 0m0.350s
> # time updatedb
> real 0m0.538s
> user 0m0.302s
> sys 0m0.232s
>
> 0.5 seconds vs 2.5 seconds. Wow.
>
> Conclusions:
>
> 1. mlocate is slow at building its db from scratch - about 250% as long as
> slocate on the same task.
> 2. mlocate is faster at reindexing a largely-unchanged fs - it does it in
> about 66% of the time slocate took.
> 3. mlocate is insanely quick at reindexing a db that is in cache.
>
> #1 is rare - most systems will only do it once
> #3 is silly and does not represent anything close to reality
> #2 is pretty realistic and a 33% performance boost is significant
>
> I have no idea where the speed increase in #3 comes from. This is an ext4
> fs - does ext4 keep an in-memory hash of inodes it reads? It seems to me
> that would be a very clever and very useful thing for an fs to do.
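
The percentages in the conclusions above can be rechecked from the quoted
`real` times. A minimal sketch, assuming a POSIX shell with awk; the helper
names to_seconds and percent_of are made up for illustration and are not part
of mlocate or slocate:

```shell
#!/bin/sh
# Force a "." decimal separator regardless of the user's locale.
LC_ALL=C
export LC_ALL

# Convert a `time`-style value such as 3m51.592s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the first duration as a percentage of the second, rounded to a whole number.
percent_of() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.0f\n", 100 * a / b }'
}

# 1. Cold cache, db built from scratch: mlocate vs slocate
percent_of 3m51.592s 1m35.365s
# 2. Cold cache, existing db: mlocate vs slocate (slocate always rebuilds)
percent_of 1m5.830s 1m34.929s
# 3. Warm cache: mlocate vs slocate
percent_of 0m0.538s 0m2.454s
```

With the figures quoted above this prints 243, 69 and 22, so "about 250%" and
"about 66%" are loose roundings, but the ordering of the three conclusions
holds.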

No. 3 is what made me send my first post. I was almost convinced that I had
done something wrong, because no sooner had I hit return than it completed.

I've deleted the database and rebooted. This is what I'm getting now on the
first run:

# time updatedb

real 2m30.729s
user 0m0.723s
sys 0m9.070s

My database is small; this is a relatively slim installation:

# ls -la /var/lib/mlocate/mlocate.db
-rw-r----- 1 root locate 9326688 Nov 17 22:14 /var/lib/mlocate/mlocate.db
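
The cold-versus-warm comparison used in the tests quoted above could be
wrapped in a small helper. This is only a sketch: bench and drop_caches are
hypothetical names, it assumes a shell where `time` is available (a bash
keyword, or time(1) on the PATH), and dropping caches needs root:

```shell
#!/bin/sh
# Flush dirty pages, then ask the kernel to drop page cache, dentries and inodes.
drop_caches() {
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 3 > /proc/sys/vm/drop_caches
    else
        echo "cannot drop caches (not root?); this run is not cold" >&2
    fi
}

# bench MODE COUNT COMMAND...
# Runs COMMAND COUNT times under `time`. MODE "cold" drops caches before
# every run; any other MODE (e.g. "warm") leaves the caches alone.
bench() {
    mode=$1
    count=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        if [ "$mode" = cold ]; then
            drop_caches
        fi
        echo "== $mode run $i: $*"
        time "$@"
        i=$((i + 1))
    done
}
```

For example, `bench cold 2 updatedb` followed by `bench warm 1 updatedb`
would repeat the pattern of the slocate and mlocate tests quoted earlier.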
--
Regards,
Mick