Gentoo Archives: gentoo-portage-dev

From: Brian Harring <ferringb@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] cache subsystem replacement
Date: Mon, 14 Nov 2005 16:31:59
Message-Id: 20051114163039.GC11268@nightcrawler
In Reply to: Re: [gentoo-portage-dev] cache subsystem replacement by Jason Stubbs
1 On Tue, Nov 15, 2005 at 01:13:58AM +0900, Jason Stubbs wrote:
2 > Was talking with a guy yesterday who mentioned he had a 10-line patch that sped
3 > up current portage a lot with regard to updating metadata. I asked him to
4 > send it to me and here it is:
5 >
6 > --- -   2005-10-29 18:49:15.156173000 +0900
7 > +++ /usr/lib/portage/pym/portage_db_cpickle.py  2005-10-08 11:13:37.000000000
8 > +0900
9 > @@ -61,6 +61,9 @@
10 >                 return False
11 >                         
12 >         def sync(self):
13 > +               return
14 > +
15 > +       def realsync(self):
16 >                 if self.modified:
17 >                         try:
18 >                                 if os.path.exists(self.filename):
19 > @@ -74,6 +77,6 @@
20 >                                 pass
21 >         
22 >         def close(self):
23 > -               self.sync()
24 > +               self.realsync()
25 >                 self.db = None;
26
27 Ok, your mail client is screwing stuff up here ;)
28
29 The problem with the trick above is that, yeah, it delays syncs, but
30 it also means if portage shuts down uncleanly _ever_, the entire
31 eclass db of the old cache format is invalidated.
32 All of it.
33 Back to square one.
34 Massively bad thing, obviously.
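
To make the failure mode concrete, here is a minimal sketch, not the actual
portage_db_cpickle code. PickleCache and entries_surviving_crash are
hypothetical names, and the sketch only shows the lost-updates half of the
problem: whatever was not synced before an unclean exit never reaches disk.

```python
import os
import pickle
import tempfile


class PickleCache:
    """Hypothetical stand-in for a pickle-backed cache (not portage's class)."""

    def __init__(self, filename):
        self.filename = filename
        self.modified = False
        if os.path.exists(filename):
            with open(filename, "rb") as f:
                self.db = pickle.load(f)
        else:
            self.db = {}

    def __setitem__(self, key, value):
        self.db[key] = value
        self.modified = True

    def sync(self):
        # Write the whole dict out, but only if something changed.
        if self.modified:
            with open(self.filename, "wb") as f:
                pickle.dump(self.db, f)
            self.modified = False


def entries_surviving_crash(sync_each_update):
    """Return how many entries are on disk after a simulated unclean exit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache.pickle")
        cache = PickleCache(path)
        for i in range(5):
            cache["eclass-%d" % i] = i
            if sync_each_update:
                cache.sync()
        # Simulated unclean shutdown: no final sync or close ever runs.
        del cache
        return len(PickleCache(path).db)
```

With sync_each_update=True all five entries survive; with it set to False
nothing was ever written, so the reloaded cache starts empty.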
35
36 This is why the default sync rate of cache classes in the rewrite is
37 1: the cache syncs every time a change is pushed to it.
38
39 > I remembered seeing sync_rate when glancing through the new cache stuff and
40 > then had a look into mirror_cache(). Playing with trg_cache.sync(x), I got
41 > the following numbers.
42 >
43 > x        total #1  total #2  total #3  median sys
44 > 1        13.651    13.451    13.727    2.712
45 > 10       13.413    13.412    13.645    2.538
46 > 100      13.605    13.498    13.405    2.700
47 > 1000     13.673    13.726    13.748    2.839
48 > 10000    14.541    14.054    13.447    2.743
49 > 100000   13.973    13.951    14.512    2.881
50 > 1000000  13.583    13.622    13.935    2.669
51 >
52 > Command run was:
53 >
54 > rm -rf /var/cache/edb/dep/*; time emerge -q metadata
55 >
56 > So what does changing the sync_rate actually do? Ease seeks? Should I re-run
57 > these tests with a reboot in between? (And what happened to the 4 seconds I
58 > was getting with earlier patches? Bug fixes turn quantity into quality? :)
59
60 Umm... 4 seconds? Eh?
61
62 Regarding what the sync_rate does: if the target cache supports
63 batched updates (think rdbms), it can delay up to N
64 modifications before pushing the changes out.
65
66 A cdb/cpickle cache backend would want to use this, for example.
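
Roughly, the batching idea looks like the sketch below. This is an
illustration under assumptions, not the rewrite's code: BatchedCache,
count_flushes and the dict used as a backend are all hypothetical stand-ins.

```python
class BatchedCache:
    """Hypothetical wrapper that delays up to sync_rate modifications."""

    def __init__(self, backend, sync_rate=1):
        self.backend = backend      # dict-like stand-in for the real store
        self.sync_rate = sync_rate  # 1 means every change is pushed at once
        self.pending = {}
        self.flushes = 0

    def __setitem__(self, key, value):
        self.pending[key] = value
        if len(self.pending) >= self.sync_rate:
            self.sync()

    def sync(self):
        # Push all delayed modifications to the backend in one go.
        if not self.pending:
            return
        self.backend.update(self.pending)
        self.pending.clear()
        self.flushes += 1

    def close(self):
        self.sync()


def count_flushes(sync_rate, updates=10):
    """Return (number of flushes, entries stored) for a run of updates."""
    store = {}
    cache = BatchedCache(store, sync_rate)
    for i in range(updates):
        cache["cat/pkg-%d" % i] = {"DEPEND": ""}
    cache.close()
    return cache.flushes, len(store)
```

With a sync_rate of 1 each of the ten updates is flushed on its own; with a
larger rate the same ten entries reach the backend in far fewer flushes.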
67
68 Meanwhile, as for why you're not seeing any variation: I'm pretty much
69 positive you're using a cache that autocommits, meaning delayed
70 syncing isn't possible. Autocommit == can't batch, so the sync rate
71 isn't used/valid.
72
73 The only cache in the rewrite that doesn't autocommit is the sqlite
74 implementation (which coincidentally is why sync rate exists; inserts
75 into sqlite are !@*#!@#*ing slow).
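
For the sqlite case, batching amounts to committing once per N inserts
instead of once per insert. The sketch below uses Python's standard sqlite3
module; the table layout and the fill_cache helper are assumptions made for
illustration, not the rewrite's actual schema or API.

```python
import sqlite3


def fill_cache(entries, sync_rate=1, path=":memory:"):
    """Insert (cpv, data) pairs, committing every sync_rate rows.

    Returns (number of commits, rows stored). Hypothetical helper.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (cpv TEXT PRIMARY KEY, data TEXT)"
    )
    conn.commit()
    commits = 0
    pending = 0
    for cpv, data in entries:
        conn.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (cpv, data))
        pending += 1
        if pending >= sync_rate:
            conn.commit()  # one commit covers the whole batch
            commits += 1
            pending = 0
    if pending:
        conn.commit()      # flush the final partial batch
        commits += 1
    rows = conn.execute("SELECT COUNT(*) FROM cache").fetchone()[0]
    conn.close()
    return commits, rows
```

Usage: fill_cache([("cat/pkg-%d" % i, "x") for i in range(25)], sync_rate=10)
stores all 25 rows with three commits rather than 25.
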
76 ~harring