Gentoo Archives: gentoo-dev

From: Denis Dupeyron <calchan@g.o>
To: gentoo-dev@l.g.o
Cc: ferringb@×××××.com
Subject: Re: [gentoo-dev] adding a modification timestamp to the installed pkgs database (vdb)
Date: Tue, 12 Jan 2010 00:12:21
Message-Id: 7c612fc61001111435s5caaf80dx4629d81447ab52e8@mail.gmail.com
In Reply to: [gentoo-dev] adding a modification timestamp to the installed pkgs database (vdb) by Brian Harring
1 Brian,
2
3 On Sun, Oct 25, 2009 at 6:50 PM, Brian Harring <ferringb@×××××.com> wrote:
4 > The proposal is pretty simple; if code modifies the vdb in any
5 > fashion, it needs to update the mtime on a file named
6 > '.modification_time' in the root of the vdb.
7 >
8 > For example-
9 >
10 > 1) ${PACKAGE_MANAGER} fires up, builds a pkg.  It's now ready to
11 > install it.
12 > 2) this step isn't strictly required, but is a zero cost safety
13 > measure- prior to modifying the vdb, it updates the timestamp.  The
14 > reason for doing this is to protect against the manager blowing up in
15 > some fashion and not updating the timestamp- there still is a window
16 > if the manager breaks down during merging but it's far reduced.
17 > 3) manager does its thing to the livefs, and to the vdb.
18 > 4) once finished, again, updates the timestamp.
19 >
20 > This isn't an incredibly complex change.  What it enables however is
21 > package managers to get serious about optimizing access to the vdb.
22 > For example for the 3 managers:
23 >
24 > paludis:
25 >  installed-cache currently needs to be manually run by the user;
26 > specifically, the user is responsible for regenerating this cache if
27 > they use a non paludis manager to modify the VDB.  This can be
28 > automated via checking the vdb timestamp against a stored copy of the
29 > vdb timestamp at the time of the cache generation.
30 >
31 > portage:
32 >  portage maintains a set of denormalized caches of the vdb- it however
33 > has to do validation of those caches on each access, meaning quite a
34 > few stats.  Same thing, can compare timestamp from current vdb to when
35 > it was generated to identify if it is no longer authoritative.
36 >
37 > pkgcore:
38 >  pkgcore maintains a denormalized old style virtuals cache- same thing
39 > w/ portage, it has to do validation (stat'ing) whenever it uses that
40 > cache to ensure the data is accurate.  Same thing, can compare
41 > timestamp from current vdb to when it was generated to identify if it
42 > is no longer authoritative.
43 >
44 > The existing vdb caching could all be modified to use this timestamp.
45 > One stat in the best (common) case, instead of having to either scan
46 > the whole vdb each time or doing a subset of stats.
47 >
48 > This change enables further caching/denormalization of the vdb data
49 > while maintaining the old format- basically, it allows the manager to
50 > build out a helluva lot faster access to the vdb while keeping on
51 > disk compatibility in /var/db/pkg.
52 >
53 >
54 > Now unfortunately since the vdb is not format versioned in any
55 > fashion, to get this timestamp we have to do the following-
56 >
57 > 1) nudge everyone who has code poking into the vdb to update their
58 > code to update the timestamp
59 > 2) sit on our hands for N months until such time we've deemed
60 > "everyone we care about has upgraded"
61 > 3) push out a new release, and start pushing out versions of the
62 > managers/vdb consumers that use this timestamp instead of just
63 > updating it.
64 >
65 > For anyone who has been around gentoo for a couple of years, this is a
66 > pretty familiar pattern- eapi, profile changes, etc, all go through
67 > this unfortunately.
68 >
69 >
70 > That's the core of the proposal; there is a ticket open
71 > ( http://bugs.gentoo.org/290428 ) regarding this although there is
72 > some debate from ciaran which I'll try to now summarize, along w/ the
73 > counterarguments.
74 >
75 > 1) do a new vdb.
76 > Counter: this mechanism provides a way to synchronize the new vdb
77 > while maintaining the old during its transition period, so this is
78 > needed anyways.  Further, pinning all of our optimization hopes on a
79 > new vdb is daft- it's been discussed for 5+ years now and still
80 > hasn't materialized (pkgcore has been able to have a new vdb for
81 > several years, but without a synchronization mechanism it would
82 > require locking users into the new format and locking out old
83 > consumers of the vdb- an unfriendly choice to push on users, hence
84 > never being implemented).
85 >
86 > 2) code that hasn't been updated to adjust the timestamp, but is still
87 > in use after the transition period will break things.
88 >  Counter: nature of any modification of this sort, frankly the gains
89 > outweigh the costs of users being ridiculously out of date.  Not
90 > saying it's perfect, but until someone comes up with a proposal that
91 > versions every PMS component (meaning PMS has to start documenting
92 > the VDB), it's what we have if we wish to move forward in
93 > refactoring.
94 >
95 > 3) the correct approach is to require users to tell each manager that
96 > changes have occurred outside its purview (run paludis
97 > --regenerate-installed-cache after every time you invoke pmerge or
98 > emerge).
99 >  Counter: that's rather unfriendly to users, and isn't what
100 > pkgcore/portage do.  Further, it's historically the opposite of the
101 > norm- consider the ebuild cache (we do validation as we go there,
102 > instead of expecting users to do an emerge --regen every time they
103 > modify an ebuild).
104 >
105 >
106 > That's roughly the three points raised; there is some minor quibbling
107 > that mtime cannot be trusted, but that's mostly a variation of #2.
108
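
For illustration, the update protocol in the quoted proposal (touch the
timestamp, modify the vdb, touch it again) and the one-stat cache check can
be sketched in Python as below. This is a minimal sketch, not code from
portage, pkgcore or paludis: only the file name '.modification_time' comes
from the proposal. The helper names, touching the timestamp again even when
the modification fails, and comparing the stored value for equality are all
assumptions made for the example.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path

# File name taken from the proposal; everything else here is hypothetical.
TIMESTAMP_NAME = ".modification_time"


def touch_vdb_timestamp(vdb_root):
    """Create the timestamp file if missing and set its mtime to now."""
    (Path(vdb_root) / TIMESTAMP_NAME).touch(exist_ok=True)


def vdb_stamp_ns(vdb_root):
    """Return the timestamp file's mtime in nanoseconds (a single stat)."""
    return os.stat(Path(vdb_root) / TIMESTAMP_NAME).st_mtime_ns


@contextmanager
def vdb_modification(vdb_root):
    """Wrap any code that modifies the vdb (hypothetical helper)."""
    touch_vdb_timestamp(vdb_root)      # step 2: touch before modifying
    try:
        yield                          # step 3: caller modifies livefs/vdb
    finally:
        # Step 4: touch again once finished. Doing so on failure as well
        # is an assumption of this sketch, not part of the proposal.
        touch_vdb_timestamp(vdb_root)


def cache_is_current(vdb_root, recorded_ns):
    """True if the vdb timestamp still equals the value stored with a cache.

    A missing timestamp file is treated as "cache not trustworthy".
    """
    try:
        return vdb_stamp_ns(vdb_root) == recorded_ns
    except FileNotFoundError:
        return False


if __name__ == "__main__":
    root = tempfile.mkdtemp()          # stand-in for /var/db/pkg
    with vdb_modification(root):
        os.makedirs(os.path.join(root, "app-misc", "example-1.0"))
    recorded = vdb_stamp_ns(root)      # stored alongside a generated cache
    print("cache current:", cache_is_current(root, recorded))
```

A cache generator would store the value returned by vdb_stamp_ns() next to
its cache and call cache_is_current() before trusting it. The sketch does not
address filesystem timestamp granularity or code that modifies the vdb
without touching the file, which are the concerns raised in points 2 and 3
above.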
109 This looks to me like a good idea. I see some of it at least has been
110 implemented in portage and I would suspect in pkgcore too. However
111 it's not obvious to me that all the code is ready, and I don't see any
112 real specs, docs, etc... You're a seasoned slacker^Wdeveloper so you
113 know the drill. I will add this as a topic for the open floor
114 discussion for January but don't expect us to vote on it before we
115 have all of the above. Now, it might be that this whole thing is held
116 back by a more philosophical question in which case feel free to
117 propose it for addition to the (preferably February) agenda.
118
119 I'm a bit surprised by how little discussion this topic has
120 generated. I know there is a bug about this and that there was some
121 action there, but still. I think that getting the above material ready
122 (specs, doc, PMS?, whatever) has a good chance of triggering
123 additional discussions.
124
125 Feel free to contact me in case you need help.
126 Denis.
