Brian,

On Sun, Oct 25, 2009 at 6:50 PM, Brian Harring <ferringb@×××××.com> wrote:
> The proposal is pretty simple; if code modifies the vdb in any
> fashion, it needs to update the mtime on a file named
> '.modification_time' in the root of the vdb.
>
> For example-
>
> 1) ${PACKAGE_MANAGER} fires up, builds a pkg. It's now ready to
> install it.
> 2) this step isn't strictly required, but is a zero cost safety
> measure- prior to modifying the vdb, it updates the timestamp. The
> reason for doing this is to protect against the manager blowing up in
> some fashion and not updating the timestamp- there still is a window
> if the manager breaks down during merging but it's far reduced.
> 3) manager does its thing to the livefs, and to the vdb.
> 4) once finished, again, updates the timestamp.
>
> This isn't an incredibly complex change. What it enables however is
> package managers to get serious about optimizing access to the vdb.
> For example for the 3 managers:
>
> paludis:
> installed-cache currently needs to be manually run by the user;
> specifically, the user is responsible for regenerating this cache if
> they use a non paludis manager to modify the VDB. This can be
> automated via checking the vdb timestamp against a stored copy of
> the vdb timestamp at the time of the cache generation.
>
> portage:
> portage maintains a set of denormalized caches of the vdb- it however
> has to do validation of those caches on each access, meaning quite a
> few stats. Same thing, can compare timestamp from current vdb to when
> it was generated to identify if it is no longer authoritative.
>
> pkgcore:
> pkgcore maintains a denormalized old style virtuals cache- same thing
> w/ portage, it has to do validation (stat'ing) whenever it uses that
> cache to ensure the data is accurate. Same thing, can compare
> timestamp from current vdb to when it was generated to identify if it
> is no longer authoritative.
>
> The existing vdb caching could all be modified to use this timestamp.
> One stat in the best (common) case, instead of having to either scan
> the whole vdb each time or doing a subset of stats.
>
> This change enables further caching/denormalization of the vdb data
> while maintaining the old format- basically, it allows the manager to
> build out a helluva lot faster access to the vdb while keeping on
> disk compatibility in /var/db/pkg.
>
>
> Now unfortunately since the vdb is not format versioned in any
> fashion, to get this timestamp we have to do the following-
>
> 1) nudge everyone who has code poking into the vdb to update their
> code to update the timestamp
> 2) sit on our hands for N months until such time we've deemed
> "everyone we care about has upgraded"
> 3) push out a new release, and start pushing out versions of the
> managers/vdb consumers that use this timestamp instead of just
> updating it.
>
> For anyone who has been around gentoo for a couple of years, this is a
> pretty familiar pattern- eapi, profile changes, etc, all go through
> this unfortunately.
>
>
> That's the core of the proposal; there is a ticket open
> ( http://bugs.gentoo.org/290428 ) regarding this although there is
> some debate from ciaran which I'll try to now summarize, along w/ the
> counterarguments.
>
> 1) do a new vdb.
> Counter: this mechanism provides a way to synchronize the new vdb
> while maintaining the old during its transition period, so this is
> needed anyways. Further, pinning all of our optimization hopes on a
> new vdb is daft- it's been discussed for 5+ years now and still
> hasn't materialized (pkgcore has been able to have a new vdb for
> several years, but without a synchronization mechanism it would
> require locking users into the new format and locking out old
> consumers of the vdb- an unfriendly choice to push on users, hence
> never being implemented).
>
> 2) code that hasn't been updated to adjust the timestamp, but is still
> in use after the transition period will break things.
> Counter: that's the nature of any modification of this sort; frankly the
> gains outweigh the costs of users being ridiculously out of date. Not
> saying it's perfect, but until someone comes up with a proposal that
> versions every PMS component (meaning PMS has to start documenting
> the VDB), it's what we have if we wish to move forward in
> refactoring.
>
> 3) the correct approach is to require users to tell each manager that
> changes have occurred outside its purview (run paludis
> --regenerate-installed-cache after every time you invoke pmerge or
> emerge).
> Counter: that's rather unfriendly to users, and isn't what
> pkgcore/portage do. Further, it's historically the opposite of the
> norm- consider the ebuild cache (we do validation as we go there,
> instead of expecting users to do an emerge --regen every time they
> modify an ebuild).
>
>
> That's roughly the three points raised; there is some minor quibbling
> that mtime cannot be trusted, but that's mostly a variation of #2.

This looks to me like a good idea. I see that at least some of it has
been implemented in portage, and I would suspect in pkgcore too. However,
it's not obvious to me that all the code is ready, and I don't see any
real specs, docs, etc. You're a seasoned slacker^Wdeveloper, so you
know the drill. I will add this as a topic for the open floor
discussion for January, but don't expect us to vote on it before we
have all of the above. Now, it might be that this whole thing is held
back by a more philosophical question, in which case feel free to
propose it for addition to the (preferably February) agenda.
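
To make the quoted proposal a bit more concrete, here is a minimal
sketch of how I read it, in Python. It is not taken from portage,
pkgcore or paludis; the helper names (touch_stamp, vdb_modification,
cache_is_current) are made up for illustration, and only the file name
'.modification_time' comes from the proposal itself.

```python
from contextlib import contextmanager
from pathlib import Path

# File name from the proposal; it lives in the root of the vdb.
STAMP_NAME = ".modification_time"


def touch_stamp(vdb_root):
    """Create the stamp file if missing, set its mtime to now, return it."""
    stamp = Path(vdb_root) / STAMP_NAME
    stamp.touch(exist_ok=True)
    return stamp.stat().st_mtime_ns


@contextmanager
def vdb_modification(vdb_root):
    """Hypothetical wrapper a writer would put around any vdb change."""
    touch_stamp(vdb_root)  # step 2: bump the stamp before modifying
    try:
        yield  # step 3: caller changes the livefs and the vdb here
    finally:
        # Step 4: bump again when done. Doing it on failure as well is
        # an assumption of this sketch, since a failed merge may have
        # left the vdb partially changed.
        touch_stamp(vdb_root)


def cache_is_current(vdb_root, recorded_mtime_ns):
    """Reader side: one stat instead of walking the whole vdb.

    recorded_mtime_ns is the stamp mtime stored when the cache was built.
    A missing stamp file is treated as "cache not trustworthy".
    """
    try:
        current = (Path(vdb_root) / STAMP_NAME).stat().st_mtime_ns
    except FileNotFoundError:
        return False
    return current == recorded_mtime_ns
```

A consumer would store the value returned by touch_stamp() (or a plain
stat of the file) next to its cache and call cache_is_current() before
trusting it. Comparing for equality rather than "newer than" is a choice
made here so that a clock stepping backwards still invalidates the
cache; two updates within the filesystem's mtime granularity can still
look identical, which is one form of the "mtime cannot be trusted"
objection.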

I'm a bit surprised by how little discussion this topic has
generated. I know there is a bug about this and that there was some
action there, but still. I think that getting the above material ready
(specs, docs, PMS?, whatever) has a good chance of triggering
additional discussion.

Feel free to contact me in case you need help.
Denis.