Gentoo Archives: gentoo-dev

From: Richard Freeman <rich0@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Re: How not to discuss
Date: Sun, 31 May 2009 01:57:50
Message-Id: 4A21E40A.60500@gentoo.org
In Reply to: [gentoo-dev] Re: How not to discuss by Ryan Hill
1 Ryan Hill wrote:
2 > I'm tired of playing, as I'm sure you are. So please,
3 > let's be quiet now, and let the big people talk.
4 >
5
6 This is a public list designed to facilitate discussion of Gentoo
7 software development. Anybody with something constructive to say is
8 more than welcome to speak up - particularly Gentoo staff.
9
10 I don't pretend to be an expert on package management. However, hiding
11 internal implementation details is just good design. I can see how
12 putting EAPI in the filename can be a convenience to the package
13 manager, but it still seems like a bad design, as it exposes end users
14 to an implementation detail of the package management system.
15
16 There are lots of ways that EAPI could be cached that would avoid the
17 various penalties that have been referred to. Even without an improved
18 cache, paying that penalty seems preferable to accepting the design
19 compromise of EAPI in the filename.
20
21 As for how EAPI could be cached, I can think of a few high-level
22 design options:
23
24 1. Cache files are distributed with the portage tree. EAPIs that break
25 the cache format would use different files that older package managers
26 would ignore. Downside is that it doesn't handle user-modified ebuilds
27 (unless the user tells the package manager to regenerate the cache), and
28 it doesn't handle overlays unless the maintainer generates the cache.
29
30 2. Cache files are generated when the tree is synced. The package
31 manager would look at the list of modified files and scan only those
32 files one time to index them. The index could contain the mtime and
33 path of the file. Then, when you perform an operation the package
34 manager could check the mtimes in the directories containing those files
35 and see if anything was touched and regenerate the cache if needed.
36 This takes a little more time during syncing but I suspect that it would
37 perform very well - after all, after a sync all those files would be in
38 the disk cache anyway. A suitably clever package manager could read the
39 files as they are being synced and guarantee they are in-memory.
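
To make option 2 a bit more concrete, here is a rough Python sketch of an index keyed by path that stores each ebuild's mtime and EAPI and rescans only files whose mtime has changed. It is an illustration, not how any existing package manager works. The names read_eapi and refresh_index and the index layout are hypothetical. It also assumes EAPI is declared as a plain EAPI=... line near the top of the ebuild, treats a missing declaration as "0", and for simplicity stats each file rather than checking directory mtimes first.

```python
import os
import re

# Assumption: EAPI appears as a simple assignment such as EAPI=2 or EAPI="2".
EAPI_RE = re.compile(r'^EAPI=["\']?([A-Za-z0-9._+-]*)["\']?', re.MULTILINE)


def read_eapi(path):
    """Return the EAPI declared in the first 4 KiB of an ebuild, or "0"."""
    with open(path, encoding="utf-8", errors="replace") as f:
        head = f.read(4096)
    match = EAPI_RE.search(head)
    return match.group(1) if match and match.group(1) else "0"


def refresh_index(tree_root, index):
    """Bring index (path -> {"mtime", "eapi"}) up to date with the tree.

    Only ebuilds that are new or whose mtime differs from the indexed
    value are opened. Returns the sorted list of paths that were rescanned.
    """
    rescanned = []
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            if not name.endswith(".ebuild"):
                continue
            path = os.path.join(dirpath, name)
            seen.add(path)
            mtime = os.stat(path).st_mtime_ns
            entry = index.get(path)
            if entry is None or entry["mtime"] != mtime:
                index[path] = {"mtime": mtime, "eapi": read_eapi(path)}
                rescanned.append(path)
    # Forget ebuilds that no longer exist in the tree.
    for path in list(index):
        if path not in seen:
            del index[path]
    return sorted(rescanned)
```

Running refresh_index once after a sync fills the index. Later calls cost one stat per ebuild and reopen only the files that were touched, which also covers user-modified ebuilds and overlays. A real implementation would persist the index to disk and could skip whole directories whose mtime is unchanged.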
40
41 If we were talking about a 300TB table that got 300k transactions per
42 second I could see why we'd be talking about hacks to sacrifice
43 normalized design for speed. We're talking about a package database -
44 one that contains < 150k records. Sacrificing good design for speed
45 (instead of improving the algorithm) is a short term gain for a
46 long-term cost.

Replies

Subject Author
Re: [gentoo-dev] Re: How not to discuss Thilo Bangert <bangert@g.o>