Gentoo Archives: gentoo-dev

From:	Brian Harring <ferringb@×××××.com>
To:	Michał Górny <mgorny@g.o>
Cc:	gentoo-dev@l.g.o, axs@g.o
Subject:	Re: [gentoo-dev] example conversion of gentoo-x86 current deps to unified dependencies
Date:	Sun, 16 Sep 2012 11:11:35
Message-Id:	`20120916111001.GK28593@localhost`
In Reply to:	Re: [gentoo-dev] example conversion of gentoo-x86 current deps to unified dependencies by "Michał Górny"

1	On Sun, Sep 16, 2012 at 09:56:27AM +0200, Michał Górny wrote:
2	> But consider that for example Zac & AxS (correct me if I recall it
3	> correctly) considered changing the meaning of RDEPEND to install
4	> them before the build, thus effectively making 'build,run' useless.
5
6	I really am not trying to be a blatant dick to you, but this has
7	/zero/ relevance. RDEPEND means "required for runtime". That ain't
8	changing. If they were discussing changing what RDEPEND meant, then
9	they were high, period.
10
11	If zac/axs want to try and make the resolver install RDEPEND before
12	DEPEND... well, they're free to. That doesn't change the fact that
13	the deps still must be specified correctly; in short, build,run is
14	very much relevant.
15
16	What I suspect they were intending on doing is letting the resolver
17	work on the RDEPEND of a pkg in parallel to that pkg being built; this
18	is a parallelization scheduling optimization that still requires
19	accurate deps.
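
To illustrate the scheduling point, here is a minimal sketch in Python. It is not how portage or any other resolver is actually implemented; the package names, the `PKGS` table and the `build_waves` function are all hypothetical. The assumption is simply that a build has to wait only for its DEPEND entries, while its RDEPEND entries need to be in place before the package counts as usable, so they can be built alongside it.

```python
# Hypothetical metadata: package names and deps are invented for illustration.
PKGS = {
    "app/foo": {"DEPEND": {"dev/compiler"}, "RDEPEND": {"lib/bar"}},
    "lib/bar": {"DEPEND": {"dev/compiler"}, "RDEPEND": set()},
    "dev/compiler": {"DEPEND": set(), "RDEPEND": set()},
}


def build_waves(pkgs):
    """Group packages into waves; members of one wave could build in parallel.

    A package may start building once every DEPEND entry is usable.
    A package becomes usable once it is built and every RDEPEND entry is usable.
    """
    built, usable, waves = set(), set(), []
    while len(built) < len(pkgs):
        ready = sorted(
            name for name, deps in pkgs.items()
            if name not in built and deps["DEPEND"] <= usable
        )
        if not ready:
            raise ValueError("unsatisfiable or circular build dependencies")
        waves.append(ready)
        built.update(ready)
        # Propagate usability until nothing changes any more.
        changed = True
        while changed:
            changed = False
            for name in built - usable:
                if pkgs[name]["RDEPEND"] <= usable:
                    usable.add(name)
                    changed = True
    return waves


print(build_waves(PKGS))
```

In this toy tree app/foo and lib/bar land in the same wave even though app/foo needs lib/bar at runtime. If app/foo's DEPEND on dev/compiler were left out, it would be scheduled in the first wave and the build would break, which is why the scheduling trick only works with accurate deps.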
20
21	I'm trying to be nice here, but you're very confused on this matter.
22
23
24	> > Total cache savings from doing this for a full tree conversion, for
25	> > our existing md5-cache format is 2.73MB (90 bytes per cache entry).
26	> > Calculating the savings from the ebuild/eclass standpoint is
27	> > dependent on how the deps are built up, so I skipped that.
28	>
29	> You're storing the cache in a tarball?
30
31	Going to assume you're not trolling, and instead use this as a
32	way to point out that this actually does matter, although it's
33	admittedly not obvious if you don't know much about the guts of
34	package managers, or don't spend your Saturday nights doing fun
35	things like optimizing ebuild package manager performance.
36
37	First, the figure is 3.204MB if default context is used; ~9.5% of the
38	content footprint for md5-cache specifically.
39
40	Little known fact; rsync transfers for gentoo are required to be
41	--whole-file; meaning no intra-file delta compression, it transfers
42	the whole file itself. This is done to keep cpu load on rsync nodes
43	low (else they'd be calculating minimally 97k MD4s for every sync,
44	not counting the rolling adler32-style checksum for all content dependent on
45	the window cut off threshold- sounds minor, but it's death by a
46	thousand cuts).
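
For a sense of what the delta algorithm would add on the server side, here is a minimal Python sketch of an rsync-style weak rolling checksum (the Adler-32-like sum described in the rsync technical report). It is an illustration only, not rsync's actual C code, and it leaves out the strong per-block hash entirely; the names `weak_checksum` and `roll` are made up for this example.

```python
M = 1 << 16  # both running sums are kept modulo 2**16


def weak_checksum(block):
    """Compute the (a, b) pair over a whole block from scratch."""
    a = sum(block) % M
    # Each byte is weighted by its distance from the end of the block.
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window by one byte: drop out_byte, take in in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b


data = b"gentoo md5-cache entry"
n = 8
a, b = weak_checksum(data[:n])
for k in range(len(data) - n):
    a, b = roll(a, b, data[k], data[k + n], n)
    # The rolled value matches a full recomputation at every offset.
    assert (a, b) == weak_checksum(data[k + 1:k + 1 + n])
```

Each roll is cheap, but it has to run at every byte offset of every candidate file, on top of the strong hashes; --whole-file skips all of it.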
47
48	For obvious reasons, the cache is the hottest part of the tree, since
49	eclass changes cause cascading updates. In other words, that ~9.5%
50	reduction targets the core data actually transferred in a sync.
51
52	In terms of the total tree footprint, it's a 1% reduction; mostly lost
53	in blocksize overhead unless you're using squashfs (which a decent
54	number of folks do for speed reasons), or use tail packing FS for the
55	tree (again, more than you'd think- known primarily due to reiserfs
56	corruption bugs causing some hell on PM caches).
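
To make the blocksize point concrete, here is a minimal sketch in Python. The 4096-byte block size and the example file sizes are assumptions picked for illustration, not measurements from the tree, and `on_disk_size` is a made-up helper name. It shows why shaving roughly 90 bytes off a small file usually frees nothing on a filesystem that allocates whole blocks.

```python
def on_disk_size(size, block=4096):
    """Round a file size up to a whole number of blocks (assumed 4096 bytes)."""
    return -(-size // block) * block  # ceiling division, then back to bytes


# Hypothetical cache entry sizes before and after dropping 90 bytes.
for before in (1500, 4100):
    after = before - 90
    saved = on_disk_size(before) - on_disk_size(after)
    print(f"{before} -> {after} bytes: {saved} bytes freed on disk")
```

Only the entry that happens to cross a block boundary frees space; squashfs or a tail-packing filesystem does not round up like this, so the savings show up there.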
57
58	There's also the fact doing this means best case, 2 less inodes per
59	VDB entry (more once we start adding dependency types). For my vdb, I
60	have 15523 files across 798 pkgs. 1331 of those are *DEPEND; converted
61	to DEPENDENCIES, the file count is 748. Note that's preserving DEPEND,
62	although it's worthless at this stage of the vdb. So 5% reduction in
63	files in there. Whoopy-de-doo, right?
64
65	This one I can't test as well since the only rotational media I've got
66	these days is a hardware raid w/ a beefy cache; the closest I can
67	manage is local network nfs to an ssd FS, so it'll have to serve
68	as a stand-in for cold cache/hot cache, and for a demonstration of
69	why a backend made up of 101 small individual files is bad.
70
71	Best of 5 is displayed below:
72
73	Iterating over the vdb, and parsing and rendering all depends for our
74	current layout, w/ the vdb stored on nfs:
75
76	cold cache:
77	real 0m30.405s
78	user 0m1.046s
79	sys 0m0.390s
80
81	hot cache:
82	real 0m16.483s
83	user 0m0.883s
84	sys 0m0.168s
85
86	Non-optimized and hacked to work (known to parse slower than a
87	non-quick-hack version would): iterating over the vdb, parsing all
88	depends and rendering said depends when they're stored as DEPENDENCIES;
89	literally, rendering DEPEND, RDEPEND and PDEPEND from it.
90
91	cold cache:
92	real 0m18.329s
93	user 0m0.908s
94	sys 0m0.280s
95
96	hot cache:
97	real 0m12.185s
98	user 0m0.860s
99	sys 0m0.128s
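
As a quick way to read the two sets of numbers side by side, this small Python sketch computes the relative drop in wall-clock ("real") time from the figures quoted above. The `reduction_pct` helper and the dictionary layout are just for illustration.

```python
# Wall-clock ("real") seconds copied from the runs quoted above.
TIMINGS = {
    "cold": {"current": 30.405, "unified": 18.329},
    "hot": {"current": 16.483, "unified": 12.185},
}


def reduction_pct(before, after):
    """Percentage of wall-clock time saved going from before to after."""
    return (before - after) / before * 100


for cache, t in TIMINGS.items():
    pct = reduction_pct(t["current"], t["unified"])
    print(f"{cache} cache: {pct:.1f}% less wall-clock time")
# cold cache: 39.7% less wall-clock time
# hot cache: 26.1% less wall-clock time
```

The user and sys times barely move between the two layouts, so nearly all of the difference is time spent waiting on I/O.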
100
101
102	You get the idea. See the various infamous cold cache/hot cache
103	performance tests if in doubt; I can tell you that a similar trick, done
104	in '07, literally just skipping loading USE till it was needed for
105	provides parsing was enough to bring a 5400RPM drive's run time
106	down from 15s to 12s for cold cache- for parsing provides alone,
107	nothing else. Either way, do your own investigation, it's a
108	good education on performance.
109
110
111	Hopefully for the others listening, that last section was a random but
112	useful tidbit of info; if not, pardon, just being thorough to make sure
113	this point is not raised again.
114
115	~harring

Replies

Subject	Author
Re: [gentoo-dev] example conversion of gentoo-x86 current deps to unified dependencies	"Michał Górny" <mgorny@g.o>
