Gentoo Archives: gentoo-dev

From: Daniel Bradshaw <daniel@×××××××××××.uk>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] proxy maintainership and gentoo-x86 scm
Date: Thu, 14 Jan 2010 22:32:51
Message-Id: 4B4F9B80.4070506@the-cell.co.uk
In Reply to: Re: [gentoo-dev] proxy maintainership and gentoo-x86 scm by Nirbheek Chauhan
On 01/14/2010 10:21 PM, Nirbheek Chauhan wrote:
> On Fri, Jan 15, 2010 at 2:01 AM, Daniel Bradshaw <daniel@×××××××××××.uk> wrote:
>
>> On 01/14/2010 12:49 PM, Nirbheek Chauhan wrote:
>>
>>> In theory, yes. In practice, git is too slow to handle 30,000 files.
>>> Even simple operations like git add become painful even if you put the
>>> whole of portage on tmpfs since git does a stat() on every single file
>>> in the repository with every operation.
>>>
>>>
>> My understanding is that git was developed as the SCM for the kernel
>> project.
>> A quick check in an arbitrary untouched kernel in /usr/src/ suggests a file
>> [1] count of 25300.
>>
>> Assuming that my figure isn't out by an order of magnitude, how does the
>> kernel team get along with git and 25k files but it is deathly slow for our
>> 30k?
>> Or, to phrase the question better... what are they doing that allows them to
>> manage?
>>
>>
> My bad. I did the tests a while back, and the number "30,000" is
> actually for the no. of ebuilds in portage. The no. of files is
> actually ~113,000 (difference comes because every package has a
> manifest+changelog+metadata.xml+patches). OTOH, the no. of directories
> is "just" ~20,000, so if git would only do a stat() on directories,
> it would get into the "usable" circle.
>
> Also, since git does a stat on directories as well as files, you can
> say that every command has to do ~133,000 stats, which is damn slow
> even when cached.
>
>

Ah, so one side is off by a fair bit.
Thanks for the clarification.
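
For anyone who wants to check the file and directory counts on their own tree, here is a rough sketch. It is my own assumption of how one might measure this, not what was run for the figures quoted above. It walks a tree and calls lstat() once per file and per directory, which only approximates the per-command cost described; it does not reproduce what git does internally. The function name count_and_stat and the skipped directory names are illustrative choices.

```python
import os
import time


def count_and_stat(root):
    """Walk `root`, lstat() every entry, and return (files, dirs, seconds)."""
    n_files = 0
    n_dirs = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: skip VCS metadata so only the working tree is counted.
        dirnames[:] = [d for d in dirnames if d not in (".git", "CVS")]
        for name in dirnames:
            os.lstat(os.path.join(dirpath, name))
            n_dirs += 1
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))
            n_files += 1
    elapsed = time.perf_counter() - start
    return n_files, n_dirs, elapsed
```

Calling count_and_stat() on the portage tree and again on a kernel tree gives the two counts to compare, plus a timing for one full pass. The root directory itself is not included in the directory count, and a second run will normally be faster because the first one warms the cache.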

Regards,
Daniel