On Fri, Jan 15, 2010 at 2:01 AM, Daniel Bradshaw <daniel@×××××××××××.uk> wrote:
> On 01/14/2010 12:49 PM, Nirbheek Chauhan wrote:
>>
>> In theory, yes. In practice, git is too slow to handle 30,000 files.
>> Even simple operations like git add become painful even if you put the
>> whole of portage on tmpfs, since git does a stat() on every single file
>> in the repository with every operation.
>>
>
> My understanding is that git was developed as the SCM for the kernel
> project.
> A quick check in an arbitrary untouched kernel in /usr/src/ suggests a file
> [1] count of 25300.
>
> Assuming that my figure isn't out by an order of magnitude, how does the
> kernel team get along with git and 25k files but it is deathly slow for our
> 30k?
> Or, to phrase the question better... what are they doing that allows them to
> manage?
>

My bad. I did the tests a while back, and the number "30,000" is
actually the number of ebuilds in portage. The number of files is
actually ~113,000 (the difference comes from every package having a
Manifest + ChangeLog + metadata.xml + patches). OTOH, the number of
directories is "just" ~20,000, so if git only did a stat() on
directories, it would get into the "usable" circle.

Also, since git does a stat on directories as well as files, you can
say that every command has to do ~133,000 stats, which is damn slow
even when cached.
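For anyone who wants to reproduce the counts, a quick sketch with find (the /usr/portage path and the package names below are just illustrative assumptions; the throwaway tree mimics the per-package ebuild + Manifest + metadata.xml layout that makes files outnumber directories):

```shell
# On a real box the counts come straight from find (adjust the path
# to wherever your tree lives):
#   find /usr/portage -type f | wc -l    # ~113,000 files at the time
#   find /usr/portage -type d | wc -l    # ~20,000 directories

# Throwaway tree with the same shape: each package directory holds an
# ebuild, a Manifest, and a metadata.xml, so files outnumber dirs.
tmp=$(mktemp -d)
for pkg in foo bar baz; do
    mkdir -p "$tmp/$pkg"
    touch "$tmp/$pkg/$pkg-1.0.ebuild" "$tmp/$pkg/Manifest" "$tmp/$pkg/metadata.xml"
done
echo "files: $(find "$tmp" -type f | wc -l)"   # 9 files
echo "dirs:  $(find "$tmp" -type d | wc -l)"   # 4 dirs (tmp itself + 3 packages)
rm -rf "$tmp"
```

Every one of those paths is a stat() git has to pay for on each operation, which is where the ~133,000 figure comes from.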
|
-- |
~Nirbheek Chauhan |
|
Gentoo GNOME+Mozilla Team |