On 01/14/2010 10:21 PM, Nirbheek Chauhan wrote:
> On Fri, Jan 15, 2010 at 2:01 AM, Daniel Bradshaw<daniel@×××××××××××.uk> wrote:
>
>> On 01/14/2010 12:49 PM, Nirbheek Chauhan wrote:
>>
>>> In theory, yes. In practice, git is too slow to handle 30,000 files.
>>> Even simple operations like git add become painful, even if you put the
>>> whole of portage on tmpfs, since git does a stat() on every single file
>>> in the repository with every operation.
>>>
>>>
>> My understanding is that git was developed as the SCM for the kernel
>> project.
>> A quick check in an arbitrary untouched kernel in /usr/src/ suggests a file
>> [1] count of 25,300.
>>
>> Assuming that my figure isn't out by an order of magnitude, how does the
>> kernel team get along with git and 25k files while it is deathly slow for our
>> 30k?
>> Or, to phrase the question better: what are they doing that allows them to
>> manage?
>>
>>
> My bad. I did the tests a while back, and the number "30,000" is
> actually the number of ebuilds in portage. The number of files is
> actually ~113,000 (the difference comes because every package has a
> manifest + changelog + metadata.xml + patches). OTOH, the number of
> directories is "just" ~20,000, so if git would only do a stat() on
> directories, it would get into the "usable" circle.
>
> Also, since git does a stat on directories as well as files, you can
> say that every command has to do ~133,000 stats, which is damn slow
> even when cached.
>
>
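As an aside, the file-versus-directory counts behind those figures are easy to sanity-check with find. A minimal self-contained sketch (the throwaway stand-in tree below is purely illustrative; on a real box you would point find at your actual /usr/portage checkout):

```shell
#!/bin/sh
# Build a tiny stand-in portage-like tree so the commands run anywhere.
# On a real system, replace "$tree" with /usr/portage (path assumed).
tree=$(mktemp -d)
mkdir -p "$tree/app-misc/foo" "$tree/app-misc/bar"
touch "$tree/app-misc/foo/foo-1.ebuild" "$tree/app-misc/foo/Manifest" \
      "$tree/app-misc/bar/bar-1.ebuild" "$tree/app-misc/bar/Manifest"

# Count regular files and directories separately; on the real tree the
# thread's figures were ~113,000 files vs ~20,000 directories.
files=$(find "$tree" -type f | wc -l)   # 4 in this stand-in tree
dirs=$(find "$tree" -type d | wc -l)    # 4 here ($tree, app-misc, foo, bar)
echo "files=$files dirs=$dirs"

rm -rf "$tree"
```

Every git command that refreshes the index has to stat each of those entries, which is where the ~133,000 syscalls per operation come from.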

Ah, so one side is off by a fair bit.
Thanks for the clarification.

Regards,
Daniel