Hi to all!

I've been implementing a filesystem-in-a-box (in Python) for a project of mine
(the filesystem-in-a-box part is called PyVFS, btw, and will be on SF soon
enough). The filesystem part isn't production-ready yet, but it "works good
enough" (TM) for me to entrust it with the backend storage for a web script
I'm writing (that's the actual project I'm working on).

I've seen several attempts out there to have Python store the portage DB in
something other than the normal filesystem (which makes sense, as the portage
DB takes up considerably more disk space than its actual size, and it is also
quite compressible), but all of them used something like an SQL database or
other external modules, which would require more things to be compiled during
bootstrapping.

The module is written entirely in Python, requires Python 2.3.x to run (I've
only tested it with x=3), and has only moderate overhead. Basically, it
creates a real (ext2-like) filesystem inside a single file, and you can set a
considerably smaller block size for it than a normal on-disk filesystem uses.
Accessing the filesystem (loading/storing files) is done through an interface
which closely resembles the standard Python way of working with files.
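
To illustrate the "filesystem in a file" idea, here is a minimal sketch of
fixed-size blocks stored back to back in one file-like object. This is not
PyVFS code: the names BlockContainer, store and load are made up for this
example, there are no inodes or directories, and it is written for a current
Python 3 rather than 2.3.

```python
import io


class BlockContainer:
    """Fixed-size blocks stored back to back in one file-like object.

    Hypothetical helper for illustration only, not the PyVFS API.
    """

    def __init__(self, backing, block_size=512):
        self.backing = backing
        self.block_size = block_size

    def write_block(self, index, data):
        if len(data) > self.block_size:
            raise ValueError("data does not fit into one block")
        self.backing.seek(index * self.block_size)
        # Pad short data with zero bytes so every block has the same size.
        self.backing.write(data.ljust(self.block_size, b"\0"))

    def read_block(self, index):
        self.backing.seek(index * self.block_size)
        return self.backing.read(self.block_size)


def store(container, first_block, payload):
    """Write payload into consecutive blocks; return the number of blocks."""
    size = container.block_size
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
    for offset, chunk in enumerate(chunks):
        container.write_block(first_block + offset, chunk)
    return len(chunks)


def load(container, first_block, length):
    """Read length bytes of payload starting at first_block."""
    size = container.block_size
    count = (length + size - 1) // size
    data = b"".join(container.read_block(first_block + i) for i in range(count))
    # Cut off the zero padding of the last block.
    return data[:length]


# An in-memory buffer stands in for the container file on disk.
container = BlockContainer(io.BytesIO(), block_size=512)
payload = b"x" * 1300
used = store(container, 0, payload)
restored = load(container, 0, len(payload))
```

A real ext2-like layout would additionally need a superblock, a free-block
map and inodes that record which blocks belong to which file; the sketch only
shows the block addressing underneath.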

As a test, I've loaded the portage DB into the filesystem, and the result was
an astonishing size decrease from about 300MB (on a ReiserFS partition with a
block size of 4KB) to about 120MB for the single file (filesystem created
with a block size of 512 bytes).
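
The effect of the block size on many small files can be estimated with a bit
of arithmetic: every file occupies a whole number of blocks, so the last
block is partly wasted. The sketch below is only a rough model. The file
sizes are invented for the example (they are not measurements of the portage
DB), and it ignores metadata, inodes and filesystem features such as tail
packing.

```python
def allocated_size(file_size, block_size):
    """Bytes occupied when file_size is rounded up to whole blocks."""
    blocks = (file_size + block_size - 1) // block_size
    return blocks * block_size


def total_allocated(file_sizes, block_size):
    """Sum of the rounded-up sizes of all files."""
    return sum(allocated_size(size, block_size) for size in file_sizes)


# Hypothetical sizes (in bytes) of a handful of small files.
sample_sizes = [300, 700, 1200, 2500, 5000]

payload = sum(sample_sizes)                    # 9700 bytes of actual data
on_4k = total_allocated(sample_sizes, 4096)    # 24576 bytes allocated
on_512 = total_allocated(sample_sizes, 512)    # 10752 bytes allocated
```

With these made-up sizes, the 4KB blocks allocate more than twice as much
space as the 512-byte blocks for the same data, which is the same kind of
effect as the one described above.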

I have yet to implement compression for parts of the filesystem (which would
decrease the size even further), because compressed parts would have to be
loaded completely into memory before they can be accessed. I'm planning to
implement something like attributes which can be set on a per-directory basis
(e.g. compress all sub-elements of this directory into a single virtual
filesystem which gets loaded into memory, decompressed, and then mounted at
that point of the tree transparently).
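
A rough sketch of how such a compressed subtree could behave, using only
zlib, struct and io from the standard library: all files below one directory
are packed into a single compressed blob, and "mounting" it decompresses the
whole blob into memory and hands out file-like objects. The record layout,
the names pack_subtree and MountedSubtree, and the example paths are
assumptions made for this illustration, not the planned PyVFS design.

```python
import io
import struct
import zlib

HEADER = "<II"  # name length, data length (hypothetical record layout)
HEADER_SIZE = struct.calcsize(HEADER)


def pack_subtree(files):
    """Serialize {path: bytes} into one zlib-compressed blob."""
    raw = io.BytesIO()
    for path, data in sorted(files.items()):
        name = path.encode("utf-8")
        raw.write(struct.pack(HEADER, len(name), len(data)))
        raw.write(name)
        raw.write(data)
    return zlib.compress(raw.getvalue())


class MountedSubtree:
    """Fully decompressed, in-memory view of a packed subtree."""

    def __init__(self, blob):
        # The whole blob has to be decompressed before any file is readable.
        raw = zlib.decompress(blob)
        self._files = {}
        offset = 0
        while offset < len(raw):
            name_len, data_len = struct.unpack_from(HEADER, raw, offset)
            offset += HEADER_SIZE
            name = raw[offset:offset + name_len].decode("utf-8")
            offset += name_len
            self._files[name] = raw[offset:offset + data_len]
            offset += data_len

    def listdir(self):
        return sorted(self._files)

    def open(self, path):
        """Return a read-only file-like object for one member."""
        return io.BytesIO(self._files[path])


# Made-up example content for one directory.
subtree = {
    "app-misc/example/ChangeLog": b"initial import\n" * 50,
    "app-misc/example/example-1.0.ebuild": b'DESCRIPTION="An example"\n',
}
blob = pack_subtree(subtree)
mounted = MountedSubtree(blob)
first_line = mounted.open("app-misc/example/example-1.0.ebuild").readline()
```

The sketch also shows the drawback mentioned above: nothing in the subtree
can be read until the complete blob has been decompressed into memory.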

I can already hear a lot of objections coming (for example, "what about
rsync?"), but implementing an rsync-like protocol on top of this filesystem
in Python is certainly doable. Even reimplementing rsync in Python (or at
least the subset of the protocol needed to load the tree, as we're only doing
downloads from the server, not uploads) shouldn't be a real problem.

Now, the reason I'm posting is simply to ask whether anyone among the other
gentoo developers is interested in following this project with me, working on
it together, or anything like that. Feel free to mail me.

As a side note, I can't release PyVFS yet, as I'm bound to my employer on
that, but I've been given the green light to release it under the LGPL
sometime at the beginning of next week... I just need that little signature
from my boss... :)

Heiko Wundram.

--
gentoo-dev@g.o mailing list