Hiya Emma,

Good luck with your project. A few things to be wary of are disk I/O,
metadata cache backends and overlays.

Disk I/O can be a significant bottleneck. Loading a lot of files from
disk (be it the metadata cache or anything else) can take a long time
the first time round, but once they're cached in RAM they'll be much
faster to access in the future.
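
To make the cold versus warm difference concrete, here's a rough Python
sketch that reads the same set of files twice and times each pass. The
helper names (make_sample_files, timed_read) and the throwaway files are
made up for illustration and have nothing to do with Portage's real
cache layout. Freshly written files are usually already in the page
cache, so to see a genuinely cold first pass you'd want to point
timed_read at an existing directory shortly after a reboot.

```python
import os
import tempfile
import time


def read_all(paths):
    """Read every file fully and return the total number of bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb") as handle:
            total += len(handle.read())
    return total


def timed_read(paths):
    """Return (bytes_read, seconds_taken) for one pass over the files."""
    start = time.perf_counter()
    total = read_all(paths)
    return total, time.perf_counter() - start


def make_sample_files(directory, count=200, size=4096):
    """Create hypothetical stand-ins for cache entries (not Portage data)."""
    paths = []
    for index in range(count):
        path = os.path.join(directory, f"entry-{index}")
        with open(path, "wb") as handle:
            handle.write(b"x" * size)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        paths = make_sample_files(directory)
        first_bytes, first_time = timed_read(paths)
        second_bytes, second_time = timed_read(paths)
        print(f"first pass:  {first_bytes} bytes in {first_time:.4f}s")
        print(f"second pass: {second_bytes} bytes in {second_time:.4f}s")
```

The timings will vary a lot between machines and filesystems, so treat
the numbers as a feel for the effect rather than a benchmark.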

Portage allows its internal metadata cache to be stored in a variety of
formats, as long as there's a backend to support it. This means simple
speedups can be achieved using cdb or sqlite (if you google these and
portage you'll get gentoo-wiki tips, which unfortunately you'll have to
read from Google's cache at the moment). It also means that if you want
to make use of this metadata from within portage, you'll have to rely on
the API to tell the backend to get you all the data (and it may be
difficult to speed up without writing your own backend).
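
As a rough idea of what an sqlite-backed search index could look like,
here's a minimal sketch using Python's standard sqlite3 module. The
table layout, the function names and the sample entries are all
assumptions of mine for illustration. This is not Portage's actual
sqlite cache backend, and it doesn't touch the Portage API at all.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the index database with a single packages table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages ("
        "cpv TEXT PRIMARY KEY, description TEXT NOT NULL)"
    )
    return conn


def store_entries(conn, entries):
    """Insert or update (cpv, description) pairs in one transaction."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO packages (cpv, description) VALUES (?, ?)",
            entries,
        )


def search(conn, term):
    """Return sorted cpvs whose name or description contains term.

    The term is used as a plain substring; LIKE wildcards inside it are
    not escaped in this sketch.
    """
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT cpv FROM packages "
        "WHERE cpv LIKE ? OR description LIKE ? ORDER BY cpv",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_index()
    # Made-up entries, purely for demonstration.
    store_entries(conn, [
        ("app-editors/example-editor-1.0", "A made-up text editor"),
        ("app-misc/sample-tool-2.0", "A made-up helper tool"),
    ])
    print(search(conn, "editor"))
```

Keeping everything in one database file means a search is a single
query rather than a walk over many small files, which is where the
speedup would come from if it helps at all.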

Finally there are overlays. Since these can change outside of an
"emerge --sync" (as indeed can the main tree), you'll have to reindex
them before each search request, or give the user stale data until they
manually reindex.
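
One possible middle ground is to check whether anything has changed
since the index was built, and only reindex when it has. Here's a
minimal sketch, assuming file and directory modification times are a
good enough signal. The function names and the temporary directory
standing in for an overlay are hypothetical. Bear in mind the walk
itself hits the disk, so it isn't free either.

```python
import os
import tempfile


def newest_mtime(root):
    """Return the most recent modification time found at or under root."""
    newest = os.stat(root).st_mtime
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                mtime = os.lstat(os.path.join(dirpath, name)).st_mtime
            except OSError:
                continue  # entry vanished while we were walking
            newest = max(newest, mtime)
    return newest


def needs_reindex(roots, indexed_at):
    """True if anything under any root changed after the index was built."""
    return any(newest_mtime(root) > indexed_at for root in roots)


if __name__ == "__main__":
    # A temporary directory stands in for an overlay (hypothetical layout).
    with tempfile.TemporaryDirectory() as overlay:
        indexed_at = newest_mtime(overlay)
        print("stale before change:", needs_reindex([overlay], indexed_at))
        ebuild = os.path.join(overlay, "example-1.0.ebuild")
        with open(ebuild, "w") as handle:
            handle.write("# placeholder\n")
        # Push the mtime forward so the change is visible regardless of
        # filesystem timestamp resolution.
        os.utime(ebuild, (indexed_at + 60, indexed_at + 60))
        print("stale after change:", needs_reindex([overlay], indexed_at))
```

Directory mtimes are included so that deleted files are noticed too,
since removing a file updates the mtime of the directory that held it.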

If you're interested in implementing this in python, you may also want
to look at pkgcore, another package manager that can handle the main
tree and is also implemented in python. From what I understand, its
code-base is similar to portage's, but its internal architecture may
have changed a lot.

I hope some of that helps, and isn't off-putting. I look forward to
seeing the results! :)

Mike :)