On Mon, Sep 19, 2011 at 9:35 AM, Dirkjan Ochtman <djc@g.o> wrote:
> On Mon, Sep 19, 2011 at 00:27, "Paweł Hajdan, Jr."
> <phajdan.jr@g.o> wrote:
>> Okay, I think this is pretty cool and we should find it a new home in
>> the Gentoo infrastructure.
>>
>> I was thinking about http://qa-reports.gentoo.org/ with the repo at
>> http://git.overlays.gentoo.org/gitweb/?p=proj/qa-scripts.git;a=summary
>>
>> I can act as a proxy committer and reviewer for that code. Could you
>> break it up into some smaller parts (preferably backend first) and send
>> to me for review (if you're interested)?
>>
>> How long does it take to generate the reports?
>
> +1 I think it would be good to run this on Gentoo infra, and I
> wouldn't mind helping out.
>
> Bikeshedding: not sure "reports" is the best name for this, as reports
> implies something more static?
|
Here is how it works: each week I launch this script on my server.
I've got ~30 trees installed with layman. The server is an AMD X2
4600+ with 4GB of RAM and two 80GB hard drives in RAID 1 using ext4.
My network bandwidth is 20Mbps down, 1Mbps up.
|
#!/bin/sh

## Set up some vars to use the local portage tree
export PATH=${HOME}/euscan/bin:${PATH}
export PYTHONPATH=${HOME}/euscan/pym:${PYTHONPATH}
export PORTAGE_CONFIGROOT=${HOME}/local
export ROOT=${HOME}/local
export EIX_CACHEFILE=${HOME}/local/var/cache/eix

## Go to euscanwww dir
cd ${HOME}/euscan/euscanwww/

## Update local trees
## Bottleneck: disk and network bandwidth
## Time: less than 30mn
emerge --sync --root=${ROOT} --config-root=${PORTAGE_CONFIGROOT}
ROOT="/" layman -S --config=${ROOT}/etc/layman/layman.cfg

## Also update eix database, because we use eix internally
## Bottleneck: disk and cpu
## Time: 30mn ~ 1h
eix-update

## Scan portage (packages, versions)
## Bottleneck: disk and cpu
## Time: < 15mn
## Note: this script uses eix to get a list of packages and versions
python manage.py scan-portage --all --purge-versions --purge-packages

## Scan metadata (herds, maintainers, homepages, ...)
## Bottleneck: disk
## Time: 1h ~ 1h30
## Note: this script uses gentoolkit to fetch metadata
python manage.py scan-metadata --all --progress

## Scan upstream packages
## Bottleneck: disk, network bandwidth and latency, cpu
## Time: up to 6h
## Note: euscan is called on each package. euscan has a slow startup
## caused by gentoolkit/portage.
## gparallel is used here to limit the load caused by euscan,
## and to launch up to 16 euscan instances at a time on this machine.
## This part is the longest, but scales very well.
eix --only-names -x | gparallel --load 4 --jobs 800% euscan >> \
    ${HOME}/logs/euscan-upstream.log
python manage.py scan-upstream --feed --purge-versions < \
    ${HOME}/logs/euscan-upstream.log

## Update counters (6)
## Time: some minutes
## Bottleneck: cpu
## Note: this script could probably be implemented faster using raw SQL queries
python manage.py update-counters
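
As a rough illustration of the raw SQL idea in the update-counters note, here is a minimal sketch using Python's sqlite3 module on an in-memory database. The `version` table, its columns, the function name and the sample rows are all hypothetical stand-ins, not euscanwww's actual schema or data.

```python
import sqlite3


def count_versions_by_category(conn):
    """Return {category: (packaged_count, upstream_only_count)} in one query."""
    rows = conn.execute(
        """
        SELECT category,
               SUM(packaged),
               SUM(1 - packaged)
        FROM version
        GROUP BY category
        ORDER BY category
        """
    ).fetchall()
    return {category: (in_tree, upstream) for category, in_tree, upstream in rows}


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE version "
    "(category TEXT, package TEXT, version TEXT, packaged INTEGER)"
)
conn.executemany(
    "INSERT INTO version VALUES (?, ?, ?, ?)",
    [
        ("app-accessibility", "brltty", "1.0", 1),  # version in the tree
        ("app-accessibility", "brltty", "1.1", 0),  # version only upstream
        ("cat-example", "pkg-example", "1.0", 1),
    ],
)
counters = count_versions_by_category(conn)
```

A single GROUP BY pass lets the database do the counting, instead of loading every row into Python objects first.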
|
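Adding up the upper bounds from the script comments gives a worst-case figure for a full weekly run. This is only a small sketch: the 10 minutes for update-counters is an assumed reading of "some minutes", while the other numbers are taken from the comments in the script.

```python
# Upper bound per step, in minutes, from the comments in the script.
UPPER_BOUNDS_MIN = {
    "sync trees": 30,        # "less than 30mn"
    "eix-update": 60,        # "30mn ~ 1h"
    "scan-portage": 15,      # "< 15mn"
    "scan-metadata": 90,     # "1h ~ 1h30"
    "scan-upstream": 360,    # "up to 6h"
    "update-counters": 10,   # assumption: "some minutes" read as 10
}

total = sum(UPPER_BOUNDS_MIN.values())
hours, minutes = divmod(total, 60)
print(f"worst case for a full run: {hours}h{minutes:02d}")
```

The scan-upstream step dominates the total, so it is the step where parallelism matters most.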
|
> Also not sure how much it has to do
> with QA.
> How much of it constitutes the backend, in your opinion? It seems
> there are two parts, right now:
>
> 1. euscan script, to find new versions for a single package
> 2. the django www app, including storage for the version data
|
Yes, exactly. Here is how the tree is structured currently:

euscan script

bin/ -- contains the euscan python "binary"
pym/ -- contains most of the code used by the euscan script
pym/euscan/handlers -- contains specific site handlers (rubygems,
pypi, pecl, pear, ...)

euscanwww django app

euscanwww/ -- contains all the stuff for the django application; all
the django application needs is a working portage tree and euscan
available in the $PATH
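
The two requirements for euscanwww (a working portage tree and euscan in the $PATH) can be checked up front. The sketch below is not part of euscan: the function name is hypothetical, and treating a `profiles` subdirectory as the sign of a usable tree is an assumption made for illustration.

```python
import os
import shutil


def missing_prerequisites(portage_tree, executable="euscan"):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # shutil.which searches $PATH the same way a shell lookup would.
    if shutil.which(executable) is None:
        problems.append(f"{executable} not found in $PATH")
    # Assumption: a usable portage tree has a "profiles" subdirectory.
    if not os.path.isdir(os.path.join(portage_tree, "profiles")):
        problems.append(f"no portage tree found at {portage_tree}")
    return problems


if __name__ == "__main__":
    # "/usr/portage" is an assumed location; adjust it to the local setup.
    for problem in missing_prerequisites("/usr/portage"):
        print("missing:", problem)
```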
|
> IMO it would be nice to have a somewhat generic REST-style service
> exposing the data, and build a simple UI on top of that. In
> particular, I have different ideas about what the UI should look like,
> so it would be nice if different people could experiment (and/or
> integrate in other services like znurt.org).
|
I already added some very basic JSON formatting (note that it also
exposes the internal key id, which is bad, but I just wanted to
experiment).
All you need is to append "/json" to a URL. For example:

- http://euscan.iksaif.net/maintainers/4/json
- http://euscan.iksaif.net/package/app-accessibility/brltty/json
|
This could be a lot better; we just need to define an API and the
implementation will be easy.
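
To make the "/json" convention and the internal-id concern concrete, here is a minimal sketch. Both helper names are hypothetical, and the record's field names are illustrative only; they do not describe what euscanwww actually returns.

```python
import json


def json_url(page_url):
    """Append "/json" to a euscanwww page URL."""
    return page_url.rstrip("/") + "/json"


def to_public(record, internal_keys=("id",)):
    """Copy a record, dropping internal database keys before serializing."""
    return {key: value for key, value in record.items() if key not in internal_keys}


# Hypothetical record: the field names are made up for this example.
record = {"id": 42, "category": "app-accessibility", "package": "brltty"}
payload = json.dumps(to_public(record), sort_keys=True)

print(json_url("http://euscan.iksaif.net/maintainers/4"))
print(payload)
```

Filtering at serialization time keeps database keys out of the public output without changing how the data is stored.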
|
|
A first step would be to make an ebuild for euscan, and another for
euscanwww, so that anyone can easily install it and play with it.
Feel free to ping me on IRC; I'm on #gentoo-sunrise, and my nickname is "iksaif".
|
--
Corentin Chary
http://xf.iksaif.net