Gentoo Archives: gentoo-dev

From: Corentin Chary <corentin.chary@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] euscan proof of concept (like debian's uscan)
Date: Mon, 19 Sep 2011 08:40:55
Message-Id: CAHR064h_B6WB9o7aCtZwObq5AwW1YUbY-MH8eb8W9WsbWi1eKA@mail.gmail.com
In Reply to: Re: [gentoo-dev] euscan proof of concept (like debian's uscan) by Dirkjan Ochtman
On Mon, Sep 19, 2011 at 9:35 AM, Dirkjan Ochtman <djc@g.o> wrote:
> On Mon, Sep 19, 2011 at 00:27, "Paweł Hajdan, Jr."
> <phajdan.jr@g.o> wrote:
>> Okay, I think this is pretty cool and we should find it a new home in
>> the Gentoo infrastructure.
>>
>> I was thinking about http://qa-reports.gentoo.org/ with the repo at
>> http://git.overlays.gentoo.org/gitweb/?p=proj/qa-scripts.git;a=summary
>>
>> I can act as a proxy committer and reviewer for that code. Could you
>> break it up into some smaller parts (preferably backend first) and send
>> to me for review (if you're interested)?
>>
>> How long does it take to generate the reports?
>
> +1 I think it would be good to run this on Gentoo infra, and I
> wouldn't mind helping out.
>
> Bikeshedding: not sure "reports" is the best name for this, as reports
> implies something more static?

Here is how it works: each week I launch this script on lt server.
I've got ~30 trees installed with layman. The server is an AMD X2
4600+ with 4GB of RAM and two 80GB HDs in RAID1 using ext4. My network
bandwidth is 20Mbps down, 1Mbps up.

#!/bin/sh

## Set up some vars to use the local portage tree
export PATH=${HOME}/euscan/bin:${PATH}
export PYTHONPATH=${HOME}/euscan/pym:${PYTHONPATH}
export PORTAGE_CONFIGROOT=${HOME}/local
export ROOT=${HOME}/local
export EIX_CACHEFILE=${HOME}/local/var/cache/eix

## Go to euscanwww dir
cd ${HOME}/euscan/euscanwww/

## Update local trees
## Bottleneck: disk and network bandwidth
## Time: less than 30 min
emerge --sync --root=${ROOT} --config-root=${PORTAGE_CONFIGROOT}
ROOT="/" layman -S --config=${ROOT}/etc/layman/layman.cfg

## Also update the eix database, because we use eix internally
## Bottleneck: disk and cpu
## Time: 30 min ~ 1h
eix-update

## Scan portage (packages, versions)
## Bottleneck: disk and cpu
## Time: < 15 min
## Note: this script uses eix to get a list of packages and versions
python manage.py scan-portage --all --purge-versions --purge-packages

## Scan metadata (herds, maintainers, homepages, ...)
## Bottleneck: disk
## Time: 1h ~ 1h30
## Note: this script uses gentoolkit to fetch metadata
python manage.py scan-metadata --all --progress

## Scan upstream packages
## Bottleneck: disk, network bandwidth and latency, cpu
## Time: up to 6h
## Note: euscan is called on each package. euscan has a slow startup
## caused by gentoolkit/portage.
## gparallel is used here to limit the load caused by euscan,
## and to launch up to 16 euscan instances at a time on this machine.
## This part is the longest, but scales very well.
eix --only-names -x | gparallel --load 4 --jobs 800% euscan >> ${HOME}/logs/euscan-upstream.log
python manage.py scan-upstream --feed --purge-versions < ${HOME}/logs/euscan-upstream.log

## Update counters (6)
## Time: some minutes
## Bottleneck: cpu
## Note: this script could probably be implemented faster using raw SQL queries
python manage.py update-counters


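To make the upstream scan step more concrete, here is a rough Python sketch of the same idea: run one scanner process per package, with a cap on how many run at once. This is not how euscanwww does it (the script above uses gparallel). The names scan_one and scan_all are made up for illustration, and unlike "gparallel --load 4" the sketch does not look at the system load, it only bounds the number of concurrent processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def scan_one(command, package):
    """Run `command package` and return (package, return code, stdout)."""
    result = subprocess.run(
        list(command) + [package],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return package, result.returncode, result.stdout


def scan_all(packages, command=("euscan",), jobs=16):
    """Scan every package, running at most `jobs` processes at a time.

    Threads are enough here: each worker just waits for its child process.
    Results come back in the same order as `packages`.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda pkg: scan_one(command, pkg), packages))
```

Assuming euscan is in the $PATH, scan_all(["app-accessibility/brltty"]) would return a one-element list whose last field is whatever euscan printed for that package. The default of 16 jobs only mirrors the figure quoted in the script comments.
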
> Also not sure how much it has to do
> with QA.
> How much of it constitutes the backend, in your opinion? It seems
> there are two parts, right now:
>
> 1. euscan script, to find new versions for a single package
> 2. the django www app, including storage for the version data

Yes, exactly. Here is how the tree is structured currently:

euscan script

bin/ -- contains the euscan python "binary"
pym/ -- contains most of the code used by the euscan script
pym/euscan/handlers -- contains specific site handlers (rubygems,
pypi, pecl, pear, ..)

euscanwww django app

euscanwww/ -- contains all the stuff for the django application; all
the django application needs is a working portage tree and euscan
available in the $PATH

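As an illustration of what "site handlers" means in the layout above, here is a small Python sketch that picks a handler from the host of an upstream URL. It is purely hypothetical: the HANDLERS table, the pick_handler function and the "generic" fallback are invented for this example and are not the actual interface of pym/euscan/handlers.

```python
from urllib.parse import urlparse

# Hypothetical mapping from upstream host to handler name.
# The real handlers may select themselves in a completely different way.
HANDLERS = {
    "rubygems.org": "rubygems",
    "pypi.python.org": "pypi",
    "pecl.php.net": "pecl",
    "pear.php.net": "pear",
}


def pick_handler(url, default="generic"):
    """Return the name of the handler responsible for `url`."""
    host = urlparse(url).hostname or ""
    for domain, name in HANDLERS.items():
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return name
    return default
```

The point of such a split is that sites with their own index can be queried directly, while everything else falls back to a generic scan of the download location.
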
> IMO it would be nice to have a somewhat generic REST-style service
> exposing the data, and build a simple UI on top of that. In
> particular, I have different ideas about what the UI should look like,
> so it would be nice if different people could experiment (and/or
> integrate in other services like znurt.org).

I already added some very basic JSON formatting (note that it also
exposes the internal key id, which is bad, but I just wanted to
experiment).
All you need to do is append "/json" to a URL. For example:

- http://euscan.iksaif.net/maintainers/4/json
- http://euscan.iksaif.net/package/app-accessibility/brltty/json

This could be a lot better; we just need to define an API and the
implementation will be easy.

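For anyone who wants to experiment with the "/json" URLs, a minimal Python client could look like the sketch below. The json_url and fetch helpers are illustrative names, not part of euscan, and since no API is defined yet the sketch makes no assumption about the keys in the returned document.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://euscan.iksaif.net"


def json_url(path, base=BASE_URL):
    """Build the JSON URL for a page path such as 'maintainers/4'."""
    return "{}/{}/json".format(base.rstrip("/"), path.strip("/"))


def fetch(path, base=BASE_URL, timeout=30):
    """Fetch and decode the JSON document for `path` (needs network access)."""
    with urlopen(json_url(path, base), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, fetch("package/app-accessibility/brltty") requests the second URL listed above and returns the decoded document as plain Python data.
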

A first step would be to make an ebuild for euscan, and another for
euscanwww, so that anyone can easily install it and play with it.
Feel free to ping me on IRC: I'm on #gentoo-sunrise, my nickname is "iksaif".

--
Corentin Chary
http://xf.iksaif.net
