Gentoo Archives: gentoo-soc

From: "Domen Kožar" <domen@×××.si>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] Project Grumpy - weekly report #1
Date: Tue, 01 Jun 2010 07:19:08
Message-Id: 1275376741.5272.77.camel@oblak.fubar.si
In Reply to: [gentoo-soc] Project Grumpy - weekly report #1 by Priit Laes
1 Also, good luck with thermodynamics;)
2
3 On Tue, 2010-06-01 at 10:10 +0300, Priit Laes wrote:
4 > This is a weekly progress report no. 1 for Project Grumpy.
5 >
6 > As this is the first publicly visible announcement, I am also going to
7 > give a short overview of the project itself.
8 >
9 > The aim of this project is to create a database containing various
10 > developer-related metadata about packages in the Gentoo portage tree.
11 > The metadata we are going to store can be used for different purposes;
12 > examples include upstream version checks and notifying developers who
13 > are interested in a particular package. Eventually the project should
14 > also provide a nice web interface and API for accessing this data.
15 >
16 > The project's semi-official IRC channel is #gentoo-grumpy on the
17 > Freenode network. Just step in and say "Hi!" :)
18 >
19 > Last week's progress report
20 > ===========================
21 >
22 > My first week went a bit slowly due to having some "unfinished business"
23 > that I needed to finish, and also because of two exams (which went
24 > fine).
25 >
26 > The core issue I wrestled with this week was how to keep the portage
27 > tree and the database contents in sync, i.e. when an ebuild is
28 > modified, removed or added, how to make sure that the database
29 > contents correspond to the contents of the portage tree.
30 >
31 > The solution that I came up with is to use a simple daemon that logs
32 > changes to the portage tree and modifies the database contents when
33 > appropriate. Appropriate here means that we shouldn't apply updates
34 > while the tree itself is being updated, as that might be unsafe (e.g.
35 > a package rename). So currently it seems that the daemon also has to
36 > initiate the rsync process and push the updates into the database
37 > after rsync has finished successfully. (You can already see how all
38 > kinds of weird corner cases start popping up :P )
39 >
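A rough sketch of the "log during rsync, push afterwards" idea described above, in case it helps. This is only an illustration under my own assumptions: the class name, the method names and the event labels are made up, and nothing here talks to inotify or to a real database.

```python
from collections import OrderedDict


class ChangeBuffer:
    """Collect change events during a sync and release them afterwards.

    Hypothetical helper: names and event labels are illustrative only.
    """

    def __init__(self):
        # path -> most recent event; a later event for the same path
        # replaces the earlier one and moves the path to the end.
        self._pending = OrderedDict()
        self.syncing = False

    def start_sync(self):
        """Mark the start of an rsync run."""
        self.syncing = True

    def record(self, path, event):
        """Remember an event ("added", "modified" or "removed") for a path."""
        self._pending.pop(path, None)
        self._pending[path] = event

    def finish_sync(self, success):
        """Return the batch to push into the database.

        If the sync failed, nothing is released and the pending events
        are kept until a later sync succeeds.
        """
        self.syncing = False
        if not success:
            return []
        batch = list(self._pending.items())
        self._pending.clear()
        return batch
```

The point of the design is that the database only ever sees the state of the tree after a complete, successful sync, so intermediate states such as a half-finished package rename never reach it.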
40 > My current approach to logging is using the inotify [1] framework
41 > present in the Linux kernel since 2.6.13 (sorry BSD users, but this is
42 > Gentoo Linux after all) with the help of pyinotify [2].
43 > So far there's only one drawback to using inotify: by default the
44 > kernel allows only 8192 watches per user (and portage contains a lot
45 > of directories), so in order to use this approach one has to bump the
46 > number of watches using the /proc/sys/fs/inotify/max_user_watches
47 > tunable. 81920 has worked fine so far on my machine ;)
48 >
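For the watch limit mentioned above, a small sketch that compares the number of directories in a tree with the current value of the tunable. It assumes one watch per directory; the function names are made up for illustration, and only the /proc path comes from the report itself.

```python
import os

# Tunable named in the report; it only exists on Linux.
MAX_WATCHES_PATH = "/proc/sys/fs/inotify/max_user_watches"


def count_directories(root):
    """Count root itself plus every directory below it.

    os.walk yields one entry per directory and does not descend into
    symlinked directories by default.
    """
    return sum(1 for _ in os.walk(root))


def read_watch_limit(path=MAX_WATCHES_PATH):
    """Read the current inotify watch limit from /proc."""
    with open(path) as handle:
        return int(handle.read().strip())


def watches_suffice(root, limit):
    """True if one watch per directory under root fits within limit."""
    return count_directories(root) <= limit
```

Calling watches_suffice(tree_path, read_watch_limit()) with the path of the local portage tree would tell whether the limit needs bumping before the daemon starts.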
49 > There was also a secondary approach suggested by my mentor Leio, to
50 > parse rsync log files, but I am a bit reluctant about this idea.
51 >
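To make the rsync-log alternative a bit more concrete, here is a minimal sketch that pulls ebuild changes out of rsync's itemized output (the -i / --itemize-changes format). It assumes each line is an itemize string followed by a space and the path, and that deletions are reported as "*deleting"; the function names and event labels are made up, and a real log may need a different format string.

```python
def parse_itemized_line(line):
    """Turn one itemized rsync line into (event, path), or None.

    Assumes lines such as:
      >f+++++++++ app-misc/foo/foo-1.0.ebuild   (new file)
      >f.st...... app-misc/foo/foo-0.9.ebuild   (changed file)
      *deleting   app-misc/bar/bar-2.ebuild     (deleted)
    """
    line = line.rstrip("\n")
    flags, _, path = line.partition(" ")
    if flags == "*deleting":
        return ("removed", path.lstrip(" "))
    # Only regular files that were actually transferred are of interest.
    if len(flags) < 3 or flags[0] not in "<>" or flags[1] != "f":
        return None
    if flags[2] == "+":
        return ("added", path)
    return ("modified", path)


def ebuild_changes(lines):
    """Return (event, path) pairs for ebuild files only."""
    changes = []
    for line in lines:
        parsed = parse_itemized_line(line)
        if parsed is not None and parsed[1].endswith(".ebuild"):
            changes.append(parsed)
    return changes
```

One thing in favour of this route is that no watches are needed at all; the drawback is that it ties the daemon to the exact output format of the rsync version in use.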
52 > Anyway, I'll leave this idea simmering here for a while and unless
53 > someone comes up with a better idea (yes, I have also thought about
54 > scanning the whole portage tree every x hours), I'm going to implement
55 > the daemon.
56 >
57 > Plans for current week
58 > ======================
59 >
60 > As I currently consider the core issue solved, the next issue I have to
61 > solve is how to take an ebuild, extract information from it and store
62 > it in the database. (Hint: pkgcore)
63 >
64 > I'm not going to take on bigger tasks because I still have one quite
65 > hard exam (thermodynamics and statistical physics) on the 4th of June.
66 > And if I pass, it is the last one.
67 >
68 > PS. Sorry, no blog yet. I was using Zine, but it broke after I updated
69 > my system to SQLAlchemy-0.6.
70 >
71 > [1] http://en.wikipedia.org/wiki/Inotify
72 > [2] http://trac.dbzteam.org/pyinotify
73 >
74 > Päikest,
75 > Priit Laes :)
76 >
