Gentoo Archives: gentoo-soc

From: Priit Laes <plaes@×××××.org>
To: gentoo-soc@l.g.o
Cc: leio@g.o, ferringb@g.o
Subject: [gentoo-soc] Project Grumpy - weekly report #1
Date: Tue, 01 Jun 2010 07:10:50
Message-Id: 1275376236.31167.11.camel@chi
1 This is weekly progress report no. 1 for Project Grumpy.
2
3 As this is the first publicly visible announcement, I am also going to
4 give a short overview about the project itself.
5
6 The aim of this project is to create a database containing various
7 developer-related metadata about packages in the Gentoo Portage tree.
8 The metadata that we are going to store can be used for different
9 purposes; examples include upstream version checks and notifications
10 for developers who are interested in a particular package. Eventually
11 there will also be a nice web and API interface for accessing this data.
12
13 The project's semi-official IRC channel is #gentoo-grumpy on the Freenode
14 network. Just step in and say "Hi!" :)
15
16 Last week's progress report
17 ===========================
18
19 My first week went a bit slowly due to having some "unfinished business"
20 that I needed to finish, and also because of two exams (which went
21 fine).
22
23 The core issue I wrestled with this week was how to keep the Portage
24 tree and the database contents in sync, i.e. when an ebuild is modified,
25 removed or added, how to make sure that the database contents correspond
26 to the Portage tree.
27
28 The solution that I came up with is to use a simple daemon that logs
29 changes to the Portage tree and modifies the database contents when it
30 is appropriate. Appropriate here means that we shouldn't apply logged
31 updates while the tree itself is being updated, as that might be unsafe
32 (e.g. a package rename). So currently it seems that the daemon also has
33 to initiate the rsync process and push the updates into the database
34 after rsync has finished successfully. (You can already see how all
35 kinds of weird corner cases start popping up :P )
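
A minimal sketch of the "buffer during the sync, push afterwards" idea described above. The names PendingChanges, sync_and_apply, run_sync and apply_change are hypothetical; run_sync stands for whatever starts rsync and returns its exit status, and apply_change stands for the database update. Keeping the buffer after a failed sync is an assumption of this sketch, not something decided in the report.

```python
class PendingChanges:
    """Buffer of tree changes seen by the watcher, keyed by path."""

    def __init__(self):
        self._events = {}  # path -> most recent change kind

    def record(self, path, kind):
        # A later event for the same path replaces the earlier one.
        self._events[path] = kind

    def drain(self):
        """Return all buffered (path, kind) pairs, sorted, and empty the buffer."""
        events, self._events = self._events, {}
        return sorted(events.items())

    def __len__(self):
        return len(self._events)


def sync_and_apply(run_sync, pending, apply_change):
    """Run the sync; push buffered changes only if it returned status 0.

    Returns the number of changes applied. On failure the buffer is kept
    (an assumption of this sketch), so the changes can be pushed after
    the next successful sync.
    """
    if run_sync() != 0:
        return 0
    changes = pending.drain()
    for path, kind in changes:
        apply_change(path, kind)
    return len(changes)
```

In a real daemon run_sync could wrap subprocess.call() around the sync command, and record() would be called from the inotify event handler while the sync is running.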
36
37 My current approach to logging is to use the inotify [1] framework,
38 present in the Linux kernel since 2.6.13 (sorry BSD users, but this is
39 Gentoo Linux after all), with the help of pyinotify [2].
40 So far there's only one drawback to using inotify: by default the kernel
41 allows only 8192 watches per user (and the Portage tree contains a lot
42 of directories), so in order to use this approach one has to raise the
43 limit using the /proc/sys/fs/inotify/max_user_watches tunable.
44 81920 has worked fine so far on my machine ;)
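
Since recursive inotify watching needs one watch per directory, the limit can be checked against the tree before the daemon starts. The sketch below uses only the standard library; the function names are hypothetical, and the only path taken from the report is the max_user_watches tunable.

```python
import os

WATCH_LIMIT_FILE = "/proc/sys/fs/inotify/max_user_watches"


def read_watch_limit(path=WATCH_LIMIT_FILE):
    """Read the current inotify watch limit from the given tunable file."""
    with open(path) as fh:
        return int(fh.read().strip())


def count_directories(root):
    """Count root itself plus every directory below it.

    os.walk() yields one tuple per directory and does not descend into
    symlinked directories by default.
    """
    return sum(1 for _dirpath, _dirs, _files in os.walk(root))


def watches_suffice(root, limit):
    """True if one watch per directory under root fits within limit."""
    return count_directories(root) <= limit
```

For example, watches_suffice("/usr/portage", read_watch_limit()) would tell whether the limit has to be raised, assuming the tree lives in /usr/portage.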
45
46 There was also a secondary approach suggested by my mentor Leio, to parse
47 rsync log files, but I am a bit reluctant about this idea.
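
For comparison, a rough sketch of what the log-parsing approach could look like. It assumes the log holds rsync's --itemize-changes output (lines such as ">f+++++++++ path" for a new file and "*deleting   path" for a removal); the report does not say which log format was meant, and parse_itemized is a hypothetical name.

```python
def parse_itemized(lines):
    """Turn itemized rsync output into (path, kind) pairs for ebuilds.

    kind is "added", "modified" or "removed". Directories, non-ebuild
    files and lines that do not look like itemized output are skipped.
    """
    changes = []
    for raw in lines:
        parts = raw.rstrip("\n").split(None, 1)
        if len(parts) != 2:
            continue
        flags, path = parts
        if not path.endswith(".ebuild"):
            continue
        if flags == "*deleting":
            changes.append((path, "removed"))
        elif flags.startswith(">f"):
            # A newly created file has every attribute column set to "+".
            is_new = set(flags[2:]) == {"+"}
            changes.append((path, "added" if is_new else "modified"))
    return changes
```

A parser like this only sees what rsync chose to report, so it depends on the exact options the sync was run with.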
48
49 Anyway, I'll leave this idea simmering here for a while and unless
50 someone comes up with a better idea (yes, I have also thought about
51 scanning the whole Portage tree every x hours), I'm going to implement
52 the daemon.
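
The periodic-scan alternative mentioned in passing could be sketched as taking a snapshot of the tree and diffing it against the previous one. This is only an illustration with hypothetical function names; comparing modification time and size is an assumption, and a checksum would be a stricter choice.

```python
import os


def snapshot(root):
    """Map each ebuild's path (relative to root) to (mtime_ns, size)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".ebuild"):
                full = os.path.join(dirpath, name)
                st = os.stat(full)
                state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return sorted lists of added, removed and modified paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, modified
```

The cost is a full walk of the tree on every run, which is what makes the event-driven daemon attractive.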
53
54 Plans for current week
55 ======================
56
57 As I currently consider the core issue solved, the next issue I have to
58 solve is how to take an ebuild, extract information from it and store
59 it in the database. (Hint: pkgcore)
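
As a placeholder for that step, here is a sketch that does not use pkgcore at all: it takes only what the category/package/package-version.ebuild path layout already gives and stores it with the standard library's sqlite3 module. The table name, schema and function names are hypothetical; the real metadata extraction is left to pkgcore and is not shown here.

```python
import sqlite3


def split_ebuild_path(path):
    """Split .../category/package/package-version.ebuild into its parts.

    The package name is taken from the directory, so hyphens and digits
    inside the name do not confuse the version split.
    """
    category, package, filename = path.split("/")[-3:]
    prefix, suffix = package + "-", ".ebuild"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        raise ValueError("unexpected ebuild path: %r" % path)
    return category, package, filename[len(prefix):-len(suffix)]


def store(conn, path):
    """Insert (or refresh) one ebuild row in a hypothetical 'ebuilds' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ebuilds ("
        "category TEXT, package TEXT, version TEXT, "
        "PRIMARY KEY (category, package, version))"
    )
    conn.execute(
        "INSERT OR REPLACE INTO ebuilds VALUES (?, ?, ?)",
        split_ebuild_path(path),
    )
    conn.commit()
```

Calling store() twice for the same ebuild leaves a single row, which suits a daemon that may see several events for one file.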
60
61 I'm not going to take on bigger tasks because I still have one quite hard
62 exam (thermodynamics and statistical physics) on the 4th of June. And if
63 I pass, it is the last one.
64
65 PS. Sorry, no blog yet. I was using Zine, but it broke after I updated
66 my system to SQLAlchemy-0.6.
67
68 [1] http://en.wikipedia.org/wiki/Inotify
69 [2] http://trac.dbzteam.org/pyinotify
70
71 Päikest,
72 Priit Laes :)
