Gentoo Archives: gentoo-soc

From:	"Domen Kožar" <domen@×××.si>
To:	gentoo-soc@l.g.o
Subject:	Re: [gentoo-soc] Project Grumpy - weekly report #1
Date:	Tue, 01 Jun 2010 07:19:08
Message-Id:	`1275376741.5272.77.camel@oblak.fubar.si`
In Reply to:	[gentoo-soc] Project Grumpy - weekly report #1 by Priit Laes

1	Also, good luck with thermodynamics;)
2
3	On Tue, 2010-06-01 at 10:10 +0300, Priit Laes wrote:
4	> This is a weekly progress report no. 1 for Project Grumpy.
5	>
6	> As this is the first publicly visible announcement, I am also going to
7	> give a short overview of the project itself.
8	>
9	> The aim of this project is to create a database containing various
10	> developer-related metadata about packages in the Gentoo Portage tree.
11	> The metadata that we are going to store can be used for different
12	> purposes; examples include upstream version checks and sending
13	> notifications to developers who are interested in a package. Eventually
14	> there should also be a nice web interface and an API to access this data.
15	>
16	> The project's semi-official IRC channel is #gentoo-grumpy on the Freenode
17	> network. Just step in and say "Hi!" :)
18	>
19	> Last week's progress report
20	> ===========================
21	>
22	> My first week went a bit slowly due to having some "unfinished business"
23	> that I needed to finish, and also because of two exams (which went
24	> fine).
25	>
26	> The core issue I wrestled with this week was how to keep the portage
27	> contents and the database contents in sync, i.e. when an ebuild is modified,
28	> removed or added, how to make sure that the database contents correspond to
29	> the portage contents.
30	>
31	> The solution that I came up with is to use a simple daemon that logs
32	> changes to the portage tree and modifies the database contents when it's
33	> appropriate. Appropriate here means that we shouldn't push updates while
34	> the tree is being updated, as that might be unsafe (e.g. a package rename).
35	> So currently it seems that the daemon also has to initiate the rsync process
36	> and push the updates into the database after rsync has finished successfully.
37	> (You can already see how all kinds of weird corner cases start popping
38	> up :P )
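
The "log during the sync, push afterwards" idea can be sketched without any inotify code at all. The following is only a rough illustration, not Grumpy's implementation: the class name TreeChangeLog, the three action names and the coalescing rules are all assumptions made up for this example. Events would be fed in by whatever watches the tree (pyinotify, for instance), and the result of finish_sync() is what would be pushed into the database.

```python
# Hypothetical action names; the real daemon may model changes differently.
ADDED, MODIFIED, REMOVED = "added", "modified", "removed"


class TreeChangeLog:
    """Collect ebuild changes during a sync and release them only on success."""

    def __init__(self):
        self._pending = {}  # ebuild path -> net action since the last flush

    def record(self, action, path):
        """Remember one change, collapsing repeated events for the same path."""
        if not path.endswith(".ebuild"):
            return  # this sketch only tracks ebuilds
        previous = self._pending.get(path)
        if previous == ADDED and action == REMOVED:
            # Created and deleted within one sync: nothing to tell the database.
            del self._pending[path]
        elif previous == ADDED:
            # Still a new file from the database's point of view.
            pass
        elif previous == REMOVED and action == ADDED:
            # Deleted and re-created: treat it as a modification.
            self._pending[path] = MODIFIED
        else:
            self._pending[path] = action

    def finish_sync(self, succeeded):
        """Return sorted (path, action) pairs if the sync succeeded, else nothing.

        After a failed sync the pending changes are kept, so that a later
        successful sync still reports them.
        """
        if not succeeded:
            return []
        changes = sorted(self._pending.items())
        self._pending.clear()
        return changes
```

Keeping the pending changes after a failed rsync is a design choice of this sketch: the tree may be half-updated at that point, so nothing is pushed until a later run completes.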
39	>
40	> My current approach to logging is using the inotify [1] framework
41	> present in the Linux kernel since 2.6.13 (sorry BSD users, but this is
42	> Gentoo Linux after all) with the help of pyinotify [2].
43	> So far there's only one drawback to using inotify: by default the kernel
44	> allows 8192 watches per user (and portage contains a lot of
45	> directories), so in order to use this approach one has to bump the
46	> number of watches using the /proc/sys/fs/inotify/max_user_watches
47	> tunable. 81920 has worked fine so far on my machine ;)
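
Whether the limit is high enough can be checked before any watches are set up. A minimal sketch using only the standard library; the function names are made up for this example, and it assumes one watch per directory, which is how a recursive inotify watch is normally built.

```python
import os

# The tunable mentioned above; only present on Linux.
WATCH_LIMIT_PATH = "/proc/sys/fs/inotify/max_user_watches"


def read_watch_limit(path=WATCH_LIMIT_PATH):
    """Return the current inotify watch limit as an integer."""
    with open(path) as handle:
        return int(handle.read().strip())


def count_directories(root):
    """Count root and every directory below it (symlinks are not followed)."""
    return sum(1 for _ in os.walk(root))


def watches_suffice(root, limit):
    """True if watching every directory under root fits within limit."""
    return count_directories(root) <= limit
```

For example, watches_suffice("/usr/portage", read_watch_limit()) would tell the daemon at startup whether the tunable has to be raised first (the tree location is an assumption here, adjust it to your setup).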
48	>
49	> There was also a secondary approach suggested by my mentor Leio, to parse
50	> rsync log files, but I am a bit reluctant about this idea.
51	>
52	> Anyway, I'll leave this idea simmering here for a while, and unless
53	> someone comes up with a better idea (yes, I have also thought about
54	> scanning the whole portage tree every x hours), I'm going to implement the
55	> daemon.
56	>
57	> Plans for current week
58	> ======================
59	>
60	> As I currently consider the core issue solved, the next issue I have to
61	> solve is how to take an ebuild, extract information from it and store
62	> it in the database. (Hint: pkgcore)
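
Part of that information is already in the path the daemon sees, before any package manager library gets involved. The sketch below is not pkgcore and does not read the ebuild itself; it only splits a tree-relative path into category, package, version and revision. The function name and the returned dict layout are assumptions for this example, and the version pattern is a simplified take on Gentoo version syntax. Real metadata (dependencies, keywords and so on) would still have to come from something like pkgcore.

```python
import re

# category/package/package-version[-rN].ebuild, relative to the tree root.
# The version part is a simplified pattern: digits separated by dots, an
# optional letter, optional _alpha/_beta/_pre/_rc/_p suffixes.
_EBUILD_RE = re.compile(
    r"^(?P<category>[^/]+)/(?P<package>[^/]+)/"
    r"(?P=package)-"
    r"(?P<version>\d+(?:\.\d+)*[a-z]?(?:_(?:alpha|beta|pre|rc|p)\d*)*)"
    r"(?:-r(?P<revision>\d+))?\.ebuild$"
)


def parse_ebuild_path(path):
    """Split a tree-relative ebuild path into its parts, or return None."""
    match = _EBUILD_RE.match(path)
    if match is None:
        return None
    return {
        "category": match.group("category"),
        "package": match.group("package"),
        "version": match.group("version"),
        # A missing -rN suffix is treated as revision 0.
        "revision": int(match.group("revision") or 0),
    }
```

Requiring the file name to start with the package directory name sidesteps the question of where a hyphenated package name ends and the version begins.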
63	>
64	> I'm not going to take on bigger tasks because I still have one quite hard
65	> exam (thermodynamics and statistical physics) on the 4th of June. And if I
66	> pass, it is the last one.
67	>
68	> PS. Sorry, no blog yet. I was using Zine, but it broke after I updated
69	> my system to SQLAlchemy-0.6.
70	>
71	> [1] http://en.wikipedia.org/wiki/Inotify
72	> [2] http://trac.dbzteam.org/pyinotify
73	>
74	> Päikest,
75	> Priit Laes :)
76	>

Attachments

File name	MIME type
signature.asc	application/pgp-signature
