From: Priit Laes <plaes@×××××.org>
To: gentoo-soc <gentoo-soc@l.g.o>
Cc: leio <leio@g.o>, ferringb <ferringb@g.o>
Subject: [gentoo-soc] Project Grumpy - report #3
Date: Sun, 04 Jul 2010 13:03:39
Message-Id: 1278248592.28713.32.camel@chi

This is progress report #3 for Project Grumpy.

Since report #2, the focus of development has changed significantly: we
decided to drop our beloved and also greatly hated NoSQL approach (MongoDB)
and go forward with a regular RDBMS instead, which in our case is good old
PostgreSQL.

Although there were some compelling arguments for MongoDB (ease of use being
my favourite), the biggest nail in its coffin was the lack of "support" for
it from Gentoo's infra team. For them it was just another application they
would have to take care of, and around the interwebs there are lots of
'MongoDB ate my data' reports on how error-prone MongoDB actually is
(although the data volumes in most of these cases were so high that I cannot
really imagine Grumpy running into these problems). I can really understand
their concerns, though. Besides, if you take a look at the list of commits in
MongoDB's official development repository [1], you can see why people are a
bit concerned ;)

[1] http://github.com/mongodb/mongo/commits

Therefore we switched over to PostgreSQL, using SQLAlchemy as a glue layer
between the database and the application. SQLAlchemy is a blessing because,
with its object-relational mapper, you do not actually have to write any SQL
(just take a peek at the 'grumpy_sync' utility).

Progress so far
===============

So far I have implemented a portage -> database sync utility that is used to
keep the database in sync with portage content. Although it seems to handle
most of the various portage quirks (like package moves via
'profiles/updates'), it might still run into issues in some corner cases, and
error recovery is minimal: it is currently designed to crash with a
RuntimeError when it detects anything out of the ordinary.

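To illustrate the crash-early behaviour, here is a minimal sketch of how
package-move entries could be parsed. This is not the actual grumpy_sync
code: the helper name parse_update_line and the returned tuple layout are
made up for this example, and it assumes the usual 'move <old> <new>' and
'slotmove <atom> <old slot> <new slot>' line format of Portage update files.

```python
def parse_update_line(line):
    """Parse one line of a Portage update file (hypothetical helper).

    Returns a tuple describing the update, or raises RuntimeError on
    anything unexpected instead of trying to recover.
    """
    fields = line.split()
    if not fields:
        raise RuntimeError("empty update line")
    if fields[0] == "move" and len(fields) == 3:
        # move <old category/package> <new category/package>
        return ("move", fields[1], fields[2])
    if fields[0] == "slotmove" and len(fields) == 4:
        # slotmove <atom> <old slot> <new slot>
        return ("slotmove", fields[1], fields[2], fields[3])
    raise RuntimeError("unexpected update line: %r" % line)
```

Failing loudly like this keeps half-understood data out of the database, at
the price of someone having to look at every new corner case by hand.
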
Of course, the data model is far from complete: there is no proper handling
of keywords, and I do not even store ebuild depends, rdepends and licenses in
the database, mainly because I currently don't have any use cases for these.

The syncer can be found in the 'utils' directory of the project tree.

Future plans
============

As the model and controller are ready, the next stop is to write a
rudimentary web app for browsing portage contents, so people can finally see
that I actually haven't slacked all this time.. :)

Also, during the portage import I noticed some really simple QA issues, like
invalid herd names in 'metadata.xml'. The plan is to write a 'herdcheck'
plugin and implement database storage for these QA issues. And as I cannot
let just anyone write to the database, I need to implement an API to let
plugins interact with the app.

Having an API means that I can start integrating with other QA tools out
there, mainly tinderbox.

And finally, testing. I currently have a simple doctesting and auditing (via
PyFlakes) framework in place, but general unit testing is still missing.

As you can see, I'm lagging a bit behind my proposed timeline: I still
haven't actually started looking into how to create the 30-day stabilisation
and upstream version checkers, but hopefully I can pick up speed because I
can now say that I have passed the biggest hurdle.. :)

And I have also dropped my 'secret agenda' of documenting my experience with
NoSQL databases as a series of articles written during this project...

Project info
============

The Git repository of Grumpy is available from [2].

[2] http://git.overlays.gentoo.org/gitweb/?p=proj/grumpy.git;a=summary

The project's semi-official IRC channel is #gentoo-grumpy on the Freenode
network. If you run into trouble when testing the project, just ping me with
a message.

PS. Bonus points for those who noticed that I dropped 'weekly' ;)
