This is progress report #3 for Project Grumpy.

Since report #2 there has been a big change of focus in development: we
decided to drop our beloved (and also greatly hated) NoSQL approach, MongoDB,
and go forward with a regular RDBMS instead, which in our case is good old
PostgreSQL.

Although there were some compelling arguments for MongoDB (ease of use being
my favourite), the biggest nail in its coffin was the lack of "support" for it
from Gentoo's infra team. For them it was just another application they would
have to take care of, and around the interwebs there are lots of 'MongoDB ate
my data' reports on how error-prone MongoDB actually is (although the data
volumes in most of these cases were so high that I cannot really imagine
Grumpy running into these problems). Still, I can really understand their
concerns. Besides, if you take a look at the list of commits in MongoDB's
official development repository [1], you can see why people are a bit
concerned ;)

[1] http://github.com/mongodb/mongo/commits

Therefore we switched over to PostgreSQL, using SQLAlchemy as a glue layer
between the database and the application. SQLAlchemy is a blessing because,
thanks to its object-relational mapper, you do not actually have to write any
SQL (just take a peek at the 'grumpy_sync' utility).

Progress so far
===============

So far I have implemented a portage -> database sync utility that keeps the
database in sync with portage content. Although it seems to handle most of the
various portage quirks (like package moves via 'profiles/updates'), it might
still run into issues in some corner cases, and error recovery is minimal: it
is currently designed to crash with a RuntimeError when it detects something
out of the ordinary.

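The crash-early behaviour described above can be sketched roughly as follows.
This is only an illustration, not the actual 'grumpy_sync' code: it assumes
that a package move is recorded as a line of the form
'move <old-cat/pkg> <new-cat/pkg>', and the names parse_move and apply_moves
are hypothetical.

```python
# Hypothetical sketch: not the actual 'grumpy_sync' code.
# Assumes each package move is one line: "move <old-cat/pkg> <new-cat/pkg>".


def parse_move(line):
    """Return (old, new) for a move entry, or crash with RuntimeError."""
    fields = line.split()
    if len(fields) != 3 or fields[0] != "move":
        raise RuntimeError("unexpected update entry: %r" % (line,))
    old, new = fields[1], fields[2]
    for name in (old, new):
        # A package name must look like "category/package".
        category, sep, package = name.partition("/")
        if not (category and sep and package) or "/" in package:
            raise RuntimeError("malformed package name: %r" % (name,))
    return old, new


def apply_moves(packages, lines):
    """Rename keys of the 'packages' dict according to the move entries."""
    for line in lines:
        old, new = parse_move(line)
        if old in packages:
            packages[new] = packages.pop(old)
    return packages


if __name__ == "__main__":
    db = {"dev-util/foo": ["1.0", "1.1"]}
    print(apply_moves(db, ["move dev-util/foo dev-vcs/foo"]))
```

Anything that does not match the expected shape raises RuntimeError right
away instead of being skipped, so bad input shows up on the first sync run
rather than as silently wrong data in the database.
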
Of course, the data model is far from complete: there is no proper handling
of keywords, and I do not even store ebuild depends, rdepends and licenses in
the database, mainly because I currently don't have any use cases for these.

The syncer can be found under the 'utils' directory in the project directory.

Future plans
============

As the model and controller are ready, the next stop is to write a
rudimentary web app for browsing portage contents, so people can finally see
that I haven't actually been slacking all this time.. :)

Also, during the portage import I noticed some really simple QA issues like
invalid herd names in 'metadata.xml'. The plan is to write a 'herdcheck'
plugin and implement database storage for these QA issues. And as I cannot let
just anyone write to the database, I need to implement an API to let plugins
interact with the app.

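As a rough idea of what the core of such a 'herdcheck' could look like, here
is a minimal sketch. It assumes that herd names appear as <herd> elements in
'metadata.xml' and that the set of valid herd names is supplied by the caller;
the function name find_invalid_herds and the sample data are hypothetical, and
the plugin API itself is not shown because it does not exist yet.

```python
# Hypothetical sketch of a herd name check; not part of Grumpy itself.
import xml.etree.ElementTree as ET


def find_invalid_herds(metadata_xml, known_herds):
    """Return the herd names in metadata_xml that are not in known_herds."""
    root = ET.fromstring(metadata_xml)
    # Assumption: every herd is listed as the text of a <herd> element.
    herds = [(el.text or "").strip() for el in root.iter("herd")]
    return [name for name in herds if name not in known_herds]


# Made-up sample input for illustration only.
SAMPLE = """<pkgmetadata>
  <herd>python</herd>
  <herd>no-such-herd</herd>
</pkgmetadata>"""

if __name__ == "__main__":
    print(find_invalid_herds(SAMPLE, {"python", "kde"}))
```

The function only reports the offending names; storing them as QA issues
would be the job of the API mentioned above.
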
Having an API means that I can start integrating with other QA tools out
there, mainly tinderbox.

And finally, testing. I currently have a simple doctesting and auditing (via
PyFlakes) framework in place, but general unit testing is still missing.

As you can see, I'm lagging a bit behind my proposed timeline: I still
haven't actually started looking at how to create the 30-day stabilisation
and upstream version checkers, but hopefully I can pick up speed, because I
can now say that I have passed the biggest hurdle.. :)

And I have also dropped my 'secret agenda' of documenting my experience with
NoSQL databases as a series of articles written during this project...

Project info
============

The Grumpy Git repository is available from [2].

[2] http://git.overlays.gentoo.org/gitweb/?p=proj/grumpy.git;a=summary

The project's semi-official IRC channel is #gentoo-grumpy on the Freenode
network. If you run into trouble when testing this project, just ping me with
a message.

PS. Bonus points for those who noticed that I dropped 'weekly' ;)