Gentoo Archives: gentoo-dev

From: Sebastian Pipping <webmaster@××××××××.org>
To: PackageKit users and developers list <packagekit@×××××××××××××××××.org>
Cc: gentoo-dev@l.g.o
Subject: [gentoo-dev] Inviting you to project "PackageMap"
Date: Fri, 12 Jun 2009 07:42:59
Message-Id: 4A3206DA.3090907@hartwork.org
1 Hello!
2
3
4 Quick (re-)introduction: My task for Gentoo/Google Summer of Code 2009
5 is to give Gentoo a Debian popcon equivalent, a tool to collect
6 statistics on "what package is installed how often". To achieve this
7 goal I'm extending Smolt (a tool currently doing similar things with
8 hardware information) by fine-tunable software stats gathering.
9
10
11 The plan we have for Smolt is to make it cross-distro, not just fit
12 Gentoo or Fedora. One point where the consequences and benefits of such
13 an approach can be seen clearly is with
14
15 counting packages from different distros into the same buckets.
16
17 What do I mean by that? Debian's Git counts the same as Gentoo's Git and
18 Fedora's Git, and so on down the list. With packages counted across distros
19 we can suddenly answer questions that we currently cannot answer, among them:
20
21 - What globally popular packages are missing in distro X?
22 Let's say we don't have a package for product P. Do other distros
23 have one? They do, maybe we need one, too? They don't, maybe P is
24 not that important then?
25
26 - How many Linux users are approximately using program X in total?
27 Not just on Ubuntu or Arch - all across Linux, BSD, Solaris!
28
29 - Does distro X really have 10 times the packages of Y, or does it
30 just split them differently?
31
32 To count into the same bucket we use global identifiers for the
33 "products" that fall out of a package. Gentoo package "dev-util/git"
34 can produce product "cpe://a:git:git", Debian's "git-core" can, too.
35 The string above is a CPE URI [1], a concept close to package naming
36 in Java. This "intermediate language" allows us to relate package names
37 from distro X with those of distro Y and answer various questions from
38 that data.
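
To make the bucket idea concrete, here is a minimal Python sketch. It is not
PackageMap's actual code or data format: the table PACKAGE_TO_PRODUCT, the
functions count_products and missing_in, and the Vim entry are all
hypothetical names made up for illustration. Only the two Git entries come
from the text above.

```python
from collections import Counter

# Hypothetical mapping table: (distro, package name) -> product identifier.
# Only the two Git entries are taken from the message; the Vim entry is an
# invented example so there is more than one bucket.
PACKAGE_TO_PRODUCT = {
    ("gentoo", "dev-util/git"): "cpe://a:git:git",
    ("debian", "git-core"): "cpe://a:git:git",
    ("gentoo", "app-editors/vim"): "cpe://a:vim:vim",
}


def count_products(reports):
    """Count installations per product, across all distros.

    `reports` is an iterable of (distro, package) pairs, one pair per
    reported installed package. Packages without a mapping are skipped.
    """
    counts = Counter()
    for distro, package in reports:
        product = PACKAGE_TO_PRODUCT.get((distro, package))
        if product is not None:
            counts[product] += 1
    return counts


def missing_in(distro, counts):
    """Return products that were counted somewhere but that have no
    mapped package in `distro`."""
    available = {
        product
        for (d, _package), product in PACKAGE_TO_PRODUCT.items()
        if d == distro
    }
    return sorted(set(counts) - available)


reports = [
    ("gentoo", "dev-util/git"),
    ("debian", "git-core"),
    ("gentoo", "app-editors/vim"),
]
counts = count_products(reports)
print(counts["cpe://a:git:git"])     # both Git installs land in one bucket
print(missing_in("debian", counts))  # products Debian has no mapping for
```

With such a table, the question "what globally popular packages are missing
in distro X?" becomes a set difference over product identifiers instead of a
comparison of distro-specific package names.
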
39
40 To do such mapping we need code (or a "service") that does the mapping
41 for us and a base of collected data that the service can operate on.
42 Together, these two make up project "PackageMap".
43
44 I have started populating the database with packages (currently 312
45 in number) made from information extracted from the Gentoo tree
46 and the National Vulnerability Database. The latter holds many CPEs.
47 Let me state clearly that packagemap is not about Gentoo in particular.
48 Sure, the initial data has lots of Gentoo in it but the whole point of
49 the project is to get information and people from different distros
50 together.
51
52 To see what these 312 package maps look like at the moment, it is best
53 to click through the database folder yourself:
54 http://git.goodpoint.de/?p=packagemap.git;a=tree;f=database
55
56 Also, there are a RELAX NG schema and a DTD for validation, more
57 documentation than I usually write, and a few scripts:
58 http://git.goodpoint.de/?p=packagemap.git;a=tree
59
60 By now I hope you have gained interest in what this can become.
61 Your active participation is highly appreciated.
62 A few minutes from everyone can make a huge difference here.
63 If you want write access to the repo - mail me: sebastian@×××××××.org.
64
65 Please have a look at the Git repository linked above and ask questions.
66 I propose to keep the related Gentoo stuff on gentoo-dev and everything
67 else on the packagekit list. I hope that works out well.
68
69 Thanks for reading up to this point.
70
71
72
73 Sebastian
74
75
76
77 PS: I'm aware "hartwork.org" might not make a good long-term location for
78 DTDs, XML namespaces and such for a cross-distro project. Any ideas
79 where to put them best?
80
81 [1] http://cpe.mitre.org/

Replies

Subject Author
Re: [gentoo-dev] Inviting you to project "PackageMap" "Petteri Räty" <betelgeuse@g.o>