Gentoo Archives: gentoo-dev

From: Vikraman <vikraman.choudhury@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Gentoo package statistics -- GSoC 2011
Date: Wed, 08 Jun 2011 18:03:06
Message-Id: 20110608180142.GA7520@felicia
In Reply to: Re: [gentoo-dev] Gentoo package statistics -- GSoC 2011 by "Paweł Hajdan
On Wed, Jun 08, 2011 at 05:19:33PM +0200, "Paweł Hajdan, Jr." wrote:
> On 6/8/11 4:36 PM, Vikraman wrote:
> > I'm working on the `Package statistics` project this year. Till now, I
> > have managed to write a client and server[0] to collect the following
> > information from hosts:
>
> Excellent, good luck with the idea! I think that better information
> about how Gentoo is actually used will greatly help improve it.
>

Well, that information cannot be collected automatically, can it?

> > Is there a need to collect files installed by a package? Doesn't PFL[1]
> > already provide that?
>
> Well, PFL is not an official Gentoo project. It might be useful, but I
> wouldn't say it's a priority.
>
> > Please provide some feedback on what other data should be collected, etc.
>
> In my opinion it's *not* about collecting as much data as possible. I
> think it's most important to get the core functionality working really
> well, and convincing as large a percentage of users as possible to enable
> reporting the statistics (to make the results - hopefully - accurately
> represent the user base). Please note that in some cases it may mean
> collecting _less_ data, or thinking more about the privacy of the users.
>
> For me, as a developer, even a list of packages sorted by popularity
> (aka Debian/Ubuntu popcon) would be very useful.
>
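
A minimal sketch of such a popularity list, assuming each host reports a plain list of installed package names. The host names, the shape of the report data, and the function name rank_packages are illustrative assumptions, not part of the gentoostats code.

```python
from collections import Counter


def rank_packages(host_reports):
    """Return (package, host_count) pairs, most widely installed first.

    host_reports is an assumed mapping of host id -> list of package names.
    """
    counts = Counter()
    for packages in host_reports.values():
        # Use a set so a host is counted at most once per package.
        counts.update(set(packages))
    # Sort by descending host count, then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical reports from three hosts.
reports = {
    "host-a": ["app-editors/vim", "dev-lang/python"],
    "host-b": ["dev-lang/python", "www-client/firefox"],
    "host-c": ["dev-lang/python", "app-editors/vim"],
}
ranking = rank_packages(reports)
```

Counting hosts rather than raw occurrences keeps one host with duplicate entries from skewing the ranking.
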
> Ah, and maybe files in /etc/portage: package.keywords and so on. It
> could be useful to see what people are masking/unmasking, that may be an
> indication of stale stabilizations or brokenness hitting the tree.
> Anyway, I'd call it an enhancement.
>
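
As a rough illustration of what reading package.keywords could involve, the sketch below parses the text of such a file into a mapping of atom to keywords. It assumes the usual "atom [keyword ...]" line layout with "#" comments, and it takes the file content as a string so it does not depend on whether package.keywords is a file or a directory on a given host. The function name and sample content are made up for the example.

```python
def parse_package_keywords(text):
    """Map each package atom to the list of keywords given for it."""
    entries = {}
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        atom, *keywords = line.split()
        # An atom may appear on several lines; collect all its keywords.
        entries.setdefault(atom, []).extend(keywords)
    return entries


# Hypothetical file content.
sample = """\
# testing versions wanted on this host
www-client/firefox ~amd64
dev-lang/python   # no explicit keyword
app-editors/vim ~amd64 ~x86
"""
parsed = parse_package_keywords(sample)
```
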
> > Also, I'm starting work on the webUI, and would like some
> > recommendations for stats pages, such as:
> >
> > * Packages installed sorted by users
>
> Cool!
>
> > * Top arches, keywords, profiles
>
> And percentage of ~arch vs arch users?
>
> > * Most enabled, disabled useflags per package/globally
>
> Also great, especially the per-package variant. It'd also be useful to
> have per-profile data, to better tune the profile defaults.
>
> > [0]
> > http://git.overlays.gentoo.org/gitweb/?p=proj/gentoostats.git;a=commit;h=1b9697a090515d2a373e83b1094d6e08ec405c02
>
> I took a quick look at the code. Some random comments:
>
> - it uses portage Python API a lot. But it's not stable, or at least not
> guaranteed to be stable. Have you considered using helpers like portageq
> (or eventually enhancing those helpers)?
>
> - make the licensing super-clear (a LICENSE file, possibly some header
> in every source file, and so on)
>
> - how about submitting the data over HTTPS and not HTTP to better help
> privacy?

Fair points, thanks!
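
For the portageq suggestion, a small sketch of calling the helper from Python instead of importing the portage API. It assumes a Python 3.7+ interpreter and uses the `portageq envvar` subcommand; the wrapper name and the injectable `run` parameter are illustrative choices so the function can be exercised without Portage installed.

```python
import subprocess


def portageq_envvar(name, run=subprocess.run):
    """Return the value of a Portage variable as reported by portageq."""
    result = run(
        ["portageq", "envvar", name],
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return result.stdout.strip()


# Example use on a Gentoo host (not executed here):
#     arch = portageq_envvar("ARCH")
```

Going through the command-line helper trades some speed for a narrower dependency on Portage internals.
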

>
> - don't leave exception handling as a TODO; it should be a part of your
> design, not an afterthought
>
> - instead of or in addition to the setup.txt file, how about just
> writing the real setup.py file for distutils?
>

Yes, these are part of my sub-goals for next week.
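
A sketch of what designed-in exception handling could look like for the submission step, combined with the HTTPS suggestion from earlier in the thread. The URL, the JSON payload format, and the function names are hypothetical; the actual client may structure this differently.

```python
import json
import urllib.error
import urllib.request

# Hypothetical endpoint, used only for illustration.
SUBMIT_URL = "https://stats.example.org/submit"


def build_request(payload, url=SUBMIT_URL):
    """Build a JSON POST request, refusing plain-HTTP URLs."""
    if not url.startswith("https://"):
        raise ValueError("refusing to submit statistics over a non-HTTPS URL")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(payload, url=SUBMIT_URL, opener=urllib.request.urlopen):
    """Send the payload; return (ok, message) instead of raising."""
    try:
        request = build_request(payload, url)
        with opener(request, timeout=30) as response:
            return True, "server answered with status %d" % response.status
    except ValueError as err:
        return False, "bad configuration: %s" % err
    except urllib.error.HTTPError as err:
        # The server was reached but rejected the submission.
        return False, "server error: HTTP %d" % err.code
    except urllib.error.URLError as err:
        # DNS failure, refused connection, TLS problem, and so on.
        return False, "network error: %s" % err.reason
```

Returning a status pair lets the caller decide whether to retry, log, or stay silent, rather than letting a traceback reach the user.
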

--
Vikraman

Replies

Subject Author
Re: [gentoo-dev] Gentoo package statistics -- GSoC 2011 Hans de Graaff <graaff@g.o>