On Wed, Jun 08, 2011 at 05:19:33PM +0200, "Paweł Hajdan, Jr." wrote:
> On 6/8/11 4:36 PM, Vikraman wrote:
> > I'm working on the `Package statistics` project this year. Till now, I
> > have managed to write a client and server[0] to collect the following
> > information from hosts:
>
> Excellent, good luck with the idea! I think that better information
> about how Gentoo is actually used will greatly help improve it.
>

Well, that information cannot be collected automatically, can it?

> > Is there a need to collect files installed by a package? Doesn't PFL[1]
> > already provide that?
>
> Well, PFL is not an official Gentoo project. It might be useful, but I
> wouldn't say it's a priority.
>
> > Please provide some feedback on what other data should be collected, etc.
>
> In my opinion it's *not* about collecting as much data as possible. I
> think it's most important to get the core functionality working really
> well, and convincing as large a percentage of users as possible to enable
> reporting the statistics (to make the results - hopefully - accurately
> represent the user base). Please note that in some cases it may mean
> collecting _less_ data, or thinking more about the privacy of the users.
>
> For me, as a developer, even a list of packages sorted by popularity
> (aka Debian/Ubuntu popcon) would be very useful.
>
> Ah, and maybe files in /etc/portage: package.keywords and so on. It
> could be useful to see what people are masking/unmasking, that may be an
> indication of stale stabilizations or brokenness hitting the tree.
> Anyway, I'd call it an enhancement.
>
> > Also, I'm starting work on the webUI, and would like some
> > recommendations for stats pages, such as:
> >
> > * Packages installed sorted by users
>
> Cool!
>
> > * Top arches, keywords, profiles
>
> And percentage of ~arch vs arch users?
>
> > * Most enabled, disabled useflags per package/globally
>
> Also great, especially the per-package variant. It'd also be useful to
> have per-profile data, to better tune the profile defaults.
>
> > [0]
> > http://git.overlays.gentoo.org/gitweb/?p=proj/gentoostats.git;a=commit;h=1b9697a090515d2a373e83b1094d6e08ec405c02
>
> I took a quick look at the code. Some random comments:
>
> - it uses the portage Python API a lot. But it's not stable, or at least not
> guaranteed to be stable. Have you considered using helpers like portageq
> (or eventually enhancing those helpers)?
>
> - make the licensing super-clear (a LICENSE file, possibly some header
> in every source file, and so on)
>
> - how about submitting the data over HTTPS and not HTTP to better help
> privacy?

Fair points, thanks!
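
A minimal sketch of the portageq suggestion above: shelling out to the
helper instead of importing the portage Python API. This is not taken from
the gentoostats code. The function names, the chosen variables and the
injectable `run` parameter are hypothetical, and the only assumption about
portageq is that `portageq envvar <VAR>` prints the variable's value on
stdout.

```python
import subprocess


def portageq_envvar(name, run=subprocess.run):
    """Return the value of a portage variable via `portageq envvar`.

    `run` is injectable (an assumption of this sketch) so the function
    can be exercised on a machine without portage installed.
    """
    result = run(
        ["portageq", "envvar", name],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.strip()


def collect_basic_info(run=subprocess.run):
    """Gather a few host settings as a dict (keys chosen for illustration)."""
    info = {}
    for var in ("ARCH", "ACCEPT_KEYWORDS", "USE"):
        value = portageq_envvar(var, run=run)
        # ARCH is a single word; the other two are whitespace-separated lists.
        info[var] = value if var == "ARCH" else value.split()
    return info
```

Going through the command-line helper keeps the client decoupled from
portage internals, at the cost of one process spawn per query.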
|
>
> - don't leave exception handling as a TODO; it should be a part of your
> design, not an afterthought
>
> - instead of or in addition to the setup.txt file, how about just
> writing the real setup.py file for distutils?
>

Yes, these are part of my sub-goals for next week.

--
Vikraman