Gentoo Archives: gentoo-soc

From: Joachim Bartosik <jbartosik@×××××.com>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] Re: Gentoo stats server/client,
Date: Fri, 27 Mar 2009 00:07:12
Message-Id: 53d3ab620903261707t309879f8u56a9c3d052836e5b@mail.gmail.com
1 Argh. I misspelled the address... This should have been delivered yesterday.
2
3 On Tue, Mar 24, 2009 at 06:00, Alec Warner <antarus@g.o> wrote:
4
5 > > To solve this problem I'd use a solution that is less comfortable for
6 > > users: a user would
7 > > have to create an account using an email (of course it wouldn't be
8 > > stored,
9 > > I'd store some one-way injective function of it*) and click an emailed
10 > > link.
11 > > There would be no need for a password - to confirm his[her] actions [s]he
12 > > would just click an emailed link.
13 >
14 > I'm a bit concerned that if the function is one-way, how you will know
15 > where to send these email links. Would the user have to input their
16 > email address when making changes, and you gather it from the POST/GET
17 > data?
18
19
20 Yes. I'd rather not store user emails so it will be impossible for curious
21 people to access them. I think it's one thing less to worry about.
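
A minimal sketch of what "store a one-way function of the address" could look like, assuming a keyed SHA-256 digest (HMAC). The names SERVER_KEY and email_fingerprint are hypothetical, and lowercasing the whole address is an assumption about normalization. A hash is not strictly injective, but collisions are unlikely enough that it can stand in for one here.

```python
import hashlib
import hmac

# Hypothetical secret; a real server would read it from its configuration.
SERVER_KEY = b"change-me"


def email_fingerprint(email: str, key: bytes = SERVER_KEY) -> str:
    """Return a keyed one-way digest of a normalized email address."""
    # Assumption: addresses are compared case-insensitively.
    normalized = email.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

Only the digest would be stored; when a user submits the address again, the server recomputes the digest and compares it with the stored one.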
22
23 After some thinking I believe my first idea (when users need to confirm
24 something they give us their email address, we send them an email, and they
25 click a link) has some advantages (addresses are not even temporarily stored,
26 users don't need to remember logins or passwords, so there is no need to
27 worry about password recovery) but might be annoying for users (especially
28 "premium users" who will have to do many actions at once, or users who have
29 to wait longer than 5 minutes to receive the authentication email(s)).
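
The emailed link can be checked without keeping any per-request state on the server. A rough sketch, assuming the link carries a timestamp and an HMAC signature over an account identifier (for example a digest of the address) and the action being confirmed. SECRET, MAX_AGE, make_token and check_token are hypothetical names, not part of any existing code.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical server secret
MAX_AGE = 3600         # assumed link lifetime in seconds


def _sign(account: str, action: str, ts: str) -> str:
    msg = f"{account}|{action}|{ts}".encode("utf-8")
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_token(account: str, action: str, now=None) -> str:
    """Build the token that would be embedded in the emailed link."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}.{_sign(account, action, ts)}"


def check_token(account: str, action: str, token: str, now=None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    try:
        ts, sig = token.split(".", 1)
        issued = int(ts)
    except ValueError:
        return False
    current = time.time() if now is None else now
    expected = _sign(account, action, ts)
    fresh = 0 <= current - issued <= MAX_AGE
    return fresh and hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

Because the token is bound to one account and one action, a link sent to confirm one change cannot be reused to confirm a different one.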
30
31 It would be more comfortable for users to use their email addresses as logins.
32 Again there is no need to store addresses (we get the address/password
33 combination on login, check that it is valid, and we no longer need it). If
34 users forget their passwords they give us their addresses and click some "I
35 forgot my password so please give me a new one" button; we send them a new
36 password, a link to a "set new password" page, or whatever password recovery
37 information, and again we do not need to store the address.
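
A sketch of the login check under these assumptions: accounts are keyed by a digest of the address, and passwords are stored as salted PBKDF2 hashes. The in-memory dict stands in for a real database, and every name below is hypothetical.

```python
import hashlib
import hmac
import os

SERVER_KEY = b"change-me"  # hypothetical server secret
ITERATIONS = 100_000       # assumed PBKDF2 work factor

# Stand-in for a database table: address digest -> (salt, password hash).
accounts = {}


def _fingerprint(email: str) -> str:
    normalized = email.strip().lower()
    return hmac.new(SERVER_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def _hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(email: str, password: str) -> None:
    salt = os.urandom(16)
    accounts[_fingerprint(email)] = (salt, _hash_password(password, salt))


def login(email: str, password: str) -> bool:
    """Check an address/password pair without ever storing the address."""
    record = accounts.get(_fingerprint(email))
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(stored, _hash_password(password, salt))
```

For password recovery the user supplies the address again, so the server can look up the account by digest and mail the recovery link to the address it was just given.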
38
39 It's not very important for my idea so I don't mind changing my mind.
40
41 > > Each user ( email) would have a hosts** limit ( probably set in server
42 > > configuration) 2 or 3 by default ( enough for average user, not enough to
43 > > easily spoil data). After some time of inactivity host/ account would be
44 > > removed.
45 >
46 > Just make the host limit configurable and we can debate defaults
47 > later. Certainly there should be a limit and there should be some way
48 > to request more machines; we don't want users to create lots of
49 > different accounts to contain their legitimate data. Retiring
50 > inactive accounts is a good idea, +1
51 >
52
53 Making the host limit configurable should be really easy (it's important to
54 remember to make it configurable, though ;).
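
Something like the following would do, assuming a server-wide default plus an optional per-account override for users who request more machines. ServerConfig and can_add_host are hypothetical names, and the default of 3 is just the upper end of the "2 or 3" suggested above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerConfig:
    # Assumed default; the real value would come from the server configuration.
    default_host_limit: int = 3


def can_add_host(current_hosts: int, config: ServerConfig,
                 account_limit: Optional[int] = None) -> bool:
    """Return True if the account may register one more host."""
    limit = config.default_host_limit if account_limit is None else account_limit
    return current_hosts < limit
```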
55
56
57 > FYI, I have a working python client for portage/pkgcore/paludis that
58 > collects this data and outputs some XML (that I was later going to
59 > POST to a RESTful interface..that I never wrote ;p)
60 >
61
62 I'd love to take a look at it - it could save some time.
63
64 > REST is HTTP, and you could do it with HTML, but it would be ugly ;p
65
66
67 Users shouldn't look at it (the client would fetch it, grab the important
68 message and show it to them if necessary).
69
70
71
72 > this stuff sounds pretty standard, and I think if any prospective
73 > students have developed webapps before it's probably not very time
74 > consuming. I would prefer python over PHP for maintainability within
75 > gentoo, but if you feel most comfortable in PHP I'm not going to make
76 > you use a specific language provided the code is readable.
77
78
79 I haven't tried python for writing web stuff, so I don't know if a very simple
80 thing can be done very simply (knowing python I'd guess so, but I know
81 that writing this using PHP would be very simple, so I wrote that). I'll try
82 it (could you recommend what I should check out? (
83 "Pylons<http://pylonshq.com/>(0.9.7 Released 2009-02-23) a lightweight
84 Web framework emphasizing
85 flexibility and rapid development." ?)). I'd prefer to use python for that
86 because I think mixing languages can cause unnecessary problems.
87
88 > > Data gathering:
89 > > It'd take data provided by user communication module, decompress it,
90 > > apply
91 > > deltas etc. to create all-the-information-available about current state
92 > > of
93 > > hosts.
94 >
95 > Here I'm thinking a reporting API that generates datasets. Someone
96 > else (or you) can later render the report with javascript or
97 > something.
98
99
100 As for the [RESTful] server part of it, I don't see a problem with that (maybe
101 time...). But I really hate JavaScript. I'll think about it.
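
For the data gathering step quoted above (decompress, then apply deltas), a rough sketch could look like this. The wire format is purely an assumption made for illustration: a zlib-compressed JSON object with "added" and "removed" package lists. The existing client mentioned in this thread outputs XML, so the real format may well differ.

```python
import json
import zlib


def apply_delta(state, payload: bytes) -> set:
    """Return the new package set after applying one compressed delta.

    Assumed format: zlib-compressed JSON with optional "added" and
    "removed" lists of package names.
    """
    delta = json.loads(zlib.decompress(payload).decode("utf-8"))
    packages = set(state)
    packages.difference_update(delta.get("removed", []))
    packages.update(delta.get("added", []))
    return packages
```

The result is the full current state of the host, which is what a reporting API would then aggregate into datasets.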
102
103
104 > > Cleaner:
105 > > Run from time to time ( by cron, frequency adjusted to needs). Remove
106 > > hosts
107 > > and users that do not send data ( to conserve space) etc.
108 >
109 > Probably a trivial add on feature provided we store last modification data.
110
111
112 A trivial add-on feature, but also an important one (if old host data stays in
113 the database forever it will eventually spoil the statistics).
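
With a last-modification timestamp stored per host, the cleaner could be little more than one DELETE run from cron. A sketch using sqlite3, assuming a hypothetical hosts table with a last_modified column holding Unix timestamps; the 90-day threshold is only a placeholder.

```python
import sqlite3


def remove_stale_hosts(conn: sqlite3.Connection, now: int,
                       max_age_days: int = 90) -> int:
    """Delete hosts that have not reported since the cutoff; return the count."""
    cutoff = now - max_age_days * 86400
    cur = conn.execute("DELETE FROM hosts WHERE last_modified < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```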
114
115
116 > > Archiver:
117 > > Run from time to time ( cron, as needed). Data gathering provides only
118 > > information about hosts *right now*. Archiver would generate statistics (
119 > > like package popularity ( % hosts that installed it)) and store them to
120 > > make
121 > > historical data available ( storing all host states history would be
122 > > extremely excessive).
123 >
124 > I'll call this a stretch goal (eg we think about it but delay
125 > implementation until the dataset gets fairly large).
126
127
128 I think it is important to store historical data but you're right that there
129 will be no need for it until there is enough data to make history
130 interesting.
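
For reference, the package popularity statistic mentioned above (% of hosts that installed a package) could be computed roughly like this. The input shape, a mapping from host id to its set of installed packages, is an assumption.

```python
from collections import Counter


def package_popularity(hosts: dict) -> dict:
    """Map each package to the percentage of hosts that have it installed."""
    total = len(hosts)
    if total == 0:
        return {}
    # set() guards against a package being listed twice for one host.
    counts = Counter(pkg for pkgs in hosts.values() for pkg in set(pkgs))
    return {pkg: 100.0 * n / total for pkg, n in counts.items()}
```

Storing one such snapshot per run keeps the history small compared with storing every host state.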
131
132 --
133 Joachim