Gentoo Archives: gentoo-soc

From: "Antanas Uršulis" <antanas.ursulis@×××××.com>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] GSoC 2013: Log collector
Date: Tue, 30 Apr 2013 10:13:59
Message-Id: CALQr0eq7_dQGi9ccUBiCm4GHAGym3iKAYMuaVtRRes1C+q+2OA@mail.gmail.com
In Reply to: Re: [gentoo-soc] GSoC 2013: Log collector by "Diego Elio Pettenò"
1 Thanks for the input, I'll write up a draft application and mail it
2 here later on. In the meantime, I haven't done directly related work,
3 but I could point to my bachelor's project [1]. It's very much a work
4 in progress, but the work I did in Cambridge last summer is not
5 public yet (really tiny bits of it are here [2]).
6
7 [1] https://github.com/aursulis/ciel/tree/shm_blockstore
8 [2] https://github.com/awm22/NetFPGA-P33/pull/1
9
10 On Mon, Apr 29, 2013 at 8:28 AM, Diego Elio Pettenò
11 <flameeyes@×××××××××.eu> wrote:
12 > Hi Antanas,
13 >
14 > On 29/04/2013 02:48, Antanas Uršulis wrote:
15 >> I've tried to assess what this project would involve, maybe this could
16 >> be a starting point for the proposal. A lot of this describes how the
17 >> current solution works, so changes might be necessary:
18 >
19 > Most likely, yes.
20 >
21 >> - conceptually the system should have 3 components: a log
22 >> collector&analyser, a storage backend and a frontend
23 >
24 > Correct.
25 >
26 >> - it would be integrated with portage:
27 >> --- portage would implement a client which can submit logs to the
28 >> collector, possibly providing information why the package failed
29 >> --- this connection between portage and the collector should rely on
30 >> as little as possible (because any packages providing that
31 >> functionality might break)
32 >
33 > Also correct.
34 >
35 >> --- it should support IPv6 because that's what is used between the
36 >> container on the tinderbox and the box itself (here's a question
37 >> though: any technical reason why IPv6 was used? I admit I didn't look
38 >> into this too deeply)
39 >
40 > Yeah, the technical reason is actually twofold:
41 >
42 > - the host for the tinderboxes only has one IPv4 address, so it was
43 > either NAT or IPv6; given that the tinderbox runs with isolated networking
44 > through a proxy to stop packages from using the network at build time, NAT was not
45 > a great idea; using IPv6 means that I can still jump on the hosts either
46 > from another IPv6-enabled system or, like in my previous and current
47 > office, straight from my IPv6-enabled workstation;
48 > - by using IPv6, the name of the tinderbox is found simply by doing a
49 > reverse-lookup of the address, as all the tinderboxes have proper records.
50 >
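The reverse-lookup idea above could be sketched roughly as follows. This is only an illustration, assuming the collector sees the peer's IPv6 address and that PTR records exist for it; the function name `tinderbox_name` and the injectable `resolver` parameter are made up for this example and are not part of any existing tinderbox code.

```python
import socket


def tinderbox_name(peer_addr, resolver=socket.gethostbyaddr):
    """Return the hostname for peer_addr via a reverse (PTR) lookup.

    Falls back to the literal address when no record is found, so a
    log can still be filed under something. `resolver` is a
    hypothetical hook that makes the function testable offline.
    """
    try:
        hostname, _aliases, _addresses = resolver(peer_addr)
    except (socket.herror, socket.gaierror):
        return peer_addr
    return hostname
```

A collector would call this with the address returned by `accept()`; the fallback keeps submissions from being dropped if a record is missing.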
51 >> - the collector & analyser:
52 >> --- receives logs over some protocol
53 >> --- should be able to group logs (receive several log files for a
54 >> failing package and keep them together) (this, depending on the
55 >> implementation, might be part of the portage integration)
56 >> --- matches each line against a regexp, we can look into something
57 >> more extensible if required
58 >> --- organises the files by hostname and submits them to the storage backend
59 >
60 > Also correct.
61 >
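A minimal sketch of the collector steps listed above (match each line against a regexp, keep the files of one failing package together, organise by hostname). The pattern and the names `FAILURE_PATTERN`, `find_matches` and `group_by_host` are assumptions for illustration only; the real analyser's expressions are not given in this thread.

```python
import re
from collections import defaultdict

# Hypothetical pattern; the actual expressions used by the analyser
# are not specified in this discussion.
FAILURE_PATTERN = re.compile(r"\* ERROR:|QA Notice|error:")


def find_matches(log_text, pattern=FAILURE_PATTERN):
    """Return (line_number, line) pairs for every line the regexp matches."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if pattern.search(line)
    ]


def group_by_host(submissions):
    """Group submitted logs by (hostname, package).

    `submissions` is an iterable of (hostname, package, filename, log_text)
    tuples. All files for one failing package on one host end up together,
    each mapped to its matching lines.
    """
    groups = defaultdict(dict)
    for hostname, package, filename, log_text in submissions:
        groups[(hostname, package)][filename] = find_matches(log_text)
    return dict(groups)
```

Keeping the pattern as a parameter leaves room for "something more extensible" later, for example one pattern set per failure class.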
62 >> - the storage backend:
63 >> --- I could start with Amazon's AWS and then move to something
64 >> standalone (how much data is there to store, actually? 1/10/100 GB?
65 >> and how large can a single log file become?)
66 >
67 > I've seen log files getting over 1GB (yes I know it's crazy) but that's
68 > relatively rare. I don't have a quick assessment of the total storage
69 > over the past year unfortunately.
70 >
71 >> --- keeps the logs and also a simple database that would hold
72 >> information about the log groups (package, date, links to log files,
73 >> etc.)
74 >
75 > Correct.
76 >
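The "simple database" for log groups might look something like the sketch below. It assumes SQLite purely for illustration; the table and column names (`log_group`, `log_file`, `received_at`, `url`) and the helper functions are hypothetical, and the thread does not settle on a storage engine.

```python
import sqlite3

# Hypothetical schema: one row per log group, one row per stored log file.
SCHEMA = """
CREATE TABLE log_group (
    id          INTEGER PRIMARY KEY,
    hostname    TEXT NOT NULL,
    package     TEXT NOT NULL,
    received_at TEXT NOT NULL
);
CREATE TABLE log_file (
    id       INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES log_group(id),
    url      TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_group(conn, hostname, package, received_at, urls):
    """Record one log group and the links to its stored log files."""
    cursor = conn.execute(
        "INSERT INTO log_group (hostname, package, received_at) VALUES (?, ?, ?)",
        (hostname, package, received_at),
    )
    group_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO log_file (group_id, url) VALUES (?, ?)",
        [(group_id, url) for url in urls],
    )
    conn.commit()
    return group_id
```

The log files themselves would live in the storage backend; the database only keeps the metadata and links, which is what the frontend needs to list packages.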
77 >> - the frontend:
78 >> --- displays a list of packages that have matches
79 >> --- should be integrated with bugzilla; one can see open bugs for a
80 >> selected package, and also file a new bug
81 >> --- should be password protected
82 >
83 > Also correct. Do note that one of the things that the frontend has to do
84 > is to be able to _attach_ the data rather than just link to it (which is
85 > what I've been doing myself up to now).
86 >
87 >> Comments/additions greatly appreciated. Now, regarding the Gentoo
88 >> application template, I have actually a long time ago submitted a
89 >> one-line workaround patch[1] for openoffice, but that probably doesn't
90 >> qualify. Could you point me (general direction is ok) towards
91 >> something I could fix for my application?
92 >
93 > I'm not sure if it makes much sense to sweat fixing something, given
94 > that we're talking about writing a series of webapps and systems. It
95 > might be better for you to point at any kind of similar work you have ever done.
96 >
97 > --
98 > Diego Elio Pettenò — Flameeyes
99 > flameeyes@×××××××××.eu — http://blog.flameeyes.eu/
100 >
