Gentoo Archives: gentoo-soc

From: "Antanas Uršulis" <antanas.ursulis@×××××.com>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] GSoC 2013: Log collector
Date: Mon, 29 Apr 2013 01:48:48
Message-Id: CALQr0erRWGqY2iwPENuRiZ0_TAX7ViP17-vdE9oQtxn5x=50BA@mail.gmail.com
In Reply to: Re: [gentoo-soc] GSoC 2013: Log collector by "Diego Elio Pettenò"
1 Hi Diego,
2
3 Thanks for the quick reply and sorry it took me this long to get back
4 to you - a couple of other responsibilities have been piling up lately.
5
6 I've tried to assess what this project would involve; maybe this could
7 be a starting point for the proposal. A lot of this describes how the
8 current solution works, so changes might be necessary:
9
10 - conceptually the system should have 3 components: a log
11 collector & analyser, a storage backend and a frontend
12
13 - it would be integrated with portage:
14 --- portage would implement a client which can submit logs to the
15 collector, possibly providing information about why the package failed
16 --- this connection between portage and the collector should rely on
17 as little as possible (because any packages providing that
18 functionality might break)
19 --- it should support IPv6 because that's what is used between the
20 container on the tinderbox and the box itself (here's a question
21 though: any technical reason why IPv6 was used? I admit I didn't look
22 into this too deeply)
23
24 - the collector & analyser:
25 --- receives logs over some protocol
26 --- should be able to group logs (receive several log files for a
27 failing package and keep them together) (this, depending on the
28 implementation, might be part of the portage integration)
29 --- matches each line against a regexp; we can look into something
30 more extensible if required
31 --- organises the files by hostname and submits them to the storage backend
32
33 - the storage backend:
34 --- I could start with Amazon's AWS and then move to something
35 standalone (how much data is there to store, actually? 1/10/100 GB?
36 and how large can a single log file become?)
37 --- keeps the logs and also a simple database that would hold
38 information about the log groups (package, date, links to log files,
39 etc.)
40
41 - the frontend:
42 --- displays a list of packages that have matches
43 --- should be integrated with bugzilla; one can see open bugs for a
44 selected package, and also file a new bug
45 --- should be password protected
46
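To make the collector & analyser part a bit more concrete, here is a minimal sketch of the "match each line against a regexp" and "organise by hostname" steps. It is only an illustration under assumptions: the patterns, the function names (matching_lines, group_by_host) and the (hostname, package, log text) tuple layout are all hypothetical, not taken from the current tinderbox setup.

```python
import re
from collections import defaultdict

# Hypothetical patterns for illustration only; the rules used by the
# existing tinderbox analyser are not given in this thread.
PATTERNS = [
    re.compile(r"\berror:", re.IGNORECASE),
    re.compile(r"^ \* ERROR: "),
]


def matching_lines(log_text):
    """Return (line_number, line) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in PATTERNS):
            hits.append((number, line))
    return hits


def group_by_host(submissions):
    """Group analysed logs by hostname, then by package.

    `submissions` is an iterable of (hostname, package, log_text) tuples,
    an assumed shape for what a portage client might send. Several logs
    for the same failing package end up in the same list.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for hostname, package, log_text in submissions:
        groups[hostname][package].append(matching_lines(log_text))
    return groups
```

Something more extensible (per-pattern severity, pluggable analysers) could replace the flat PATTERNS list later without changing the grouping step.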
47 Comments/additions greatly appreciated. Now, regarding the Gentoo
48 application template, I did submit a one-line workaround patch[1] for
49 openoffice a long time ago, but that probably doesn't qualify. Could
50 you point me (general direction is ok) towards
51 something I could fix for my application?
52
53 Cheers,
54 Antanas
55
56 [1] https://bugs.gentoo.org/show_bug.cgi?id=306211
57
58 On Mon, Apr 22, 2013 at 4:55 PM, Diego Elio Pettenò
59 <flameeyes@×××××××××.eu> wrote:
60 > Hi Antanas,
61 >
62 > First of all, thanks for showing your interest in this project. I'll be
63 > the assigned mentor for the project if you're going to be working on
64 > it.
65 >
66 > To answer your concerns (in quite an unsorted fashion, I apologize), I
67 > would start with saying that what we're looking for with this project
68 > is to have not just an average log collector, but one that is
69 > integrated explicitly with components such as Portage and Bugzilla;
70 > the target of this project, for me, is to be able to use it for the
71 > tinderboxes I've been running (which are currently on-hold because I'm
72 > too busy with training at a new job).
73 >
74 > While I'm not discounting *any* particular technology right away,
75 > we're looking for something that works quickly and relatively
76 > easily... sometimes, while technology that could work is already out
77 > there, it doesn't suit the workflow as it is right now. At the same
78 > time, as far as I'm concerned you can start by keeping the use of
79 > Amazon's AWS services, if it helps you. The end goal is to avoid using
80 > it, but there is no hurry with that, as the costs associated with it
81 > are very marginal right now. The only important part here is that the
82 > data you store on AWS is not persistent when using the final
83 > collector.
84 >
85 > I don't have any problem with "wasting" the 10 days — but this does
86 > not give any "get out of trouble free" card for the evaluation, so
87 > keep it in mind, you might have to do some harder work in the first
88 > few days. But that does not say much; some people actually perform
89 > better under pressure, so it's up to you if you want to go for it or
90 > not :)
91 >
92 > If there is anything else you want to know, feel free to ask on the
93 > mailing list and I'll answer ASAP.
94 >
95 > Thanks,
96 > Diego
97 >
98 > Diego Elio Pettenò — Flameeyes
99 > flameeyes@×××××××××.eu — http://blog.flameeyes.eu/
100 >
101 >
102 > On Mon, Apr 22, 2013 at 2:19 AM, Antanas Uršulis
103 > <antanas.ursulis@×××××.com> wrote:
104 >> Hello,
105 >>
106 >> I'm a final year Computer Science undergraduate at the University of
107 >> Cambridge (UK). My main languages are C and C++, but my bachelor's project
108 >> involves working with and modifying the CIEL task-parallel execution system
109 >> written in Python, and I have also been using Python for my scripting needs.
110 >> I have been a Gentoo user since 2009 (switched from *buntu), if I can
111 >> remember correctly.
112 >>
113 >> I am interested in the "Log collector/analyzer for tinderbox" GSoC project,
114 >> not only because I would like to help the developers of my favourite Linux
115 >> distribution, but I could also see myself using an extension of the tool to
116 >> oversee my own systems (for example, computers at the Lithuanian National
117 >> Olympiad in Informatics, where I am a member of the technical staff).
118 >>
119 >> Of course, there are some things I would like to discuss. Firstly, are there
120 >> already any thoughts on a replacement for Amazon's SimpleDB storage? I would
121 >> see that as a major part of the project. Second, if I were to think of this
122 >> as a more general tool, with the possibility to support various log
123 >> providers (portage, apache, etc.) and analyzers in the future, what would
124 >> your opinion be on that? I think it would be possible to implement this
125 >> without any external dependencies, so that Portage remains light. Third,
126 >> after a quick search, I came upon the question: is there a reason not to
127 >> hook into systems such as Apache Flume + Elastic Search, or logstash, or
128 >> Scribe?
129 >>
130 >> Lastly, I would like to know up-front whether you would be OK with me only
131 >> being able to fully focus on the project from the 7th June - I have exams
132 >> until then and cannot devote my time to this. That is 10 calendar days lost
133 >> from the "Students get to know mentors, read documentation, get up to speed
134 >> to begin working on their projects." phase.
135 >>
136 >> I'm happy to provide more background about myself and hope to hear from you!
137 >>
138 >> Regards,
139 >> Antanas Uršulis
140 >
