Hi Diego,

Thanks for the quick reply, and sorry it took me this long to get back
to you - a couple of other responsibilities have been piling up lately.

I've tried to assess what this project would involve; maybe this could
be a starting point for the proposal. A lot of this describes how the
current solution works, so changes might be necessary:

- conceptually the system should have 3 components: a log
collector & analyser, a storage backend and a frontend

- it would be integrated with portage:
--- portage would implement a client which can submit logs to the
collector, possibly providing information on why the package failed
--- this connection between portage and the collector should rely on
as few external packages as possible (because any packages providing
that functionality might break)
--- it should support IPv6 because that's what is used between the
container on the tinderbox and the box itself (here's a question
though: any technical reason why IPv6 was used? I admit I didn't look
into this too deeply)

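To make the client side more concrete, here is a rough sketch in Python
that uses only the standard library, so it would not pull any extra
packages into portage. Everything specific in it is my assumption for
illustration: the wire format (one JSON header line followed by the raw
log), the function names, and the address and port in the example are
hypothetical and do not come from portage or the current tinderbox setup.

```python
import json
import socket


def frame_log(package, hostname, reason, log_bytes):
    """Build one message: a JSON header line, then the raw log bytes.

    Hypothetical wire format, chosen only for this sketch.
    """
    header = {
        "package": package,
        "hostname": hostname,
        "reason": reason,          # optional hint on why the build failed
        "length": len(log_bytes),  # tells the collector where the log ends
    }
    return json.dumps(header).encode("utf-8") + b"\n" + log_bytes


def send_log(collector_host, collector_port, message, timeout=10.0):
    """Send a framed message over TCP.

    socket.create_connection resolves the host via getaddrinfo, so it
    accepts IPv6 addresses such as "::1" as well as IPv4 ones.
    """
    with socket.create_connection((collector_host, collector_port),
                                  timeout=timeout) as conn:
        conn.sendall(message)


if __name__ == "__main__":
    msg = frame_log("app-misc/foo-1.0", "tinderbox1", "compile phase failed",
                    b"gcc: error: ...\n")
    # Show only the header line; the log itself follows after the newline.
    print(msg.split(b"\n", 1)[0].decode("utf-8"))
    # send_log("::1", 9999, msg)  # hypothetical collector address and port
```

Sending a length in the header is one simple way to let several log
files for the same package travel over a single connection; other
framings would work just as well.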
- the collector & analyser:
--- receives logs over some protocol
--- should be able to group logs (receive several log files for a
failing package and keep them together); depending on the
implementation, this might be part of the portage integration
--- matches each line against a regexp; we can look into something
more extensible if required
--- organises the files by hostname and submits them to the storage backend

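The matching and grouping steps could look roughly like the sketch
below. The two patterns and the function names are placeholders I made
up for illustration; the real pattern list would have to come from the
current setup.

```python
import re
from collections import defaultdict

# Hypothetical patterns, for illustration only.
PATTERNS = [
    re.compile(r"error:", re.IGNORECASE),
    re.compile(r"QA Notice"),
]


def scan_log(lines, patterns=PATTERNS):
    """Return (line_number, line) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(p.search(line) for p in patterns):
            hits.append((number, line.rstrip("\n")))
    return hits


def group_by_host(submissions):
    """Group scanned logs by hostname, then by package, then by file.

    `submissions` is an iterable of (hostname, package, filename, lines).
    """
    grouped = defaultdict(lambda: defaultdict(dict))
    for hostname, package, filename, lines in submissions:
        grouped[hostname][package][filename] = scan_log(lines)
    return grouped
```

Keeping the patterns in a plain list means a more extensible matcher
could later replace `scan_log` without touching the grouping.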
- the storage backend:
--- I could start with Amazon's AWS and then move to something
standalone (how much data is there to store, actually? 1/10/100 GB?
and how large can a single log file become?)
--- keeps the logs and also a simple database that would hold
information about the log groups (package, date, links to log files,
etc.)

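As a rough idea of what the "simple database" could hold, here is a
sketch using SQLite from the Python standard library. SQLite is only a
stand-in I picked for the example, not a proposal for the final backend,
and the table and column names are my own assumptions.

```python
import sqlite3

# Hypothetical schema: one row per log group, one row per log file in it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS log_group (
    id       INTEGER PRIMARY KEY,
    hostname TEXT NOT NULL,
    package  TEXT NOT NULL,
    received TEXT NOT NULL          -- ISO 8601 date string
);
CREATE TABLE IF NOT EXISTS log_file (
    id       INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES log_group(id),
    url      TEXT NOT NULL,         -- link to the stored log file
    matches  INTEGER NOT NULL       -- number of regexp hits in the file
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_group(conn, hostname, package, received, files):
    """Insert one log group; `files` is a list of (url, matches) pairs."""
    cur = conn.execute(
        "INSERT INTO log_group (hostname, package, received) VALUES (?, ?, ?)",
        (hostname, package, received),
    )
    group_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO log_file (group_id, url, matches) VALUES (?, ?, ?)",
        [(group_id, url, matches) for url, matches in files],
    )
    conn.commit()
    return group_id


def packages_with_matches(conn):
    """List packages that have at least one match, sorted by name."""
    rows = conn.execute(
        "SELECT DISTINCT g.package FROM log_group g "
        "JOIN log_file f ON f.group_id = g.id "
        "WHERE f.matches > 0 ORDER BY g.package"
    )
    return [row[0] for row in rows]
```

The last query is the kind of thing the frontend would need for its
list of packages that have matches.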
- the frontend:
--- displays a list of packages that have matches
--- should be integrated with bugzilla; one can see open bugs for a
selected package, and also file a new bug
--- should be password protected

Comments/additions greatly appreciated. Now, regarding the Gentoo
application template: a long time ago I submitted a one-line
workaround patch[1] for openoffice, but that probably doesn't
qualify. Could you point me (general direction is ok) towards
something I could fix for my application?

Cheers,
Antanas

[1] https://bugs.gentoo.org/show_bug.cgi?id=306211

On Mon, Apr 22, 2013 at 4:55 PM, Diego Elio Pettenò
<flameeyes@×××××××××.eu> wrote:
> Hi Antanas,
>
> First of all, thanks for showing your interest in this project. I'll be
> the assigned mentor for the project if you're going to be working on
> it.
>
> To answer your concerns (in quite an unsorted fashion, I apologize), I
> would start by saying that what we're looking for with this project
> is to have not just an average log collector, but one that is
> integrated explicitly with components such as Portage and Bugzilla;
> the target of this project, for me, is to be able to use it for the
> tinderboxes I've been running (which are currently on hold because I'm
> too busy with training at a new job).
>
> While I'm not discounting *any* particular technology right away,
> we're looking for something that works quickly and relatively
> easily... sometimes, while technology is already out there that could
> work, it doesn't suit the workflow as it is right now. At the same
> time, as far as I'm concerned you can start by keeping the use of
> Amazon's AWS services, if it helps you. The end goal is to avoid using
> it, but there is no hurry with that, as the costs associated with it
> are very marginal right now. The only important part here is that the
> data you store on AWS is not persistent when using the final
> collector.
>
> I don't have any problem with "wasting" the 10 days — but this does
> not give any "get out of trouble free" card for the evaluation, so
> keep it in mind, you might have to do some harder work in the first
> few days. But that does not say much, some people actually perform
> better under pressure, so it's up to you if you want to go for it or
> not :)
>
> If there is anything else you want to know, feel free to ask on the
> mailing list and I'll answer ASAP.
>
> Thanks,
> Diego
>
> Diego Elio Pettenò — Flameeyes
> flameeyes@×××××××××.eu — http://blog.flameeyes.eu/
>
>
> On Mon, Apr 22, 2013 at 2:19 AM, Antanas Uršulis
> <antanas.ursulis@×××××.com> wrote:
>> Hello,
>>
>> I'm a final year Computer Science undergraduate at the University of
>> Cambridge (UK). My main languages are C and C++, but my bachelor's project
>> involves working with and modifying the CIEL task-parallel execution system
>> written in Python, and I have also been using Python for my scripting needs.
>> I have been a Gentoo user since 2009 (switched from *buntu), if I
>> remember correctly.
>>
>> I am interested in the "Log collector/analyzer for tinderbox" GSoC project,
>> not only because I would like to help the developers of my favourite Linux
>> distribution, but also because I could see myself using an extension of the
>> tool to oversee my own systems (for example, computers at the Lithuanian
>> National Olympiad in Informatics, where I am a member of the technical staff).
>>
>> Of course, there are some things I would like to discuss. First, are there
>> already any thoughts on a replacement for Amazon's SimpleDB storage? I would
>> see that as a major part of the project. Second, if I were to think of this
>> as a more general tool, with the possibility to support various log
>> providers (portage, apache, etc.) and analyzers in the future, what would
>> your opinion be on that? I think it would be possible to implement this
>> without any external dependencies, so that Portage remains light. Third,
>> after a quick search, I came upon the question: is there a reason not to
>> hook into systems such as Apache Flume + Elastic Search, or logstash, or
>> Scribe?
>>
>> Lastly, I would like to know up-front whether you would be OK with me only
>> being able to fully focus on the project from the 7th June - I have exams
>> until then and cannot devote my time to this. That is 10 calendar days lost
>> from the "Students get to know mentors, read documentation, get up to speed
>> to begin working on their projects." phase.
>>
>> I'm happy to provide more background about myself and hope to hear from you!
>>
>> Regards,
>> Antanas Uršulis
>