Gentoo Archives: gentoo-project

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-project@l.g.o
Subject: Re: [gentoo-project] Council meeting: Tuesday 2013-11-12, 19:00 UTC
Date: Thu, 07 Nov 2013 06:01:24
Message-Id: 20131107060120.GG5763@orbis-terrarum.net
In Reply to: Re: [gentoo-project] Council meeting: Tuesday 2013-11-12, 19:00 UTC by "Paweł Hajdan
1 On Wed, Nov 06, 2013 at 08:58:36PM -0800, "Paweł Hajdan, Jr." wrote:
2 > On 11/6/13 10:43 AM, Andreas K. Huettel wrote:
3 > >> On Wed, Nov 06, 2013 at 09:18:43AM +0100, Andreas K. Huettel wrote:
4 > >>> 8. Revival of archives.gentoo.org
5 > >>
6 > >> There isn't any reference for this one, who's spear-heading it; as infra
7 > >> I'd like to know. We do have the raw emails for the archive, what's
8 > >> broken is strictly the web interface.
9 > >
10 > > I am, and the only available reference is the announcement itself, see the
11 > > paragraph at the top about missing references :).
12 > >
13 > > My only intention for this topic so far is to ask my council colleagues for
14 > > their opinion on a general statement like "archives.gentoo.org functionality
15 > > was useful, and it would be nice to have our own online, definitive archive of
16 > > the more important mailing lists back working again at some point in the
17 > > future".
18 > Curious: does anyone have any doubts about usefulness of archives.g.o?
19 >
20 > Then, do you know what actually broke? It's really surprising to me
21 > since it seems that hardly anything changes there.
22 We used a custom template in mhonarc to generate a variant of
23 Guide/ProjectXML from the emails. Along with that template, there was
24 some custom code to ensure we generated consistent IDs even if the prior
25 mhonarc listing was damaged; so that a link to an email once it was
26 posted would always be consistent.
27
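A minimal sketch of the stable-ID idea described above, assuming the ID is derived from the Message-Id header so that it does not depend on the state of any existing listing. The custom code itself is not shown in this thread; the function name `stable_id`, the SHA-1 choice, and the 16-character truncation are illustrative assumptions only.

```python
import hashlib
from email import message_from_string


def stable_id(raw_email: str) -> str:
    """Return a short ID that depends only on the message itself.

    Hypothetical helper: the hashing scheme is an assumption, not the
    scheme the archive actually used.
    """
    msg = message_from_string(raw_email)
    # Header lookup is case-insensitive; strip the surrounding angle brackets.
    message_id = (msg.get("Message-Id") or "").strip().strip("<>")
    if message_id:
        digest_input = message_id.encode("utf-8", "replace")
    else:
        # Fallback for mail without a Message-Id: hash the whole raw message.
        digest_input = raw_email.encode("utf-8", "replace")
    return hashlib.sha1(digest_input).hexdigest()[:16]
```

Because the ID is a pure function of the message, rebuilding a damaged listing from the raw mail yields the same link for every email.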
28 The parts above we still have... what we don't have are some fixes that
29 allowed mhonarc to scale to the crazy number of emails we were putting
30 into the archive. Its default was to regenerate the entire index and
31 each message again, every time it was run. We hacked incremental support
32 onto it, but that was lost.
33
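The incremental approach can be sketched roughly as follows, assuming a manifest file that records which message IDs have already been rendered. This is not the lost mhonarc patch; `render_incremental`, the `manifest.json` name, and the plain-text output are all hypothetical.

```python
import json
from pathlib import Path


def render_incremental(messages: dict[str, str], out_dir: Path) -> list[str]:
    """Render only messages not yet listed in the manifest.

    `messages` maps a stable message ID to its rendered content.
    Returns the IDs that were newly written on this run.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = out_dir / "manifest.json"  # hypothetical file name
    done = set()
    if manifest_path.exists():
        done = set(json.loads(manifest_path.read_text()))

    # Skip everything rendered on a previous run.
    new_ids = [mid for mid in sorted(messages) if mid not in done]
    for mid in new_ids:
        (out_dir / f"{mid}.txt").write_text(messages[mid])

    # Rewrite the manifest only when something changed.
    if new_ids:
        done.update(new_ids)
        manifest_path.write_text(json.dumps(sorted(done)))
    return new_ids
```

The cost of each run then scales with the number of new emails rather than with the size of the whole archive.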
34 Like the rest of Gentoo, which is moving away from Guide/ProjectXML, the
35 archives need to move to the future, but onto something scalable. The
36 archive of raw email itself exceeds 20GiB in size, and more than 75% of
37 it should probably be public (the remaining fraction is stuff like
38 core/trustees/council etc).
39
40 --
41 Robin Hugh Johnson
42 Gentoo Linux: Developer, Trustee & Infrastructure Lead
43 E-Mail : robbat2@g.o
44 GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85