Gentoo Archives: gentoo-soc

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] GSoC2013 suggestion - Fix archives.g.o
Date: Mon, 08 Apr 2013 23:24:09
Message-Id: robbat2-20130408T222712-991090457Z@orbis-terrarum.net
In Reply to: [gentoo-soc] GSoC2013 suggestion - Fix archives.g.o by "Denis M."
On Tue, Apr 09, 2013 at 12:11:23AM +0200, Denis M. wrote:
> Hello,
> I want to suggest something here. We all know that archives.g.o hasn't
> been working for some months... From what I've heard from robbat, the
> system is still receiving and processing the emails but it's not
> 'archiving' them... I think it's a good idea if someone can gather a
> group of willing people to fix this, so we can have a gentoo 'official'
> place to read the mailing lists.
I'll provide some more details, in the hopes that somebody will take
this on as a project.

Yes, I am willing to mentor you and provide some help, but you'll need
to be fairly self-sufficient: I didn't build the archival system, I just
ended up being one of the maintainers.

We are using MHonArc to handle the mails, with custom templates that
convert each mail to our GuideXML (a derivative of DocBook XML). An
update got applied accidentally, and since then, each run of mhonarc
either crashes or overwrites the prior index.

The overwrite is a major problem, because there's a limit on how many
emails mhonarc can handle in a single pass, and lots of our lists have
exceeded it.

MHonArc itself is written in Perl. Gentoo also has some modifications so
that mhonarc generates consistent URLs even when asked to reindex a list
from scratch. This is done by injecting headers into the emails when
they are received by procmail (after postfix); the output filenames for
mhonarc are then based on those headers.
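A minimal sketch of that header-injection idea, in Python rather than
Perl and not taken from the actual Gentoo setup. The header name
X-Archive-Stable-Id, the choice of hashing the Message-Id, and the
msg_<id>.xml filename pattern are all assumptions made for illustration:

```python
import hashlib
from email import message_from_string
from email.message import Message

# Hypothetical header name; the real header used on archives.g.o is not
# stated in this mail.
STABLE_ID_HEADER = "X-Archive-Stable-Id"


def inject_stable_id(msg: Message) -> Message:
    """Add a stable ID header at delivery time, unless one already exists."""
    if msg[STABLE_ID_HEADER] is None:
        # Assumption: hash the Message-Id. Any value that is fixed once at
        # receipt and stored in the mail itself would serve the same purpose.
        message_id = msg.get("Message-Id", "")
        digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
        msg[STABLE_ID_HEADER] = digest
    return msg


def output_filename(msg: Message) -> str:
    """Derive the archive filename from the injected header only."""
    return f"msg_{msg[STABLE_ID_HEADER]}.xml"


raw = "Message-Id: <abc@example.org>\nSubject: test\n\nbody\n"
first = output_filename(inject_stable_id(message_from_string(raw)))
again = output_filename(inject_stable_id(message_from_string(raw)))
```

Because the filename depends only on a header stored in the mail, and
not on the order in which mails are processed, reindexing a list from
scratch yields the same filenames, and so the same URLs.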

As a project, we would prefer to keep using MHonArc, but as long as the
replacement produces the same consistent URLs as mhonarc, I'm not
actually set on that. Likewise, GuideXML output would be useful, because
then it can just slot into the existing system, but I'm flexible on that
too. I don't think the GuideXML part would be difficult at all either:
it's really just extracting a few headers and then dumping the body into
a single tag.
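To give a rough idea of the "few headers plus one body tag" conversion,
here is a sketch in Python. The element names (mail, header, body) are
placeholders and not the real GuideXML schema, and the handling of
multipart mails is deliberately left out:

```python
import xml.etree.ElementTree as ET
from email import message_from_string


def mail_to_xml(raw: str) -> str:
    """Convert one raw mail into a small XML document (placeholder schema)."""
    msg = message_from_string(raw)
    root = ET.Element("mail")  # hypothetical root element
    # Copy a handful of headers, skipping any that are missing.
    for header in ("From", "To", "Subject", "Date", "Message-Id"):
        value = msg.get(header)
        if value is not None:
            ET.SubElement(root, "header", name=header).text = value
    # Dump the body into a single element. Multipart mails return a list
    # from get_payload(); this sketch just emits an empty body for those.
    payload = msg.get_payload()
    ET.SubElement(root, "body").text = payload if isinstance(payload, str) else ""
    return ET.tostring(root, encoding="unicode")


raw = "From: a@example.org\nSubject: Hi & bye\n\nhello\n"
xml_text = mail_to_xml(raw)
```

ElementTree escapes characters such as & and < in header values and in
the body, so the output stays well-formed without manual quoting.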

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robbat2@g.o
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

Replies

Subject Author
Re: [gentoo-soc] GSoC2013 suggestion - Fix archives.g.o "Denis M." <god@××××××××.in>