On Tue, Apr 09, 2013 at 12:11:23AM +0200, Denis M. wrote:
> Hello,
> I want to suggest something here. We all know that archives.g.o hasn't
> been working for some months... From what I've heard from robbat, the
> system is still receiving and processing the emails but it's not
> 'archiving' them... I think it's a good idea if someone can gather a
> group of willing people to fix this, so we can have a gentoo 'official'
> place to read the mailing lists.
I'll provide some more details, in the hopes that somebody will take
this on as a project.

Yes, I am willing to mentor you and provide some help, but you'll need
to be fairly self-sufficient: I didn't build the archival system, I just
ended up being one of the maintainers.

We are using MHonArc to handle the mails, with custom templates that
convert each mail to our GuideXML (a derivative of DocBook XML). An
update got applied accidentally, and since then, each run of mhonarc
either crashes or overwrites the prior index.

The overwrite is a major problem, because there's a limit on how many
emails mhonarc can handle in a single pass, and lots of our lists have
exceeded it.

MHonArc itself is written in Perl. Gentoo also has some modifications so
that mhonarc generates consistent URLs even when asked to reindex a list
from scratch. This is done by injecting headers into the emails when
they are received by procmail (after postfix); the output filenames for
mhonarc are then based on those headers.

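A minimal sketch of the header-based naming idea, assuming a made-up
header name (X-Archive-Id) and a made-up filename scheme; the real
Gentoo modifications are in Perl and may derive the ID differently. The
point is only that the ID is fixed once at delivery time, so a reindex
from scratch produces the same filenames:

```python
import hashlib
from email import message_from_string
from email.message import Message

# Hypothetical header name, chosen for this sketch only.
ARCHIVE_HEADER = "X-Archive-Id"


def inject_archive_id(msg: Message) -> Message:
    """Add a stable ID header at delivery time; never overwrite an existing one."""
    if ARCHIVE_HEADER not in msg:
        # Assumption: seed the ID from Message-ID, falling back to the whole mail.
        seed = msg.get("Message-ID", "") or msg.as_string()
        digest = hashlib.sha1(seed.encode("utf-8", "replace")).hexdigest()
        msg[ARCHIVE_HEADER] = digest[:16]
    return msg


def output_filename(msg: Message) -> str:
    """Derive the archive filename purely from the injected header."""
    return f"msg_{msg[ARCHIVE_HEADER]}.xml"
```

Because the filename depends only on a header stored in the mail
itself, the order or batch in which mails are reprocessed does not
affect the resulting URLs.
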
As a project, we would prefer to keep using mhonarc, but as long as we
keep the same consistent URLs as mhonarc, I'm not actually set on that.
Likewise, GuideXML output would be useful, because then it can just slot
into the existing system, but I'm flexible on that too. I don't think
the GuideXML part would be difficult at all either: it's really just
extracting a few headers, and then dumping the body into a single tag.

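To illustrate how small the "few headers plus one body tag" conversion
is, here is a rough sketch. The element names (mail, subject, from,
date, body) are placeholders for illustration, not the actual GuideXML
tags used by the archive templates, and only the first text/plain part
is kept:

```python
import xml.etree.ElementTree as ET
from email import message_from_string
from email.message import Message


def body_text(msg: Message) -> str:
    """Return the first text/plain part as a string (empty if there is none)."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to UTF-8.
            return payload.decode("utf-8", errors="replace")
    return ""


def mail_to_xml(msg: Message) -> str:
    """Copy a few headers into elements and dump the body into a single tag."""
    root = ET.Element("mail")  # placeholder element names throughout
    for header, tag in (("Subject", "subject"), ("From", "from"), ("Date", "date")):
        ET.SubElement(root, tag).text = str(msg.get(header, ""))
    ET.SubElement(root, "body").text = body_text(msg)
    # ElementTree escapes &, < and > in the text for us.
    return ET.tostring(root, encoding="unicode")
```

Calling mail_to_xml(message_from_string(raw_mail)) on one stored mail
returns the XML as a string, ready to be written to a per-message file.
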
--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robbat2@g.o
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85