Gentoo Archives: gentoo-dev

From: "Jan Kundrát" <jkt@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] UTF-8 encoding and file format of manuals
Date: Fri, 02 Jun 2006 12:31:10
Message-Id: 44802E85.9030306@gentoo.org
In Reply to: Re: [gentoo-dev] UTF-8 encoding and file format of manuals by Paul de Vrieze
1 Paul de Vrieze wrote:
2 > Would it be possible to do automatic detection and unicode conversion in the
3 > portage install stage? I think that would probably be the best option. At a
4 > later stage a simple detection and warning might be sufficient.
5
6 Tricky. You can parse a file and check whether it's valid UTF-8, but the
7 problem is that you can't be sure it isn't (e.g.) just an ISO-8859-2
8 file whose bytes happen to form a valid UTF-8 sequence as well.
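
To make the ambiguity concrete, here is a minimal Python sketch. The helper name is_valid_utf8 is made up for illustration and is not part of portage or any existing tool. It shows one byte pair that passes a UTF-8 validity check and is also perfectly legal ISO-8859-2 text, so a successful check alone doesn't tell you which encoding the author meant.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same two bytes are meaningful in both encodings.
raw = b"\xc3\xa9"
as_utf8 = raw.decode("utf-8")        # one character: e with acute
as_latin2 = raw.decode("iso8859-2")  # two characters: A with breve, S with caron

# Typical Czech text stored as ISO-8859-2 usually fails the check,
# which is the easy case; the short sequence above is the hard one.
czech_latin2 = "příliš".encode("iso8859-2")

if __name__ == "__main__":
    print(is_valid_utf8(raw), repr(as_utf8), repr(as_latin2))
    print(is_valid_utf8(czech_latin2))
```

So a failed check is a reliable "not UTF-8", while a passed check is only a strong hint.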
9
10 That said, there are some tools that try to perform some magic
11 (statistical data analysis etc.) and guess the correct encoding. [1]
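
As a rough illustration of what such guessing can look like, here is a toy Python sketch. It is not how enca works; the candidate list, the Czech letter set, and the scoring rule are all assumptions picked for the example. It decodes the bytes with each candidate encoding and prefers the one whose non-ASCII characters look most like letters of the expected language.

```python
# Assumption: we expect Czech text, so these are the "plausible" non-ASCII letters.
CZECH_LETTERS = set("áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ")

# Assumption: a small, fixed list of candidates, tried in this order.
CANDIDATES = ["utf-8", "iso8859-2", "iso8859-1"]


def score(text: str) -> float:
    """Fraction of non-ASCII characters that are expected Czech letters."""
    non_ascii = [ch for ch in text if ord(ch) > 127]
    if not non_ascii:
        return 1.0  # pure ASCII reads the same in every candidate
    return sum(ch in CZECH_LETTERS for ch in non_ascii) / len(non_ascii)


def guess_encoding(data: bytes) -> str:
    """Return the candidate encoding with the highest score (first wins on ties)."""
    best_name, best_score = CANDIDATES[-1], -1.0
    for name in CANDIDATES:
        try:
            text = data.decode(name)
        except UnicodeDecodeError:
            continue  # not even decodable, so rule it out
        current = score(text)
        if current > best_score:
            best_name, best_score = name, current
    return best_name


if __name__ == "__main__":
    sample = "příliš žluťoučký kůň"
    print(guess_encoding(sample.encode("iso8859-2")))
    print(guess_encoding(sample.encode("utf-8")))
```

A guess like this can still be wrong on short or mixed-language files, which is why a warning rather than a silent conversion seems the safer default.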
12
13 [1] app-i18n/enca, http://trific.ath.cx/software/enca/
14
15 HTH,
16 -jkt
17
18 --
19 cd /local/pub && more beer > /dev/mouth
