Gentoo Archives: gentoo-dev

From: Ulrich Mueller <ulm@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] News item: LINGUAS USE_EXPAND renamed to L10N
Date: Mon, 06 Jun 2016 18:07:26
Message-Id: 22357.48078.461280.261840@a1i15.kph.uni-mainz.de
In Reply to: Re: [gentoo-dev] News item: LINGUAS USE_EXPAND renamed to L10N by "Chí-Thanh Christopher Nguyễn"
>>>>> On Mon, 6 Jun 2016, Chí-Thanh Christopher Nguyễn wrote:

> I'm not totally convinced yet.
> Following the BCP-47 spec the format is

> Language-Tag = langtag             ; normal language tags
> langtag      = language
>                ["-" script]
>                ["-" region]
>                *("-" variant)
>                *("-" extension)
>                ["-" privateuse]

> So using the language ca, region es, and variant valencia, the
> BCP-47 language tag is ca-es-valencia (or ca-valencia if you omit
> the region).

Right. Or rather, ca-ES-valencia for the former, because uppercase is
preferred for the region subtag.
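
To illustrate the casing convention, here is a minimal Python sketch.
It assumes the rule from RFC 5646, section 2.1.1: everything is
lowercase, except that two-letter subtags (regions) are uppercase and
four-letter subtags (scripts) are titlecase, unless they come first or
occur after a singleton such as "x". The function name canonical_case
is made up for this example, and the code does not validate the tag.

```python
def canonical_case(tag: str) -> str:
    """Apply the recommended BCP-47 casing to a language tag (no validation)."""
    result = []
    seen_singleton = False
    for index, subtag in enumerate(tag.split("-")):
        if index > 0 and not seen_singleton and len(subtag) == 2:
            result.append(subtag.upper())       # region, e.g. "ES"
        elif index > 0 and not seen_singleton and len(subtag) == 4:
            result.append(subtag.capitalize())  # script, e.g. "Latn"
        else:
            result.append(subtag.lower())       # language, variant, extensions
        # Once a singleton (one-character subtag) appears, the rest stays lowercase.
        seen_singleton = seen_singleton or len(subtag) == 1
    return "-".join(result)


print(canonical_case("ca-es-valencia"))
```

Note that the casing is only a convention. BCP-47 tags are compared
case-insensitively, so ca-es-valencia and ca-ES-valencia name the same
thing.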

> POSIX.1-2008[2] as you mentioned defines a slightly different format
> for locales

> language[_territory][.codeset]

> Only LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
> LC_TIME additionally accept specification of a modifier.

> [language[_territory][.codeset][@modifier]]

> Where territory is implementation defined and the modifier
> "select[s] a specific instance of localization data within a single
> category". Which I think does not match what we want with "valencia"
> variant of the "ca" language.

As I understand it:

1. Gettext documentation says that locale names can be LL_CC or
LL_CC@VARIANT. The natural mapping to the (implementation defined)
format mentioned by POSIX seems to be that LL, CC, and VARIANT
correspond to language, territory, and modifier, respectively.

2. Language codes are taken from ISO 639, namely the two-letter code
if one exists, otherwise the three-letter code.

3. Territory codes are taken from ISO 3166-1, usually the two-letter
country codes.

4. According to Gettext documentation, "'@VARIANT' can denote any kind
of characteristics that is not already implied by the language LL and
the country CC." (So IIUC the BCP-47 variant "valencia" would become
"@valencia".)
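
The mapping in points 1 to 4 can be sketched in Python as follows.
This is only an illustration under simplifying assumptions: it accepts
tags of the form language[-region][-variant] and rejects anything with
a script, extension, or private-use subtag, it does not check the codes
against ISO 639 or ISO 3166-1, and the LL@VARIANT form it produces when
the region is omitted is my assumption. The function name
bcp47_to_locale_name is made up for this example.

```python
import re

# language[-region][-variant]; other BCP-47 subtags are deliberately unsupported.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<region>[A-Za-z]{2}))?"
    r"(?:-(?P<variant>[A-Za-z0-9]{5,8}))?$"
)


def bcp47_to_locale_name(tag: str) -> str:
    """Map a simple BCP-47 tag to a gettext-style LL[_CC][@VARIANT] name."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unsupported tag: {tag!r}")
    name = match["language"].lower()            # LL
    if match["region"]:
        name += "_" + match["region"].upper()   # _CC
    if match["variant"]:
        name += "@" + match["variant"].lower()  # @VARIANT
    return name


print(bcp47_to_locale_name("ca-ES-valencia"))
```

With this sketch, ca-ES-valencia becomes ca_ES@valencia and pt-BR
becomes pt_BR.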

> Hence I think POSIX locale cannot handle Catalan Valencian, unless
> territory is made to accept ISO 3166-2 region subdivisions.

I haven't found any mention or usage of ISO 3166-2 region subdivisions
in the context of locales. Can you provide any references for this?

Ulrich

Replies

Subject Author
Re: [gentoo-dev] News item: LINGUAS USE_EXPAND renamed to L10N Ulrich Mueller <ulm@g.o>
Re: [gentoo-dev] News item: LINGUAS USE_EXPAND renamed to L10N "Chí-Thanh Christopher Nguyễn" <chithanh@g.o>