>>>>> On Mon, 6 Jun 2016, Ulrich Mueller wrote:

>>>>> On Mon, 6 Jun 2016, Chí-Thanh Christopher Nguyễn wrote:
>> I'm not totally convinced yet.
>> Following the BCP-47 spec the format is

>> Language-Tag = langtag       ; normal language tags
>> langtag      = language
>>                ["-" script]
>>                ["-" region]
>>                *("-" variant)
>>                *("-" extension)
>>                ["-" privateuse]

>> [...]

> As I understand it:

> 1. Gettext documentation says that locale names can be LL_CC or
>    LL_CC@VARIANT. The natural mapping to the (implementation defined)
>    format mentioned by POSIX seems to be that LL, CC, and VARIANT
>    correspond to language, territory, and modifier, respectively.

> 2. Language codes are taken from ISO 639, namely the two-letter code
>    if one exists, otherwise the three-letter code.

> 3. Territory codes are taken from ISO 3166-1, usually the two-letter
>    country codes.

> 4. According to Gettext documentation, "'@VARIANT' can denote any
>    kind of characteristics that is not already implied by the language
>    LL and the country CC." (So IIUC the BCP-47 variant "valencia" would
>    become "@valencia".)
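
To make the mapping in point 1 concrete, here is a rough Python sketch
that splits a locale name into language, territory, and modifier. The
names LocaleName and parse_locale_name are made up for illustration.
The optional .codeset part of the POSIX format is simply dropped, and
nothing is validated against ISO 639 or ISO 3166-1.

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Hypothetical container for the three parts discussed above."""
    language: str             # LL
    territory: Optional[str]  # CC, if present
    modifier: Optional[str]   # VARIANT, if present


def parse_locale_name(name: str) -> LocaleName:
    """Split LL[_CC][.codeset][@VARIANT] into its parts (codeset is ignored)."""
    base, _, modifier = name.partition("@")
    base, _, _codeset = base.partition(".")
    language, _, territory = base.partition("_")
    if not language:
        raise ValueError(f"no language code in {name!r}")
    return LocaleName(language, territory or None, modifier or None)
```

With this, parse_locale_name("ca@valencia") yields language "ca", no
territory, and modifier "valencia".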

Of course, we could also say that Gettext/POSIX syntax (especially its
variant/modifier part) is ill-defined, and use BCP-47 syntax for the
L10N USE_EXPAND instead (except that the separator would be an
underscore instead of a hyphen).

AFAICS, there would be no change at all for any of the LL or LL_CC
entries. The only ones that would change would be the (about 10) ones
containing an @ sign. For example, ca@valencia would become
ca_valencia, and sr@ijekavianlatin would become sr_Latn_ijekavsk.
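
As a rough illustration of that conversion, here is a minimal Python
sketch. MODIFIER_MAP and to_l10n are hypothetical names, and the table
holds only the two modifiers from the examples above. It assumes that a
modifier splits into an optional script plus an optional variant
subtag, emitted in BCP-47 order (language, script, region, variant).

```python
import re

# Hypothetical table: gettext @modifier -> (script, variant) subtags.
# Only the two modifiers mentioned in this mail are listed; a real table
# would need one row per modifier actually in use.
MODIFIER_MAP = {
    "valencia": (None, "valencia"),
    "ijekavianlatin": ("Latn", "ijekavsk"),
}

# LL, optional _CC, optional @VARIANT (no codeset part).
_LOCALE_RE = re.compile(r"^([a-z]{2,3})(?:_([A-Z]{2}))?(?:@([A-Za-z0-9]+))?$")


def to_l10n(name):
    """Convert LL, LL_CC or LL[_CC]@VARIANT to a BCP-47-like entry
    that uses underscores instead of hyphens."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"not of the form LL[_CC][@VARIANT]: {name!r}")
    language, territory, modifier = match.groups()
    script = variant = None
    if modifier is not None:
        if modifier not in MODIFIER_MAP:
            raise ValueError(f"no remapping known for modifier {modifier!r}")
        script, variant = MODIFIER_MAP[modifier]
    # BCP-47 subtag order: language, script, region, variant.
    parts = [language, script, territory, variant]
    return "_".join(part for part in parts if part)
```

Plain LL and LL_CC names pass through unchanged, which matches the
expectation that only the entries with an @ sign are affected.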

Not sure how much additional code for remapping would be required.
However, my impression is that upstream usage of @VARIANT is not at
all standardised, so some remapping would be required in any case if
we want unique entries for L10N.
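
If remapping is needed anyway, a per-name table might be as simple as
the following Python sketch. The upstream spellings in UPSTREAM_TO_L10N
are illustrative only, not a survey of what packages actually ship, and
the function names are invented here. Inverting the table shows which
upstream names collapse into a single L10N entry.

```python
from collections import defaultdict

# Illustrative table only: upstream locale name -> unique L10N entry.
UPSTREAM_TO_L10N = {
    "ca@valencia": "ca_valencia",
    "sr@latin": "sr_Latn",
    "sr@Latn": "sr_Latn",
    "sr@ijekavianlatin": "sr_Latn_ijekavsk",
}


def to_l10n_entry(upstream, table=UPSTREAM_TO_L10N):
    """Return the L10N entry for an upstream locale name.

    Names without an @ sign are passed through unchanged; names with
    an @ sign must be listed in the table.
    """
    if "@" not in upstream:
        return upstream
    if upstream not in table:
        raise ValueError(f"no L10N entry defined for {upstream!r}")
    return table[upstream]


def spellings_per_entry(table=UPSTREAM_TO_L10N):
    """Invert the table: L10N entry -> sorted list of upstream spellings."""
    groups = defaultdict(list)
    for upstream, entry in table.items():
        groups[entry].append(upstream)
    return {entry: sorted(names) for entry, names in groups.items()}
```

Raising an error for unlisted @ names is a deliberate choice in this
sketch, so that a new upstream spelling is noticed instead of silently
creating a second entry for the same locale.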

Ulrich