On 06/08/2013 23:42, Stroller wrote:
>
> On 6 August 2013, at 14:04, Kerin Millar wrote:
>> ...
>> If undefined, the value of LC_COLLATE is inherited from LANG. I'm not sure that overriding it is particularly useful nowadays but it doesn't hurt.
>
> It's been a couple of years since I looked into this, but I'm given to believe that LANG should set all LC_ variables correctly, and that overriding them is frowned upon.
|
As has been mentioned, there are valid reasons to want to override the
collation. Here is a concrete example:
|
https://lists.gnu.org/archive/html/bug-gnu-utils/2003-08/msg00537.html
|
Strictly speaking, grep is correct to behave that way but it can be
confounding. In an ideal world, everyone would be using named classes
instead of ranges in their regular expressions but it's not an ideal world.
|
These days, grep no longer exhibits this characteristic in Gentoo.
Nevertheless, it serves as a valid example of how collations for UTF-8
locales can be a liability.
|
Of the other distros, Arch Linux also defined LC_COLLATE=C although I
understand that they have just recently stopped doing that.
|
On a production system, I would still be inclined to use it for reasons
of safety. For that matter, some people refuse to use UTF-8 at all on
the grounds of security; the handling of variable-width encodings
continues to be an effective bug inducer.
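
As a small illustration of the kind of mistake a variable-width encoding
invites, the following Python sketch truncates a UTF-8 byte string at a
fixed byte count. The sample string and the cut-off of 3 bytes are arbitrary
choices made for the example; it is not taken from any particular bug.

```python
# "naive" spelled with i-diaeresis (U+00EF), written as an escape so the
# example does not depend on the encoding of this message.
text = "na\u00efve"
encoded = text.encode("utf-8")

char_count = len(text)     # number of characters
byte_count = len(encoded)  # number of bytes; U+00EF occupies two of them

# Truncating by bytes cuts the two-byte sequence in half.
truncated = encoded[:3]

try:
    truncated.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

print(char_count, byte_count, decoded_ok)
```

Code that assumes one byte per character works on ASCII input and then
fails, or worse, silently mangles data, once multi-byte sequences appear.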
|
> I had to do this myself because, due to a bug, the en_GB time formatting failed to display am or pm. I believe this should be fixed now.
|
Presumably:
|
a) LANG was defined inappropriately
b) LANG was defined appropriately but LC_TIME was defined otherwise
c) LC_ALL was defined, trumping all
|
I would definitely not advise doing any of these things.
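
For reference, the precedence behind those three cases can be sketched in
Python as follows. effective_locale is a hypothetical helper, not a real
API, and the final fallback to "C" is an assumption (POSIX leaves the
default implementation-defined). The locale names are only examples.

```python
def effective_locale(category, env):
    """Resolve a category such as 'LC_TIME' from a dict of variables.

    Precedence: LC_ALL, then the category's own variable, then LANG.
    Unset and empty values are both skipped.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # assumed fallback when nothing is set


# a) only LANG is set, so LC_TIME inherits it
case_a = effective_locale("LC_TIME", {"LANG": "en_US.UTF-8"})

# b) LC_TIME is set and overrides LANG for that category only
case_b = effective_locale(
    "LC_TIME", {"LANG": "en_GB.UTF-8", "LC_TIME": "en_US.UTF-8"}
)

# c) LC_ALL is set and overrides everything else
case_c = effective_locale(
    "LC_TIME",
    {"LANG": "en_GB.UTF-8", "LC_TIME": "en_GB.UTF-8", "LC_ALL": "C"},
)

print(case_a, case_b, case_c)
```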
|
--Kerin |