Gentoo Archives: gentoo-dev

From:	Mike Frysinger <vapier@g.o>
To:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] [RFC] ban use of bash-4 casemods in ebuilds due to locale collation instability
Date:	Wed, 11 Nov 2015 07:43:00
Message-Id:	`20151111074250.GX5154@vapier.lan`
In Reply to:	Re: [gentoo-dev] [RFC] ban use of bash-4 casemods in ebuilds due to locale collation instability by Ulrich Mueller

1	On 11 Nov 2015 05:16, Ulrich Mueller wrote:
2	> >>>>> On Tue, 10 Nov 2015, Mike Frysinger wrote:
3	>
4	> > Arfrever highlights these are not even safe to use. bash is locale aware,
5	> > so it'll apply LC_COLLATE rules when processing the ^/, casemods. while
6	> > you can fix this with external programs ala:
7	> > LC_COLLATE=C tr ...
8	>
9	> > you can't do it with inline code like:
10	> > LC_COLLATE=C SRC_URI=".../${PN^^}/..."
11	>
12	> >>>>> On Tue, 10 Nov 2015, Mike Frysinger wrote:
13	>
14	> > sorry, i meant char classification here (LC_CTYPE), not collation.
15	>
16	> Shouldn't these be safe to use if the string consists purely of ASCII
17	> characters? I mean, A-Z and a-z should be uppercase and lowercase,
18	> respectively, in any locale?
19
20	nope. it depends on the order of the chars in the locale and assumes
21	the first is A and the last is Z. which not all do.
22	$ echo {a..z} | LC_ALL=et_EE.UTF-8 sed 's:[a-z]::g'
23	t u v w x y
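
for contrast, a minimal sketch of the same range with the locale pinned to C for the external program, as in the "LC_COLLATE=C tr ..." idea quoted above. strip_lower is a hypothetical helper name, and LC_ALL=C is used here (an assumption) so that no other LC_* setting can override it:

```shell
# hypothetical helper: delete ASCII lowercase letters from the arguments.
# LC_ALL=C applies only to the sed process, so [a-z] is evaluated by
# byte value and covers exactly the 26 letters a through z.
strip_lower() {
    printf '%s\n' "$*" | LC_ALL=C sed 's:[a-z]::g'
}

strip_lower "abtuvwxyz-123"   # prints "-123"
```

this only covers external tools; the assignment affects the sed child process and nothing in the calling shell.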
24
25	we could do something like the classic:
26	tolower() { tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 'abcdefghijklmnopqrstuvwxyz' <<<"$*"; }
27
28	but that still would not help with the bash builtins.
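
a rough sketch of what calling such a helper in place of a casemod like ${PN^^} might look like. toupper_ascii, the PN value, and MY_PN are all made up for illustration; this is not an existing eclass function:

```shell
# hypothetical helper: uppercase ASCII letters via explicit character lists.
# tr maps the two lists position by position, so no range expression is
# involved; LC_ALL=C is added as an extra precaution (an assumption).
toupper_ascii() {
    printf '%s\n' "$*" | LC_ALL=C tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
}

PN="foo-bar"                      # made-up package name
MY_PN=$(toupper_ascii "${PN}")    # used where ${PN^^} would have been
echo "${MY_PN}"                   # prints "FOO-BAR"
```

the cost is a subshell and an external process per call, which is the trade-off against the builtin expansions.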
29	-mike
