Jan Kundrát wrote:

> Steve Long wrote:
>>> Is [[:alpha:]] locale-safe?
>>>
>> Yes, all POSIX character classes listed here are:
>> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
>
> Thanks for a nice link. If I read section 7.3.1 correctly, [[:alpha:]]
> always contains those letters, but might contain more, depending on the
> locale. So it's probably very minor point, but as long as the script
> runs with user-provided locale, one should be explicit here. Or am I
> missing something here?
>
No, that's about the size of it -- if you'd like to tie it to ASCII,
irrespective of locale, that's fair enough. It depends on what you're up to
(sorry, I don't have time to go digging through code to see what this
applies to) but in the /general/ case it's better to use locale-neutral
character classes, since scripts then work for users in any locale.

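To make the two options concrete, here's a rough sketch of both checks side
by side. The function names and the ascii_letters variable are made up for
illustration; nothing here is taken from the script under discussion.

```bash
#!/usr/bin/env bash
# Explicit list of ASCII letters: matches the same characters in every
# locale, unlike a range such as a-z, which can depend on collation.
ascii_letters='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

# Succeeds if the first character of $1 is an ASCII letter (hypothetical helper).
starts_with_ascii_letter() {
    [[ ${1:0:1} == [$ascii_letters] ]]
}

# Succeeds if the first character of $1 is a letter in the current locale
# (hypothetical helper).
starts_with_locale_letter() {
    [[ ${1:0:1} == [[:alpha:]] ]]
}
```

For plain ASCII input the two agree. They should only differ for something
like "élan" in a locale that classes é as a letter (say a French UTF-8 one,
assuming it's installed), where only the second check would pass.
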
Setting LC_ALL=C temporarily is not a good idea, since it overrides every
LC_* category, LC_CTYPE included. The most common use for it is to get a
fixed sort order; where that's needed it's better to set LC_COLLATE alone.
(man 7 locale)

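A minimal sketch of the LC_COLLATE approach, with a made-up function name.
One caveat: LC_COLLATE only takes effect if LC_ALL is unset or empty, since
LC_ALL takes precedence over the individual categories.

```bash
#!/usr/bin/env bash
# Sort stdin in byte order by overriding only the collation category.
# LC_CTYPE and the rest stay as the user set them (hypothetical helper;
# assumes LC_ALL is not set in the environment).
sort_bytewise() {
    LC_COLLATE=C sort
}
```

Usage would be something like `printf 'b\nB\na\n' | sort_bytewise`, which
puts the upper-case line first because byte order ranks B before a.
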
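Quick example showing why the double bracket appears: the class name
[:alpha:] is only valid inside a bracket expression, where it can sit
alongside other characters such as _:
# first char must be a letter or _, the rest letters, digits or _
if [[ ${v:0:1} != [[:alpha:]_] || $v = *[^[:alnum:]_]* ]]; then
    errMsg $"$v is not a valid identifier"
    return 1
fi
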
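Since that snippet uses return and errMsg, it needs a function around it to
run. Here's one way it might be wrapped, purely as a sketch: the errMsg
below is a stand-in (the real helper isn't shown in this thread) and
isValidIdentifier is a name made up for the example.

```bash
#!/usr/bin/env bash
# Stand-in for the errMsg helper; assumed to just print to stderr.
errMsg() {
    printf '%s\n' "$*" >&2
}

# Succeeds if $1 starts with a letter or underscore and contains only
# letters, digits and underscores, as judged by the current locale.
isValidIdentifier() {
    local v=$1
    if [[ ${v:0:1} != [[:alpha:]_] || $v = *[^[:alnum:]_]* ]]; then
        errMsg $"$v is not a valid identifier"
        return 1
    fi
}
```

The empty string is rejected too, since its first character (nothing) does
not match the bracket expression.
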
Getting to know these is really helpful, imo, especially since they apply to
_all_ the utilities like tr, sed, grep, awk -- and ed ofc ;)

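A few illustrative one-liners showing the same classes in other tools; the
function names are invented for the example. Note that tr takes the class
in single brackets, since its arguments are not bracket expressions, while
grep and sed want it inside one.

```bash
#!/usr/bin/env bash
# Upper-case stdin using classes rather than a-z / A-Z ranges.
to_upper() {
    tr '[:lower:]' '[:upper:]'
}

# Keep only the lines of stdin that look like identifiers.
only_identifier_lines() {
    grep -E '^[[:alpha:]_][[:alnum:]_]*$'
}

# Strip leading whitespace from each line of stdin.
trim_leading_space() {
    sed 's/^[[:space:]]*//'
}
```
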
$"text" is for i18n in bash via gettext, akin to _("foo") in C. Can't say I've
used it yet, though :p

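For what it's worth, a bare-bones sketch of how $"..." is meant to be wired
up, going by the bash manual rather than experience. TEXTDOMAIN and
TEXTDOMAINDIR are the variables bash consults; the domain name, directory
and function below are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical catalogue name; a real script would install a matching
# example_script_demo.mo under TEXTDOMAINDIR for each language.
TEXTDOMAIN=example_script_demo
TEXTDOMAINDIR=/usr/share/locale   # hypothetical install location

greet() {
    # If no translation is found, bash uses the string unchanged.
    printf '%s\n' $"Hello, world"
}
```

Running `bash --dump-po-strings script` should list the $"..." strings in
PO format as a starting point for translators.
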
-- 
gentoo-dev@g.o mailing list