Gentoo Archives: gentoo-dev

From: Stanislav Brabec <utx@×××××××.cz>
To: Spider <spider@g.o>
Cc: gentoo-dev@g.o
Subject: Re: [gentoo-dev] better handling of multibyte characters (nls/cjk/unicode)
Date: Wed, 05 Nov 2003 22:11:48
Message-Id: 1068069900.8669.71.camel@utx.utx.cz
In Reply to: Re: [gentoo-dev] better handling of multibyte characters (nls/cjk/unicode) by Spider
On Wed, 05 Nov 2003 at 14:50, Spider wrote:

> Ahh, that's very true. Except that I'd still want it documented. Yep,
> I'm on a drive for UTF-8 right now. I've run into places where I
> need filenames in UTF-8, and most things have started to break otherwise.
> Unpleasant compatibility problems.

These problems are currently being discussed in the glib bugzilla:
http://bugzilla.gnome.org/show_bug.cgi?id=114068


And here are my ideas on what needs to be done for seamless UTF-8
support:

0) baselayout: Add an "official" way to set the system default locale:
/etc/rc.conf or /etc/env.d/{number}locale (or similar).

1) glibc: Generate UTF-8 locales for all languages if the unicode USE
flag is on, either as a patch or afterwards (ex post):
# One UTF-8 locale per source locale name in the list.
for i in ...... ; do
    localedef -i "$i" -f UTF-8 -u charids.894 "$i.UTF-8"
done

2) Enable UTF-8 locales in GDM:

Two alternatives:

# Create UTF-8 entries for all locales.
sed 's:\([^ ]*\)\([ ][ ]*\)\(.*\)UTF-8,\(.*\):\1 \2\4,\3UTF-8\
\1(UTF-8)\2\3UTF-8,\4:g' <config/locale.alias |
sed 's:^\([^ ][^ ][^ ][^ ][^ ][^ ][^ ][^ ]\)*(UTF-8):& :' >config/locale.alias~
mv config/locale.alias~ config/locale.alias

# Prefer non UTF-8 for all locales.
sed 's:\([^ ]*\)\([ ][ ]*\)\(.*\)UTF-8,\(.*\):\1 \2\4,\3UTF-8:g' <config/locale.alias |
sed 's:^\([^ ][^ ][^ ][^ ][^ ][^ ][^ ][^ ]\)*(UTF-8):& :' >config/locale.alias~
mv config/locale.alias~ config/locale.alias
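
To see what the first sed expression of the "prefer non UTF-8" alternative does, the sketch below runs it on one sample line. The sample line is only an assumption about what an entry in config/locale.alias looks like (a name, whitespace, then a comma-separated locale list); compare it with the real file before relying on it.

```shell
# Hypothetical sample entry; the real config/locale.alias may differ.
sample='Czech cs_CZ.UTF-8,cs_CZ'

# Same expression as the first sed in the "prefer non UTF-8" alternative:
# it moves the UTF-8 locale behind the non-UTF-8 one.
result=$(printf '%s\n' "$sample" |
    sed 's:\([^ ]*\)\([ ][ ]*\)\(.*\)UTF-8,\(.*\):\1 \2\4,\3UTF-8:g')

printf '%s\n' "$result"
```

On this sample the locale list comes out reordered as cs_CZ,cs_CZ.UTF-8, so the non-UTF-8 locale is listed first.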

3) Grab Red Hat patches for the most problematic applications (mc, slang,
maybe ncurses).
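
Item 0 could look roughly like the sketch below. It is an illustration only: the file name 02locale and the value cs_CZ.UTF-8 are placeholders (the proposal leaves {number} open), and the sketch writes to a scratch directory rather than the real /etc/env.d.

```shell
# Scratch directory standing in for /etc/env.d (assumption for the demo).
envd_dir=$(mktemp -d)

# "02" is an arbitrary placeholder for {number}.
locale_file="$envd_dir/02locale"

# One VAR="value" assignment per line.
cat >"$locale_file" <<'EOF'
LANG="cs_CZ.UTF-8"
EOF

# Read the setting back without touching the current environment.
default_lang=$(sed -n 's/^LANG="\(.*\)"$/\1/p' "$locale_file")
printf 'system default locale would be: %s\n' "$default_lang"
```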



Setting language mini HOWTO:


LANG

LANG sets overall NLS support. Use LANG in the form cs_CZ, cs_CZ.UTF-8 or
cs_CZ.UTF-8@variant. Avoid cs (X does not like it), cs_CZ.utf8 (again,
X does not like it) and czech (X and some old apps, such as very old
gtk, do not like it). Test: run LANG={mysetup} xcalc. No warning should
appear.
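
Before running the xcalc test, the shape of a LANG value can be checked with a small shell function. check_lang_form is a hypothetical helper, and its regular expression is only one reading of the three recommended forms above; other codesets (for example ISO-8859-2) are valid locale names but are deliberately not covered here. It does not replace the real test with an X application.

```shell
# check_lang_form is a hypothetical helper, not part of any tool.
# It accepts only ll_CC, ll_CC.UTF-8 and ll_CC.UTF-8@variant.
check_lang_form() {
    # LC_ALL=C keeps the [a-z] and [A-Z] ranges strictly ASCII.
    printf '%s\n' "$1" |
        LC_ALL=C grep -Eq '^[a-z]{2,3}_[A-Z]{2}(\.UTF-8(@[A-Za-z0-9]+)?)?$'
}

for candidate in cs_CZ cs_CZ.UTF-8 cs_CZ.UTF-8@variant cs cs_CZ.utf8 czech; do
    if check_lang_form "$candidate"; then
        echo "ok:    $candidate"
    else
        echo "avoid: $candidate"
    fi
done
```

The first three candidates are reported as ok and the last three as avoid, matching the advice above.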

Note: localedef --list shows cs_CZ.utf8. That is because glibc ignores all
'-' characters and converts the charset to lowercase.
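
The normalization described in this note can be mimicked in shell. normalize_locale_name is a hypothetical helper that applies the rule as the note states it, to the charset part only; it is a sketch of the effect, not of how glibc implements it.

```shell
# Hypothetical helper: drop '-' from the charset part and lowercase it,
# leaving the language/territory part and any @modifier untouched.
normalize_locale_name() {
    name=$1
    case $name in
        *.*) ;;                                   # has a charset part
        *) printf '%s\n' "$name"; return 0 ;;     # nothing to normalize
    esac
    lang=${name%%.*}
    rest=${name#*.}
    case $rest in
        *@*) codeset=${rest%%@*}; modifier=@${rest#*@} ;;
        *)   codeset=$rest; modifier= ;;
    esac
    codeset=$(printf '%s\n' "$codeset" | sed 's/-//g' | tr '[:upper:]' '[:lower:]')
    printf '%s.%s%s\n' "$lang" "$codeset" "$modifier"
}

normalize_locale_name cs_CZ.UTF-8
```

This prints cs_CZ.utf8, the spelling that shows up in the locale listing, while cs_CZ.UTF-8 remains the form to put in LANG.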


LANGUAGE

LANGUAGE can optionally be set to a list of languages to search for
messages, e.g. cs_CZ:sk_SK. Applications translated to cs will use cs;
applications translated to sk but not cs will use sk.
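
The fallback order can be modelled with a few lines of shell. This is a simplified model of the behaviour described above, not gettext's actual lookup code; pick_translation and the lists of available translations are made up for the illustration.

```shell
# Hypothetical helper: return the first language from a LANGUAGE-style
# list (colon-separated) for which a translation is available.
pick_translation() {
    languages=$1      # e.g. "cs_CZ:sk_SK"
    available=$2      # space-separated list, e.g. "de sk"
    old_ifs=$IFS
    IFS=:
    set -- $languages # split the list on ':'
    IFS=$old_ifs
    for entry in "$@"; do
        # Try the full name first, then the bare language code.
        for candidate in "$entry" "${entry%%_*}"; do
            for have in $available; do
                if [ "$have" = "$candidate" ]; then
                    printf '%s\n' "$candidate"
                    return 0
                fi
            done
        done
    done
    return 1          # no translation found; messages stay untranslated
}

pick_translation cs_CZ:sk_SK "cs sk"   # application translated to both
pick_translation cs_CZ:sk_SK "de sk"   # application translated to sk, not cs
```

The first call prints cs and the second prints sk.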

Warning: Some display managers that can set the session language
(GDM) do not properly set or reset this variable, so it is inherited by
all locale setups. This can cause strange setups.


LC_{category} - Set it for special purposes, if you want to use
different locales for messages, ctypes, collating etc.

LC_ALL - No need to set it in the environment. It nearly duplicates LANG.
Nice to use in scripts, not in the environment.
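
A typical use of LC_ALL in a script is to force predictable behaviour for a single command, whatever the user's LANG is. LC_ALL takes precedence over the LC_{category} variables and over LANG. The sketch below assumes a POSIX sort; the input letters are arbitrary.

```shell
# Sort in the C locale (plain byte order) for this one command only.
# The user's LANG and LC_* settings stay untouched for everything else.
sorted=$(printf '%s\n' b A a B | LC_ALL=C sort)
printf '%s\n' "$sorted"
```

In the C locale the uppercase letters sort before the lowercase ones, so the result is A, B, a, b regardless of the locale the script was started in.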

--
Stanislav Brabec
http://www.penguin.cz/~utx, ICQ 116020046