Gentoo Archives: gentoo-user

From: Matthias Bethke <matthias@×××××××.de>
To: gentoo-user@l.g.o
Subject: [gentoo-user] UTF-8 troubles
Date: Thu, 30 Nov 2006 23:10:01
Message-Id: 20061130224413.GH21644@huxley
I switched a few systems to all-UTF-8 a while ago, and while it's
generally a big improvement, a few apps are playing up. Pretty common
apps, that is, most notably tin and centericq, so I think it's probably
my problem.
Thing is, tin seems to decode messages correctly and tries to show
umlauts. However, I only see the lowercase ä, ö and ü; the uppercase
versions and the German "sharp s" (ß) are garbled. The latter, for
example, is displayed as a diamond with a question mark inside
(supposedly indicating "invalid UTF sequence") followed by "~_" (0x7e
0x5f; the correct UTF-8 sequence is 0xc3 0x9f). Centericq is similar; I
see all umlauts I type in the input area as two question marks, but the
lowercase ones get transmitted correctly and I can read others'
lowercase umlauts. No capitals, no ß either.
The only distinction I could make out between the sets of characters that
are displayed correctly and those that aren't is that the latter contain
UTF-8 bytes that would not be printable when interpreted as ISO-8859-x,
so my hypothesis is that something in between the app's text output and
the terminal eats bytes unless they're deemed "printable".
The affected programs all seem to use ncurses. I couldn't find anything
in terminfo that could be causing this, but then I don't have much of a
clue about terminfo in the first place. Google doesn't seem to have
heard of the problem. Any ideas where I could look?
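
The "printable in ISO-8859-x" hypothesis above can be checked against the
actual bytes. What follows is a minimal Python sketch, not a diagnosis: the
names WORKING, GARBLED, is_latin1_printable and tilde_notation are made up
for illustration, and the tilde notation for bytes 0x80-0x9f is only an
assumption about how the display layer renders a byte it treats as a
control character.

```python
# Characters from the report, written as escapes so the file encoding
# does not matter.
WORKING = "\u00e4\u00f6\u00fc"          # ä ö ü  (displayed correctly)
GARBLED = "\u00c4\u00d6\u00dc\u00df"    # Ä Ö Ü ß (displayed garbled)


def is_latin1_printable(byte: int) -> bool:
    """True if the byte is a printable character when read as ISO-8859-1.

    0x00-0x1f and 0x7f are C0 controls; 0x80-0x9f are C1 controls.
    """
    return 0x20 <= byte <= 0x7E or byte >= 0xA0


def tilde_notation(byte: int) -> str:
    """Hypothetical helper: render a C1 byte as '~' plus a letter.

    Assumption: the byte is shown the way control characters are shown in
    caret notation, but with a tilde for the 0x80-0x9f range.
    """
    if 0x80 <= byte <= 0x9F:
        return "~" + chr(byte - 0x80 + 0x40)
    return chr(byte)


for ch in WORKING + GARBLED:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    printable = [is_latin1_printable(b) for b in encoded]
    print(f"U+{ord(ch):04X}  bytes: {hex_bytes}  latin1-printable: {printable}")
```

For the characters in question, the second byte of ä, ö and ü (a4, b6, bc)
is printable as ISO-8859-1, while the second byte of Ä, Ö, Ü and ß (84, 96,
9c, 9f) falls in the 0x80-0x9f control range. Under the assumed notation,
0x9f comes out as "~_", which is consistent with what tin shows for ß.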

cheers!
Matthias
--
I prefer encrypted and signed messages. KeyID: FAC37665
Fingerprint: 8C16 3F0A A6FC DF0D 19B0 8DEF 48D9 1700 FAC3 7665

Replies

Subject Author
Re: [gentoo-user] UTF-8 troubles "Bo Ørsted Andresen" <bo.andresen@××××.dk>