Gentoo Archives: gentoo-user

From:	Matthias Bethke <matthias@×××××××.de>
To:	gentoo-user@l.g.o
Subject:	[gentoo-user] UTF-8 troubles
Date:	Thu, 30 Nov 2006 23:10:01
Message-Id:	`20061130224413.GH21644@huxley`

I switched a few systems to all-UTF-8 a while ago, and while it's
generally a big improvement, a few apps are playing up. Pretty common
apps, that is, most notably tin and centericq, so I think it's probably
my problem.
Thing is, tin seems to decode messages correctly and tries to show
umlauts. However, I only see the lowercase ä, ö and ü; the uppercase
versions and the German "sharp s" (ß) are garbled. The latter, for
example, is displayed as a diamond with a question mark inside
(supposedly indicating "invalid UTF sequence") followed by "~_" (0x7e
0x5f; the correct UTF-8 sequence is 0xc3 0x9f). Centericq is similar; I
see all umlauts I type in the input area as two question marks, but the
lowercase ones get transmitted correctly and I can read others'
lowercase umlauts. No capitals, no ß either.
The only distinction I could make out between the sets of characters that
are displayed correctly and those that aren't is that the latter contain
UTF-8 bytes that would not be printable when interpreted as ISO-8859-x,
so my hypothesis is that something in-between the app's text output and
the terminal eats bytes unless they're deemed "printable".
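
The byte pattern behind that hypothesis can be checked with a small
sketch. It encodes the characters named above as UTF-8 and flags any byte
in the range 0x80-0x9f, which ISO-8859-x reserves for C1 control codes
rather than printable characters. The names c1_bytes, DISPLAYED_OK and
GARBLED are illustrative and not part of tin, centericq or ncurses, and
the sketch only shows that the two character sets differ in this way; it
does not identify which layer drops the bytes.

```python
import unicodedata

# Characters as reported in the post: these displayed correctly ...
DISPLAYED_OK = "äöü"
# ... and these were garbled.
GARBLED = "ÄÖÜß"


def c1_bytes(ch):
    """Return the UTF-8 bytes of ch that fall in 0x80-0x9f.

    ISO-8859-x assigns that range to C1 control codes, so a filter that
    assumes a Latin-1 style charset would not consider them printable.
    """
    return [b for b in ch.encode("utf-8") if 0x80 <= b <= 0x9F]


for ch in DISPLAYED_OK + GARBLED:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"0x{b:02x}" for b in encoded)
    verdict = "contains C1 byte" if c1_bytes(ch) else "no C1 byte"
    # Print only ASCII so the output is readable even on a terminal
    # that mishandles UTF-8.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {hex_bytes} ({verdict})")
```

With these seven characters, only the garbled ones have a second byte in
the C1 range, which is consistent with the hypothesis above.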
The affected programs all seem to use ncurses. I couldn't find anything
in terminfo that could be causing this, but then I don't have much of a
clue about terminfo in the first place. Google doesn't seem to have
heard of the problem. Any ideas where I could look?

cheers!
Matthias
--
I prefer encrypted and signed messages. KeyID: FAC37665
Fingerprint: 8C16 3F0A A6FC DF0D 19B0 8DEF 48D9 1700 FAC3 7665

Replies

Subject	Author
Re: [gentoo-user] UTF-8 troubles	"Bo Ørsted Andresen" <bo.andresen@××××.dk>
