Gentoo Archives: gentoo-user

From:	"Florian v. Savigny" <lorian@××××××××.de>
To:	gentoo-user@l.g.o
Subject:	Re: [gentoo-user] Kernel update messed up console encoding
Date:	Sat, 28 Feb 2009 17:38:59
Message-Id:	`0ML2xA-1LdT9M2LrY-0006L9@mrelayeu.kundenserver.de`
In Reply to:	Re: [gentoo-user] Kernel update messed up console encoding by "Sebastian Günther"

1	Hi Sebastian,
2
3	> > But Emacs displays the lower-case umlauts followed by a space
4	> > etc. etc. ...
5
6	> what does file say about the offending files?
7
8	I was not actually talking about files when I mentioned Emacs, but
9	what I see when I type into Emacs (such as in this mail
10	message). But in case you mean what that produces when I save the
11	result of what I typed into a file, I ran a few tests, and the results
12	were mixed:
13
14	For the 3 lower-case umlauts, file reports UTF-8, consistent with the
15	number of bytes (i.e. the file length): 3 characters, 6 bytes. The hex
16	representation of the 6 bytes is: c3 a4 c3 b6 c3 bc.
17
18	For the three upper-case umlauts and for the eszett, file reports
19	iso-8859, also consistent with the number of bytes: 3 characters, 3
20	bytes. The code position is, however, definitely wrong: it is always
21	hex c3 (which would be the upper-case A tilde in iso-8859-1, and four
22	different letters can hardly have the same code position.)
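
The two results from file can be cross-checked with a short Python sketch. It assumes the lower-case file ends in c3 bc (the UTF-8 encoding of u umlaut) and that the upper-case file holds nothing but three c3 bytes; the helper name is_valid_utf8 is made up for this illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Assumed file contents: three complete two-byte sequences ...
lower = bytes.fromhex("c3a4c3b6c3bc")
# ... versus three lead bytes with their continuation bytes missing.
stray = bytes.fromhex("c3c3c3")

print(is_valid_utf8(lower), lower.decode("utf-8"))
print(is_valid_utf8(stray), stray.decode("latin-1"))
```

A lone c3 is a UTF-8 lead byte without its continuation byte, so the second sequence is not valid UTF-8, which would explain why file falls back to iso-8859 for it.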
23
24	To me this looks as if Emacs puts the first half of the byte sequences
25	(always the hex c3) into the buffer, while trying to interpret the
26	other half (see list below) as a command: it will say something like
27	"\204 is undefined". I am quite certain \nnn is an octal number.
28
29	eszett: \237 (hex 9f, dec 159)
30	A uml: \204 (hex 84, dec 132)
31	O uml: \226 (hex 96, dec 150)
32	U uml: \234 (hex 9c, dec 156)
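
The octal reading of these escapes can be checked with a few lines of Python. This is only a sketch: it assumes the \nnn values are octal, as guessed above, and the function name octal_escape_value is invented for the example.

```python
def octal_escape_value(escape: str) -> int:
    """Convert an escape such as '\\204' to its integer value, reading the digits as octal."""
    return int(escape.lstrip("\\"), 8)

# The four escapes Emacs complained about, as listed above.
escapes = {"eszett": "\\237", "A uml": "\\204", "O uml": "\\226", "U uml": "\\234"}

for name, escape in escapes.items():
    value = octal_escape_value(escape)
    print(f"{name}: {escape} (hex {value:02x}, dec {value})")
```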
33
34	If I am right, the keys thus send:
35
36	eszett: c3 9f
37	A uml: c3 84
38	O uml: c3 96
39	U uml: c3 9c
40	a uml: c3 a4
41	o uml: c3 b6
42	u uml: c3 bc
43
44	I would assume that these sequences are the UTF-8 representation of
45	the respective characters (but I don't have a table to figure that
46	out).
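
In place of a table, the UTF-8 byte sequences can be generated directly. The following Python sketch prints them for the seven characters in question; the labels and the helper name utf8_hex are chosen for this example only.

```python
def utf8_hex(char: str) -> str:
    """Return the UTF-8 encoding of a character as space-separated hex bytes."""
    return " ".join(f"{byte:02x}" for byte in char.encode("utf-8"))

# The characters are written as Unicode escapes so the script itself stays plain ASCII.
characters = {
    "eszett": "\u00df",
    "A uml": "\u00c4",
    "O uml": "\u00d6",
    "U uml": "\u00dc",
    "a uml": "\u00e4",
    "o uml": "\u00f6",
    "u uml": "\u00fc",
}

for name, char in characters.items():
    print(f"{name}: {utf8_hex(char)}")
```

Every one of these characters encodes as two bytes beginning with c3, which fits the assumption that the console keys now send UTF-8.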
47
48	Sorry if the whole thing was difficult to follow. I should perhaps have
49	mentioned that for the upper-case umlauts and the eszett, Emacs not
50	only complains, but also inputs an "unknown" character into the
51	buffer, represented by a '?' in reverse video. That's apparently the
52	hex c3 byte.
53
54	> Emacs always uses the encoding of the file, whereas a redirect
55	> uses the locale, iirc.
56
57	I know; normally it can figure it out - I think this ability is not
58	compromised in any way (I can e.g. open an XML file encoded in utf-8,
59	and will see "11u" in the mode line). Also, please note that under X,
60	Emacs behaves completely as before.
61
62	By "redirect", you mean shell redirection? Does that do any character
63	conversion?
64
65	> I assume you know the options->mule menu in emacs, there is a lot to
66	> help with encoding issues...
67
68	Yes, I know, but I don't see how set-input-method would fix this. Do you?
69
70	> > As to the locale, where can I look that up ... ?
71	> .bashrc
72
73	Neither ~/.bashrc nor /etc/bash/bashrc contains any locale setting
74	... hmm.
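
Whatever file sets it, the locale that is actually in effect can be read from the environment of the running shell. A minimal Python sketch, assuming the usual POSIX precedence (LC_ALL over LC_CTYPE over LANG); the function name effective_ctype is hypothetical, and where the variables get set on a given system (on Gentoo possibly a file under /etc/env.d) is an assumption, not something verified here.

```python
import os

def effective_ctype(env):
    """Return (variable, value) for the first non-empty setting that decides
    the character type locale, or (None, "C") if none is set."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    # With nothing set, programs fall back to a default locale, usually "C"/"POSIX".
    return None, "C"

print(effective_ctype(os.environ))
```

Running this once on the console and once in a terminal under X would show whether the two environments differ.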
75
76	But very frankly, would the solution not focus on the kernel, at least
77	partly? As I said, I can reverse the phenomenon by simply booting the
78	old kernel!
79
80	Does nobody know where the kernel controls what the keys of the
81	console keyboard send when pressed?
82
83	(BTW, KEYMAP="de-latin1-nodeadkeys", in /etc/conf.d/keymaps.)
84
85	Regards, Florian

Replies

Subject	Author
Re: [gentoo-user] Kernel update messed up console encoding	"Sebastian Günther" <samson@××××××××××××××××.de>
