Gentoo Archives: gentoo-user

From: "Sebastian Günther" <samson@××××××××××××××××.de>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Kernel update messed up console encoding
Date: Sat, 28 Feb 2009 18:48:09
Message-Id: 20090228184805.GA7841@marvin.heimnetz.local
In Reply to: Re: [gentoo-user] Kernel update messed up console encoding by "Florian v. Savigny"
1 * Florian v. Savigny (lorian@××××××××.de) [28.02.09 18:39]:
2 >
3 > Hi Sebastian,
4 >
5 > > > But Emacs displays the lower-case umlauts followed by a space
6 > > > etc. etc. ...
7 >
8 > > what does file say about the offending files?
9 >
10 > I was not actually talking about files when I mentioned Emacs, but
11 > what I see when I *type* into Emacs (such as in this mail
12 > message). But in case you mean what that produces when I save the
13 > result of what I typed into a file, I ran a few tests, and the results
14 > were mixed:
15 >
16 > For the 3 lower-case umlauts, file reports UTF-8, consistent with the
17 > number of bytes (i.e. the file length): 3 characters, 6 bytes. The hex
18 > representation of the 6 bytes is: c3 a4 c3 b6 c3 bc.
19 >
20 > For the three upper-case umlauts and for the eszett, file reports
21 > iso-8859, also consistent with the number of bytes: 3 characters, 3
22 > bytes. The code position is, however, definitely wrong: it is always
23 > hex c3 (which would be the upper-case A tilde in iso-8859-1, and four
24 > different letters can hardly have the same code position.)
25 >
26 > To me this looks as if Emacs puts the first half of the byte sequences
27 > (always the hex c3) into the buffer, while trying to interpret the
28 > other half (see list below) as a command: it will say something like
29 > "\204 is undefined". I am quite certain \nnn is an octal number.
30 >
31 > eszett: \237 (hex 9f, dec 159)
32 > A uml: \204 (hex 84, dec 132)
33 > O uml: \226 (hex 96, dec 150)
34 > U uml: \234 (hex 9c, dec 156)
35 >
36 > If I am right, the keys thus send:
37 >
38 > eszett: c3 9f
39 > A uml: c3 84
40 > O uml: c3 96
41 > U uml: c3 9c
42 > a uml: c3 a4
43 > o uml: c3 b6
44 > u uml: c3 bc
45 >
46 > I would assume that these sequences are the UTF-8 representation of
47 > the respective characters (but I don't have a table to figure that
48 > out).
49 >
50 > Sorry if the whole thing was difficult to follow. I should perhaps have
51 > mentioned that for the upper-case umlauts and the eszett, Emacs not
52 > only complains, but also inputs an "unknown" character into the
53 > buffer, represented by a '?' in reverse video. That's apparently the
54 > hex c3 byte.
55 >
56 That is a problem with the console font, since the console can't display
57 it with cp1250...
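
The byte pairs in the quoted list can be checked against a UTF-8 encoder. The following is a minimal sketch, assuming Python 3; the labels are the ones used in the list above, and the helper names utf8_hex and second_byte_octal are made up for illustration. It prints the two UTF-8 bytes for each character and the second byte in the \nnn octal form that Emacs reports. Note that the u umlaut comes out as c3 bc.

```python
# Labels follow the list in the quoted message; the characters are written
# as escapes so the snippet does not depend on the source file's encoding.
CHARS = {
    "eszett": "\u00df",
    "A uml": "\u00c4",
    "O uml": "\u00d6",
    "U uml": "\u00dc",
    "a uml": "\u00e4",
    "o uml": "\u00f6",
    "u uml": "\u00fc",
}


def utf8_hex(ch):
    """Return the UTF-8 bytes of ch as space-separated hex, e.g. 'c3 a4'."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


def second_byte_octal(ch):
    """Return the second UTF-8 byte of ch as a backslash plus three octal digits."""
    return "\\{:03o}".format(ch.encode("utf-8")[1])


# Only ASCII is printed, so the output is safe on a misconfigured console.
for label, ch in CHARS.items():
    print(f"{label}: {utf8_hex(ch)}  (second byte {second_byte_octal(ch)})")
```

All seven characters start with c3 in UTF-8, which fits the observation that the first byte lands in the buffer while the second is treated as a separate key.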
58
59
60 > > Emacs always uses the encoding of the file, whereas a redirect
61 > > uses the locale, iirc.
62 >
63 > I know; normally it can figure it out - I think this ability is not
64 > compromised in any way (I can e.g. open an XML file encoded in utf-8,
65 > and will see "11u" in the mode line). Also, please note that under X,
66 > Emacs behaves completely as before.
67 >
68 > By "redirect", you mean shell redirection? Does that do any character
69 > conversion?
70
71 yes.
72
73 echo "äöüÄÖÜß" > console.test
74 then write the same in emacs and save as emacs.test.
75
76 And then compare the output of
77
78 file console.test
79 and
80 file emacs.test
81
82 If there are differences, the problem lies somewhere here.
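
If file is not at hand, the same comparison can be approximated by checking whether the bytes of each file decode as UTF-8. This is only a rough sketch, assuming Python 3; classify and classify_file are hypothetical helper names, and the real file utility uses more elaborate heuristics. The two in-memory samples stand in for a correctly encoded file and for one that contains only the stray c3 bytes.

```python
def classify(data):
    """Classify raw bytes as 'ascii', 'utf-8', or 'not utf-8'."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Bytes above 0x7f that are not valid UTF-8; possibly an ISO-8859 file.
        return "not utf-8"


def classify_file(path):
    """Read a file in binary mode and classify its contents."""
    with open(path, "rb") as fh:
        return classify(fh.read())


# In-memory samples standing in for the two test files.
utf8_sample = "\u00e4\u00f6\u00fc\n".encode("utf-8")  # lower-case umlauts as UTF-8
broken_sample = b"\xc3\xc3\xc3\n"                     # only the lead bytes survived

print(classify(utf8_sample))
print(classify(broken_sample))
```

With the real files, classify_file("console.test") and classify_file("emacs.test") would be compared in the same way.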
83
84 >
85 > > I assume you know the options->mule menu in emacs, there is a lot to
86 > > help with encoding issues...
87 >
88 > Yes, I know, but I don't see how set-input-method would fix this. Do you?
89 >
90 No, but setting the coding system for saving the file
91 (set-buffer-file-coding-system, C-x RET f) might help to achieve the
91 right encoding.
92
93 > > > As to the locale, where can I look that up ... ?
94 > > .bashrc
95 >
96 > Neither ~/.bashrc nor /etc/bash/bashrc contain any locale setting
97 > ... hmm.
98
99 locale
100 should show it to you
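
Which variable actually decides the character encoding follows the usual POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG, and an empty value counts as unset. A minimal sketch of that lookup, assuming Python 3; effective_ctype and charset_of are hypothetical helper names, not part of any library.

```python
import os


def effective_ctype(env):
    """Return (variable, value) for the setting that determines LC_CTYPE.

    Precedence is LC_ALL, then LC_CTYPE, then LANG; empty values are skipped.
    If none is set, the default "C" locale applies.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    return None, "C"


def charset_of(locale_name):
    """Return the codeset part of e.g. 'de_DE.UTF-8@euro', or None if absent."""
    if "." not in locale_name:
        return None
    codeset = locale_name.split(".", 1)[1]
    return codeset.split("@", 1)[0]


name, value = effective_ctype(os.environ)
print(name, value, charset_of(value))
```

A value without a codeset, such as de_DE@euro, leaves the encoding to the locale's default, which on older setups is often not UTF-8.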
101
102 >
103 > But very frankly, would the solution not focus on the kernel, at least
104 > partly? As I said, I can reverse the phenomenon by simply booting the
105 > old kernel!
106 >
107 > Does nobody know where the kernel controls what the keys of the
108 > console keyboard send when pressed?
109 >
110 > (BTW, KEYMAP="de-latin1-nodeadkeys", in /etc/conf.d/keymaps.)
111
112 Exactly there.
113
114 >
115 > Regards, Florian
116 >
117 >
118 >
119
120 Sebastian
121
122 --
123 " Religion ist das Opium des Volkes. " Karl Marx
124
125 SEB@STI@N GÜNTHER mailto:samson@××××××××××××××××.de
