> I don't know if this has improved over the years, but my initial |
> experience with unicode was rather negative. The fact that text |
> files were twice as large wasn't a major problem in itself. The |
> real showstopper was that importing text files into spreadsheets |
> and text-editors and word processors failed miserably.
> |
> I looked at a unicode text file with a binary viewer. It turns out |
> that a simple text string like "1234" was actually... |
> "1" binary-zero "2" binary-zero "3" binary-zero "4" binary-zero, etc.
|
That's (as someone has already pointed out) UTF-16, which is the default for |
some Windows tools (but understood on Linux too). (There is also UTF-32, where
every character is 4 bytes wide, but I've never seen it in the wild.)
|
UTF-8 is normally used on Linux (and ASCII characters are encoded exactly the
same there, one byte each); even for "long characters" outside the ASCII range,
spreadsheets and word processors should no longer have a problem.
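
A minimal Python sketch of the difference, assuming Python 3 and the
little-endian, BOM-less variant of UTF-16 ("utf-16-le"); the variable names
are made up for illustration:

```python
# Encode the same short strings in the three encodings mentioned above.
text = "1234"

utf16 = text.encode("utf-16-le")   # 2 bytes per ASCII character
utf32 = text.encode("utf-32-le")   # 4 bytes per character
utf8 = text.encode("utf-8")        # 1 byte per ASCII character

print(utf16)        # b'1\x002\x003\x004\x00'  ("1" zero "2" zero ...)
print(len(utf32))   # 16
print(utf8)         # b'1234'  (identical to plain ASCII)

# A character outside the ASCII range takes more than one byte in UTF-8.
umlaut_utf8 = "ü".encode("utf-8")
print(umlaut_utf8)  # b'\xc3\xbc'
```

A file written by a Windows tool would usually also start with a byte-order
mark (FF FE), which this sketch leaves out.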
|
-- |
Andreas K. Hüttel |
dilfridge@g.o |
Gentoo Linux developer |
(council, qa, toolchain, base-system, perl, libreoffice) |