On Sat, 2003-11-22 at 21:08, Aron Griffis wrote:
> Could you give a quick run-down on the difference between UCS2 and UCS4
> and what this change buys us? If it's been discussed in a thread which
> I missed, a pointer to the archived thread would be sufficient.
>

This was the initial thread:

http://article.gmane.org/gmane.linux.gentoo.devel/13751
|
Anyway, just a quick rundown. Basically, Gentoo's python-2.2 used UCS2
as the default internal representation for unicode. That means using a
16-bit word for each unicode character. UCS4 uses a 32-bit word instead,
but there were bugs in 2.2's UCS4 implementation. They have been fixed
in 2.3, and UCS4 is now mature enough to be used as the standard.
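
If you want to check which representation a given interpreter was built
with, sys.maxunicode should tell you. This is only a minimal sketch,
assuming an interpreter that exposes sys.maxunicode; the helper name
unicode_build is made up for illustration:

```python
import sys


def unicode_build(maxunicode=None):
    """Name the internal width implied by a sys.maxunicode value.

    Hypothetical helper, not part of Python or portage.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow (UCS2) builds top out at U+FFFF,
    # wide (UCS4) builds reach U+10FFFF.
    if maxunicode <= 0xFFFF:
        return "UCS2"
    return "UCS4"


print(unicode_build())
```

A UCS2 build should report UCS2 here, and a UCS4 build UCS4.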
|
UCS4 is a pretty popular standard for implementing unicode. For example,
glib has no UCS2 support and only supports UCS4. Another reason for
sticking with UCS4 is that it is recommended by the Python devs and is
being adopted by other distros like Red Hat (>=9) and Debian
(unstable)[3] for python-2.3. In fact, as far as I know, wchar_t on Linux
defaults to 4 bytes anyway.
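
The wchar_t size is easy to check for yourself on a given box. A small
sketch, assuming the ctypes module is available (ctypes.c_wchar maps to
the platform's C wchar_t); the variable name wchar_size is just for
illustration:

```python
import ctypes

# ctypes.c_wchar corresponds to the C wchar_t type on this platform.
wchar_size = ctypes.sizeof(ctypes.c_wchar)

print("wchar_t is %d bytes" % wchar_size)
```

I would expect 4 on a typical glibc-based Linux system, but the value
depends on the platform and compiler.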
|
I initially had doubts about UCS4, but from my tests[1], unless an
application uses unicode extensively, the memory footprint doesn't grow.
For example, emerge took pretty much the same memory (actually 160k
less).
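
To put rough numbers on the per-character cost, here is a
back-of-the-envelope sketch. It is not the test from [1]. It only counts
character payload, and it uses the utf-16-le and utf-32-le codecs as
stand-ins for UCS2 and UCS4 storage, which is an assumption that holds
for text inside the Basic Multilingual Plane and needs a Python that
ships both codecs. The helper name payload_bytes is made up:

```python
def payload_bytes(text):
    """Return (narrow_bytes, wide_bytes) for the characters in text.

    Hypothetical helper: counts payload only, ignoring per-object overhead.
    """
    # 2 bytes per BMP character, as in a UCS2 build.
    narrow = len(text.encode("utf-16-le"))
    # 4 bytes per character, as in a UCS4 build.
    wide = len(text.encode("utf-32-le"))
    return narrow, wide


narrow, wide = payload_bytes(u"emerge")
print("UCS2: %d bytes, UCS4: %d bytes" % (narrow, wide))
```

For the six characters of "emerge" that is 12 bytes versus 24. Only the
unicode strings themselves double; plain byte strings and everything
else stay the same size, so an application that barely touches unicode
should see little difference.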
|
So I think aligning ourselves with UCS4 support in Python will save us
hassles in the long run. For a more professional (and detailed)
treatment of the subject, you might like to read PEP 261 [2].
|
Hope that answers your questions.

Cheers,

Alastair


[1] http://article.gmane.org/gmane.linux.gentoo.devel/13842/match=python
[2] http://www.python.org/peps/pep-0261.html
[3] http://mail.python.org/pipermail/python-dev/2003-June/036458.html
> Thanks,
> Aron
--
Alastair 'liquidx' Tse
>> Gentoo Developer
>> http://www.liquidx.net/ | http://dev.gentoo.org/~liquidx/