Gentoo Archives: gentoo-dev

From: Alastair Tse <liquidx@g.o>
To: gentoo-dev@g.o
Subject: Re: [gentoo-dev] python-2.3.2-r2 changes
Date: Sat, 22 Nov 2003 23:34:02
Message-Id: 1069544021.2893.40.camel@huggins.eng.cam.ac.uk
In Reply to: Re: [gentoo-dev] python-2.3.2-r2 changes by Aron Griffis
1 On Sat, 2003-11-22 at 21:08, Aron Griffis wrote:
2 > Could you give a quick run-down on the difference between UCS2 and UCS4
3 > and what this change buys us? If it's been discussed in a thread which
4 > I missed, a pointer to the archived thread would be sufficient.
5 >
6
7 This was the initial thread:
8
9 http://article.gmane.org/gmane.linux.gentoo.devel/13751
10
11 Anyway, just a quick rundown. Gentoo's python-2.2 used UCS2 as the
12 default internal representation for unicode, which means a 16-bit
13 word for each unicode character. 2.2's UCS4 implementation had bugs,
14 though. Those have been fixed in 2.3, and UCS4 is now mature enough
15 to be used as the standard.
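
To see which representation a given interpreter was built with, a minimal sketch along these lines should work. The helper name unicode_build is made up for illustration; sys.maxunicode is the real attribute, and it is 0xFFFF on a UCS2 (narrow) build and 0x10FFFF on a UCS4 (wide) build. Note that Python 3.3 and later always report 0x10FFFF, so the distinction only matters on older interpreters.

```python
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Classify an interpreter as 'UCS2' (narrow) or 'UCS4' (wide).

    Hypothetical helper: the argument defaults to this interpreter's
    sys.maxunicode, but any value can be passed in for testing.
    """
    # A narrow build cannot represent a code point above 0xFFFF directly.
    return "UCS2" if maxunicode == 0xFFFF else "UCS4"


if __name__ == "__main__":
    print(unicode_build())
```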
16
17 UCS4 is a pretty popular choice for implementing unicode. For example,
18 glib has no UCS2 support and only supports UCS4. Another reason for
19 sticking with UCS4 is that it is recommended by the Python devs and is
20 being adopted by other distros like Red Hat (>=9) and Debian
21 (unstable)[3] for python-2.3. In fact, as far as I know, wchar_t on Linux
22 defaults to 4 bytes anyway.
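
The wchar_t size is easy to check from Python itself. This is only a sketch, assuming the ctypes module is available (it is in the standard library from Python 2.5 on); the wrapper name wchar_size is made up for illustration. It reports the size of the platform's C wchar_t, which I would expect to be 4 on Linux with glibc and 2 on Windows.

```python
import ctypes


def wchar_size():
    """Return the size in bytes of the platform's C wchar_t."""
    # ctypes.c_wchar maps to the C wchar_t type of the host platform.
    return ctypes.sizeof(ctypes.c_wchar)


if __name__ == "__main__":
    print(wchar_size())
```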
23
24 I initially had doubts about UCS4, but from my tests[1], unless an
25 application uses unicode extensively, the memory footprint doesn't grow.
26 For example, emerge took pretty much the same memory (actually 160k
27 less).
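
For a rough feel of where the extra memory would come from, here is a back-of-the-envelope sketch. The function names are made up, and it models only the raw character payload (2 bytes per 16-bit unit versus 4 bytes per code point), not Python's real object overhead, so it is not a substitute for the measurements in [1]. It assumes an interpreter where iterating a string yields whole code points (Python 3.3 or later).

```python
def ucs2_bytes(text):
    """Payload size with 16-bit units (hypothetical model).

    Code points above 0xFFFF need a surrogate pair, i.e. two units.
    """
    return sum(2 if ord(ch) <= 0xFFFF else 4 for ch in text)


def ucs4_bytes(text):
    """Payload size with one 32-bit unit per code point."""
    return 4 * len(text)


if __name__ == "__main__":
    sample = "emerge"
    print(ucs2_bytes(sample), ucs4_bytes(sample))
```

The payload doubles for ordinary text, so the total only grows noticeably when a program holds a lot of unicode strings at once.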
28
29 So in the long run, I think aligning ourselves with UCS4 support in
30 Python will save us hassles in the future. For a more professional
31 (and detailed) treatment of the subject, you might like to read PEP 261
32 [2].
33
34 Hope that answers your questions.
35
36 Cheers,
37
38 Alastair
39
40
41 [1] http://article.gmane.org/gmane.linux.gentoo.devel/13842/match=python
42 [2]
43 http://www.python.org/peps/pep-0261.html
44 [3] http://mail.python.org/pipermail/python-dev/2003-June/036458.html
45 > Thanks,
46 > Aron
47 --
48 Alastair 'liquidx' Tse
49 >> Gentoo Developer
50 >> http://www.liquidx.net/ | http://dev.gentoo.org/~liquidx/
