Gentoo Archives: gentoo-dev

From: Alastair Tse <liquidx@g.o>
To: gentoo-dev@g.o
Subject: Re: [gentoo-dev] python-2.3.2 testing required
Date: Thu, 13 Nov 2003 09:51:50
Message-Id: 1068717088.25166.47.camel@huggins.eng.cam.ac.uk
In Reply to: Re: [gentoo-dev] python-2.3.2 testing required by Toby Dickenson
1 On Thu, 2003-11-13 at 09:10, Toby Dickenson wrote:
2 > I've not used ucs4 python yet, but it is one of the things I was looking
3 > forward to in version 2.3. It would be much nicer to leave ucs2 behind.
4
5 I would like to move away from UCS2 as well, but I'd like some arguments
6 for why this is a good thing beyond "it's more compatible".
7
8 > If ucs4 strings were the only cause of that difference, supybot would need to
9 > be storing 2.5 million unicode characters. I guess that isn't likely.
10 > Excluding bugs, I don't see any reason why a program that doesn't use any
11 > unicode objects would use more memory when running on a ucs4 python
12 > interpreter.
13
14 All unicode string objects would be stored in UCS4 instead of
15 UCS2. Things like XML parsers use unicode string objects to store
16 their representations, because XML is Unicode text and UTF-8 is its
17 default encoding. Those sorts of applications may see a more
18 significant growth in memory footprint.
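
As a rough check on the arithmetic in the quoted paragraph: a UCS2 build stores each unicode character in 2 bytes and a UCS4 build in 4, so the extra cost is about 2 bytes per character, and 2.5 million characters would account for roughly 5 MB. The sketch below only illustrates that calculation. The helper names are my own, and it ignores per-object overhead.

```python
import sys

# Storage per character in the interpreter's internal unicode buffer.
UCS2_BYTES_PER_CHAR = 2
UCS4_BYTES_PER_CHAR = 4


def extra_bytes_for_ucs4(num_chars):
    """Extra payload bytes when num_chars characters move from UCS2 to UCS4."""
    return num_chars * (UCS4_BYTES_PER_CHAR - UCS2_BYTES_PER_CHAR)


def build_width():
    """Guess the build type from sys.maxunicode.

    A narrow (UCS2) build reports 0xFFFF; a wide (UCS4) build reports
    0x10FFFF. Python 3.3 and later always report 0x10FFFF, because the
    narrow/wide distinction no longer exists there.
    """
    return "ucs2" if sys.maxunicode <= 0xFFFF else "ucs4"


print("this interpreter looks like: " + build_width())
print("extra bytes for 2.5 million characters: %d" % extra_bytes_for_ucs4(2500000))
```
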
19
20 > > But note that this example is not scientific
21 > > because the machines were different in kernel version, compiler and
22 > > compiler optimisations.
23 >
24 > Those reasons sound much more plausible to me. Does anyone have a more
25 > scientific comparison of the effect of the ucs4 option on python?
26
27 I'd like to do that some time. Otherwise, someone with a faster machine
28 than mine may want to try it. It would be interesting to see what the
29 real impact is. If the memory footprint doesn't grow as much as I claimed
30 it does, then that is a strong argument for moving to UCS4 as the default.
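
A minimal sketch of one way to start such a measurement, assuming a CPython interpreter that provides sys.getsizeof (2.6 or later, so not the 2.3.2 under discussion). bytes_per_char is a hypothetical helper of mine. It compares two strings of different lengths so that the fixed object header cancels out, leaving the marginal cost per character.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate the marginal storage cost of one character.

    ch must be a single-character unicode string. The sizes of two
    strings built from it are subtracted so the fixed per-object
    overhead cancels, and the difference is divided by the number of
    extra characters.
    """
    shorter = ch * n
    longer = ch * (2 * n)
    return (sys.getsizeof(longer) - sys.getsizeof(shorter)) / float(n)


# One ASCII character and one non-ASCII character from the basic plane.
for label, ch in (("ASCII 'a'", u"a"), ("EURO SIGN", u"\u20ac")):
    print("%s: about %.1f bytes per character" % (label, bytes_per_char(ch)))
```

On a Python 2 build both figures should match the build's character width. On Python 3.3 and later the figure depends on the widest character in each string, so the two lines can differ. This only measures string storage; it says nothing about the total footprint of a real application.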
31
32 The reason UCS2 is still the default in the masked python-2.3.2 is
33 that (a) not many people currently use anything that requires
34 more than UCS2 and (b) UCS4 does take up more memory than
35 UCS2. How much more, I'm not certain.
36
37 For instance, how much more memory would portage take if it doesn't use
38 unicode strings at all?
39
40 Cheers,
41 --
42 Alastair 'liquidx' Tse
43 >> Gentoo Developer
44 >> http://www.liquidx.net/ | http://dev.gentoo.org/~liquidx/

Attachments

File name MIME type
signature.asc application/pgp-signature