On Thu, 2003-11-13 at 09:10, Toby Dickenson wrote:
> I've not used ucs4 python yet, but it is one of the things I was looking
> forward to in version 2.3. It would be much nicer to leave ucs2 behind.

I would like to move away from UCS2 as well, but I'd like some arguments
to say why this is a good thing, apart from "it's more compatible".

> If ucs4 strings were the only cause of that difference, supybot would need to
> be storing 2.5 million unicode characters. I guess that isn't likely.
> Excluding bugs, I don't see any reason why a program that doesn't use any
> unicode objects would use more memory when running on a ucs4 python
> interpreter.

On a UCS4 build, every unicode string object is stored as UCS4 instead
of UCS2, so each code unit takes 4 bytes rather than 2. Things like XML
parsers use unicode string objects to store their representations
because UTF-8 is the default encoding for XML. Those sorts of
applications may see more significant growth in memory footprint.
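
As a rough check on the arithmetic, here is a minimal sketch. It uses the
UTF-16 and UTF-32 codecs as stand-ins for the UCS2 and UCS4 internal
representations, which is an assumption on my part: it counts only the raw
character payload, ignores per-object overhead, and only matches UCS2 for
text inside the Basic Multilingual Plane. The helper names and the sample
text are hypothetical.

```python
# Stand-in for the interpreter's internal storage: count raw payload bytes
# only. Assumes BMP-only text, where UTF-16 matches UCS2 unit for unit.

def payload_bytes(text, codec):
    """Bytes needed for the character data of `text` in the given codec."""
    return len(text.encode(codec))

def extra_bytes_for_ucs4(n_chars):
    """Extra payload when n_chars BMP characters go from 2 to 4 bytes each."""
    return n_chars * (4 - 2)

sample = "supybot" * 1000  # hypothetical sample text, 7000 characters
ucs2 = payload_bytes(sample, "utf-16-le")  # 2 bytes per character, no BOM
ucs4 = payload_bytes(sample, "utf-32-le")  # 4 bytes per character, no BOM
print(ucs2, ucs4, ucs4 - ucs2)

# The 2.5 million characters mentioned in the quoted text:
print(extra_bytes_for_ucs4(2500000))
```

By this estimate, 2.5 million characters correspond to about 5 MB of extra
payload, before any allocator or object overhead is taken into account.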

> > But note that this example is not scientific
> > because the machines were different in kernel version, compiler and
> > compiler optimisations.
>
> Those reasons sound much more plausible to me. Does anyone have a more
> scientific comparison of the effect of the ucs4 option on python?

I'd like to do that some time. Otherwise, someone with a faster machine
than mine may want to try it. It would be interesting to see what the
real impact is. If the memory footprint doesn't grow as much as I claim
it does, then that is a powerful argument for moving to UCS4 as the default.
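
A first step towards a more careful comparison would be to confirm which
build each machine is actually running. A minimal sketch, assuming
sys.maxunicode is used as the indicator: it is 0xFFFF on a UCS2 (narrow)
build and 0x10FFFF on a UCS4 (wide) build. Interpreters from Python 3.3
onwards always report the wide value regardless of how strings are stored,
so this only distinguishes the 2.x builds discussed here. The function
names and the sample data are hypothetical.

```python
import sys

def unicode_code_unit_size():
    """Return 2 on a narrow (UCS2) build and 4 on a wide (UCS4) build."""
    # sys.maxunicode is 0xFFFF on narrow builds and 0x10FFFF on wide builds.
    if sys.maxunicode == 0xFFFF:
        return 2
    return 4

def estimated_payload(strings, unit_size):
    """Rough character payload in bytes; ignores per-object overhead."""
    return sum(len(s) for s in strings) * unit_size

words = ["portage", "supybot", "python"]  # hypothetical sample data
width = unicode_code_unit_size()
print(width, estimated_payload(words, width))
```

Running the same workload on both builds, with the same kernel, compiler
and optimisation flags, and comparing the resident memory of the process
would then isolate the effect of the ucs4 option.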

The reason why UCS2 is still the default in the masked python-2.3.2 is
that (a) not many people currently use anything that requires more than
UCS2 and (b) UCS4 does take up more memory than UCS2. How much more, I'm
not certain.

For instance, how much more memory would portage take if it doesn't use
unicode strings at all?

Cheers,
--
Alastair 'liquidx' Tse
>> Gentoo Developer
>> http://www.liquidx.net/ | http://dev.gentoo.org/~liquidx/