1 |
On Sun, Mar 30, 2003 at 01:13:24AM -0800, George Shapovalov wrote: |
2 |
> On Saturday 29 March 2003 23:51, Robin H. Johnson wrote: |
3 |
> > I'm going to write some C++ or Java instead for what I want. Python |
4 |
> > seems to lack the Sets and other sequence datatypes that would be |
5 |
> > greatly useful. |
6 |
> Well, not sure what are you referring to, but python sure does have sequence |
7 |
> types and sets. In fact all sequence types provide set abilities. |
8 |
|
9 |
> You might want to take a look here: |
10 |
> http://www.python.org/doc/current/lib/typesseq.html |
11 |
> and here: |
12 |
> http://www.python.org/doc/current/tut/node7.html |
13 |
I looked at those originally when learning python properly earlier |
14 |
today. |
15 |
|
16 |
My definitions: |
17 |
A LIST is a group of unordered and possibly duplicate items. As in |
18 |
LINKED LIST. Commonly O(n) time to access any item. O(n/2) or better if |
19 |
cleverly implemented. O(n) time for add/remove/contains/size. O(1) for |
20 |
add/remove (after search). |
21 |
|
22 |
An ARRAY is a LIST with additional pointers to each element allowing |
23 |
constant time O(1) access to each element. O(n) time for arbitrary |
24 |
add/remove. |
25 |
|
26 |
A SET is a group of items that are all unique and are accessed in |
27 |
naturally sorted order. O(1) time for access/add/remove/contains/size. |
28 |
|
29 |
A MAP is defined as a SET with each element being the key to another |
30 |
data structure. (Python provides this as a dictionary). Same big-O as |
31 |
SET. |
32 |
|
33 |
A BIMAP is best explained as a pair of SETs doubly linked in MAPs. It is |
34 |
MAP that can be used in both directions, with significent memory, speed |
35 |
and consistancy improvements, over just using two MAPs. |
36 |
An explaination of BIMAP: |
37 |
A is the set of (A1,A2,A3) |
38 |
B is the set of (B1,B2,B3) |
39 |
BIMAP.parents() might be: {A1=>(B1,B3),A2=>(B2),A3=(B2,B3)} |
40 |
BIMAP.children() would then be: {B1=>(A1),B2=>(A2,A3),B3=(A1,A3)} |
41 |
The actual data of A and B is stored seperately and references to them |
42 |
are used since each item will appear at least twice. |
43 |
BIMAPs offer the major advantage that an element from set of data can be |
44 |
used as a key to obtain related items in constant amortized time. |
45 |
O(1) for almost all operations. |
46 |
|
47 |
Hashtable implementation of SET/MAP/BIMAP is cruicial for high |
48 |
performance. |
49 |
|
50 |
Python's implementation of a MAP (dictionary) implemented via hash |
51 |
tables, but I can't find anything on tweaking the load factor and |
52 |
initial capacity, which makes a significent difference on data sets of |
53 |
large but known sizes. |
54 |
|
55 |
For these structures, various iterators are required as well, as this is |
56 |
the best way to traverse the data. Functional programming methods to |
57 |
apply to the data is also required, such as (Python provides these first |
58 |
three) filter, map, reduce, (Python doesn't seem to have these last |
59 |
three) set union, set difference, set intersection. |
60 |
|
61 |
For a SET, I considered using MAP and just ignoring the data field, but |
62 |
that seems like a terrible waste of cpu and memory. I find it really |
63 |
strange that Python doesn't have a SET while it does have a MAP, esp. |
64 |
given that MAPs are a logical extension of SETs. |
65 |
|
66 |
It seems Python did at one stage consider adding SETS: |
67 |
http://www.python.org/peps/pep-0218.html |
68 |
But I can't make much sense of the outcome. |
69 |
|
70 |
It seems I would have to implement BIMAP myself, since there isn't one |
71 |
in Python that I can find at all. I would admit that it is significently |
72 |
less common data structure than a list/map/set. I did have to write my |
73 |
own Java implementation of it, but it is available in other functional |
74 |
programming languages (Mercury provides a very good example). |
75 |
|
76 |
BIMAP is the main structure that I would be using, but SET and MAP also |
77 |
to a slightly lesser degree. I want to create a BIMAP containing the |
78 |
relative paths to each ebuild as one key, and the IUSE data from that |
79 |
ebuild as the other key. |
80 |
|
81 |
-- |
82 |
Robin Hugh Johnson |
83 |
E-Mail : robbat2@××××××××××××××.net |
84 |
Home Page : http://www.orbis-terrarum.net/?l=people.robbat2 |
85 |
ICQ# : 30269588 or 41961639 |
86 |
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85 |