Gentoo Archives: gentoo-portage-dev

From: Brian Dolbec <dolsen@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH 1/2] dblink: case insensitive support for bug #524236
Date: Sun, 16 Nov 2014 18:11:26
Message-Id: 20141116101115.6c61ce58.dolsen@gentoo.org
In Reply to: [gentoo-portage-dev] [PATCH 1/2] dblink: case insensitive support for bug #524236 by Zac Medico
1 On Sun, 16 Nov 2014 02:41:54 -0800
2 Zac Medico <zmedico@g.o> wrote:
3
4 > This adds dblink._contents_contains, _contents_iter, and
5 > _contents_key methods that provide an interface for contents
6 > operations with "implicit" case handling.
7 >
8 > X-Gentoo-Bug: 524236
9 > X-Gentoo-Url: https://bugs.gentoo.org/show_bug.cgi?id=524236
10 > ---
11 > pym/portage/dbapi/vartree.py | 43
12 > +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43
13 > insertions(+)
14 >
15 > diff --git a/pym/portage/dbapi/vartree.py
16 > b/pym/portage/dbapi/vartree.py index 8b06f4c..2d1003f 100644
17 > --- a/pym/portage/dbapi/vartree.py
18 > +++ b/pym/portage/dbapi/vartree.py
19 > @@ -1515,6 +1515,8 @@ class dblink(object):
20 > self.contentscache = None
21 > self._contents_inodes = None
22 > self._contents_basenames = None
23 > + self._contents_case_insensitive = None
24 > + self._contents_case_reverse_map = None
25
26 I like that you've prefixed the new variables with _contents_.
27
28 I also see there already are a number of _contents_ prefixed variables
29 and now newly added functions below.
30
31
32
33 > self._linkmap_broken = False
34 > self._device_path_map = {}
35 > self._hardlink_merge_map = {}
36 > @@ -1526,6 +1528,15 @@ class dblink(object):
37 > # compliance with RESTRICT=preserve-libs.
38 > self._preserve_libs = "preserve-libs" in
39 > mysettings.features
40 > + if "case-insensitive-fs" in self.settings.features:
41 > + self._contents_key =
42 > self._contents_key_case_insensitive
43 > + self._contents_contains =
44 > self._contents_contains_case_insensitive
45 > + self._contents_iter =
46 > self._contents_iter_case_insensitive
47 > + else:
48 > + self._contents_key =
49 > self._contents_key_case_sensitive
50 > + self._contents_contains =
51 > self._contents_contains_case_sensitive
52 > + self._contents_iter =
53 > self._contents_iter_case_sensitive +
54 > def __hash__(self):
55 > return hash(self._hash_key)
56 >
57 > @@ -1612,6 +1623,8 @@ class dblink(object):
58 > self.contentscache = None
59 > self._contents_inodes = None
60 > self._contents_basenames = None
61 > + self._contents_case_insensitive = None
62 > + self._contents_case_reverse_map = None
63 >
64 > def getcontents(self):
65 > """
66 > @@ -1716,6 +1729,36 @@ class dblink(object):
67 > self.contentscache = pkgfiles
68 > return pkgfiles
69 >
70 > + def _contents_case_insensitive_init(self):
71 > + self._contents_case_insensitive = dict(
72 > + (k.lower(), v) for k, v in
73 > self.getcontents().items())
74 > + self._contents_case_reverse_map = dict(
75 > + (k.lower(), k) for k in self.getcontents())
76 > +
77 > + def _contents_iter_case_insensitive(self):
78 > + if self._contents_case_insensitive is None:
79 > + self._contents_case_insensitive_init()
80 > + return iter(self._contents_case_insensitive)
81 > +
82 > + def _contents_key_case_insensitive(self, key):
83 > + if self._contents_case_reverse_map is None:
84 > + self._contents_case_insensitive_init()
85 > + return self._contents_case_reverse_map[key]
86 > +
87 > + def _contents_contains_case_insensitive(self, key):
88 > + if self._contents_case_insensitive is None:
89 > + self._contents_case_insensitive_init()
90 > + return key.lower() in self._contents_case_insensitive
91 > +
92 > + def _contents_key_case_sensitive(self, key):
93 > + return key
94 > +
95 > + def _contents_iter_case_sensitive(self):
96 > + return iter(self.getcontents())
97 > +
98 > + def _contents_contains_case_sensitive(self, key):
99 > + return key in self.getcontents()
100 > +
101 > def _prune_plib_registry(self, unmerge=False,
102 > needed=None, preserve_paths=None):
103 > # remove preserved libraries that don't have any
104 > consumers left
105
106
107 These truly have little interaction with the rest of the dblink class.
108 I would like to see them moved to an independent class. With that move
109 the names could be shortened without the _contents prefix since they
110 would be Contents class referenced. It also looks like there should be
111 additional code from the dblink class moved into this new class.
112 Primarily the getcontents(), but that is a much larger refactor. That
113 would also require the cache, re's and probably some others. Then all
114 the references to these functions in the vardbapi to the class
115 instance instead.
116
117 On that note, I propose we put this new code in a new class (avoiding
118 adding to the bloat), pass in the getcontents() pointer along with some
119 others needed. In later commits do the major re-factor moving more
120 contents specific handling code into the new class. Bare in mind I
121 have not looked that extensively in dblink and vardbapi, but it does
122 look doable.
123
124 Are you up for the challenge? ;)
125 I think it would certainly make the code more maintainable, less
126 daunting and more clearly defined. We could even split vartree.py into
127 some smaller files or sub-pkg. I dislike multi-thousand lines of code
128 files and classes. It makes it much more difficult to move around and
129 keep track of interactions between the code (complexity) and to learn
130 everything it does.
131
132 --
133 Brian Dolbec <dolsen>

Replies