1 |
On Tue, Nov 22, 2005 at 09:41:46PM +0100, Marius Mauch wrote: |
2 |
> On Fri, 18 Nov 2005 18:40:14 -0600 |
3 |
> Brian Harring <ferringb@g.o> wrote: |
4 |
> |
5 |
> > On Wed, Nov 16, 2005 at 04:33:02AM +0100, Marius Mauch wrote: |
6 |
> > > - digestfn = |
7 |
> > > mysettings["FILESDIR"]+"/digest-"+mysettings["PF"] |
8 |
> > > + digestfn = |
9 |
> > > portdb.finddigest(mysettings["CATEGORY"]+"/"+mysettings["PF"]) |
10 |
> > Like this mod personally, but don't like the '/'. You're passing in |
11 |
> > two chunks of data, pass them in seperate imo, rather then collapsing |
12 |
> > it for every call. |
13 |
> |
14 |
> Not sure I understand this, you mean changing finddigest() to use two |
15 |
> params? Doesn't really make sense to me, should just work for any CPV |
16 |
> entry. |
17 |
|
18 |
Well, I'd personally prefer passing in an object, but that's |
19 |
probably not going to occur ;-) |
20 |
|
21 |
using two params deviates from what we do in other funcs, but it also |
22 |
removes hardcoding of path seperator. For windows support (yes it's |
23 |
nuts, but bear with) it would have to split and rejoin the path, |
24 |
converting from '/' to os.path.sep. |
25 |
|
26 |
Pass it in as two seperated components, it can join it internally on |
27 |
it's own (if/when we N deep categories, it'll have to split that param |
28 |
though). |
29 |
|
30 |
Debatable obviously, I just dislike the requirement of knowing a heck |
31 |
of a lot more about the func prototype then is required- |
32 |
finddigest(category, package, version) vs |
33 |
finddigest(category+"/"+package+"-"+version) fex. One I can figure |
34 |
out from pydoc, the other I'll probably dig up an example in the code |
35 |
to see if it's cp or cpv. |
36 |
|
37 |
|
38 |
> > Kind of disliking the 'calledFor' being thrown around here, plus the |
39 |
> > v2 check, then resorting to a seperate method of checking. |
40 |
> > Integrate 'em potentially? |
41 |
> > |
42 |
> > You're basically treating it as two seperate funcs (from how I look |
43 |
> > at it), should split it up (even if the seperated funcs call each |
44 |
> > other, or a common func). |
45 |
> |
46 |
> Yeah, was just going the least intrusive route for now, so keep the API |
47 |
> compatible. This wasn't intended for inclusion yet anyway, just some |
48 |
> "sneak preview" ;) |
49 |
|
50 |
Iron out the implementation a bit, and I personally wouldn't have |
51 |
issues with it going into svn. |
52 |
|
53 |
Helluva lot easier to hack on it out of a repo, then via a set of |
54 |
patches anyways :) |
55 |
|
56 |
> > > @@ -2218,7 +2229,7 @@ |
57 |
> > > return 1 |
58 |
> > > |
59 |
> > > |
60 |
> > > -def digestParseFile(myfilename): |
61 |
> > > +def digestParseFile(myfilename, calledFor="manifest"): |
62 |
> > > """(filename) -- Parses a given file for entries matching: |
63 |
> > > MD5 MD5_STRING_OF_HEX_CHARS FILE_NAME FILE_SIZE |
64 |
> > > Ignores lines that do not begin with 'MD5' and returns a |
65 |
> > > @@ -2230,6 +2241,10 @@ |
66 |
> > > |
67 |
> > > mydigests={} |
68 |
> > > for x in mylines: |
69 |
> > > + tmp = portage_manifest.convertManifest2ToDigest(x, |
70 |
> > > calledAs=calledFor) |
71 |
> > > + if tmp[0] == True: |
72 |
> > > + mydigests[tmp[1]] = tmp[2] |
73 |
> > > + continue |
74 |
> > > myline=string.split(x) |
75 |
> > > if len(myline) < 4: |
76 |
> > > #invalid line |
77 |
> > > @@ -2258,7 +2273,10 @@ |
78 |
> > > the basedir given. Returns 1 only if all files exist and |
79 |
> > > match the md5s. """ |
80 |
> > > for x in myfiles: |
81 |
> > > - if not mydigests.has_key(x): |
82 |
> > > + if |
83 |
> > > portage_manfest.getManifestVersion(basedir+"/Manifest") == 2 and |
84 |
> > > x.startswith("files/digest-"): |
85 |
> > hardcoding sucks, kthnx |
86 |
> > If possible, try to shove it into the func so modification (down the |
87 |
> > line) occurs in one spot, rather then having to resort to |
88 |
> > search/replace :) |
89 |
> |
90 |
> Shove what into which func? I'm bad at guessing today. |
91 |
|
92 |
Well, you're hardcoding /Manifest for all calls to it. |
93 |
|
94 |
Kind of an indication the api should be changed and internally handle |
95 |
it imo, or encapsulate in some other fashion. |
96 |
|
97 |
Yes it's anal, but new additions to portage api I'm after having saner |
98 |
functions in use rather then what's there now. Saner is relative |
99 |
obviously, but assume you get what I mean. |
100 |
|
101 |
> > > Index: pym/portage_checksum.py |
102 |
> > > =================================================================== |
103 |
> > > --- pym/portage_checksum.py (revision 2312) |
104 |
> > > +++ pym/portage_checksum.py (working copy) |
105 |
> > > @@ -126,3 +126,12 @@ |
106 |
> > > portage_locks.unlockfile(mylock) |
107 |
> > > |
108 |
> > > return (myhash,mysize) |
109 |
> > > + |
110 |
> > > + |
111 |
> > > +hashfunc_map = {"MD5": md5hash, "SHA1": sha1hash} |
112 |
> > > + |
113 |
> > > +def perform_multiple_checksums(filename, hashes=["MD5"], |
114 |
> > > calc_prelink=0): |
115 |
> > > + rVal = {} |
116 |
> > > + for x in hashes: |
117 |
> > > + rVal[x] = perform_checksum(filename, |
118 |
> > > hashfunc_map[x], calc_prelink)[0] |
119 |
> > > + return rVal |
120 |
> > Not sure if I like the [0] tbh. |
121 |
> > Still think the size test should be broken off as it's own pseudo |
122 |
> > hashfunc, and implemented in that way. Course, would need to weight |
123 |
> > it somehow, so that size test is done first, then the more expensive |
124 |
> > chf tests... |
125 |
> |
126 |
> Yep, see above about "least intrusive". |
127 |
|
128 |
I can live with intrusive if the new code is better quality then |
129 |
existing- stable code sucks, if it needs to be slapped around a bit to |
130 |
play nice with new code, works for me. Chunking size off so it's no |
131 |
longer a special case makes sense in my opinion, since otherwise |
132 |
each/every chksum func will have to do it's own stat and return size |
133 |
just to maintain a similar return to md5. |
134 |
|
135 |
|
136 |
> > > --- pym/portage_manifest.py (revision 0) |
137 |
> > > +++ pym/portage_manifest.py (revision 0) |
138 |
> > > @@ -0,0 +1,163 @@ |
139 |
> > > +import os |
140 |
> > > + |
141 |
> > > +import portage, portage_exception, portage_checksum, |
142 |
> > > portage_versions + |
143 |
> > > +class Manifest2Entry: |
144 |
> > > + def __init__(self, entrytype, name, config, |
145 |
> > > methods=portage_checksum.get_valid_checksum_keys()): |
146 |
> > The methods trick you're doing is a one time evaluation... not sure |
147 |
> > if that's the best approach tbh. |
148 |
> > |
149 |
> > > + if entrytype not in ["EBUILD", "AUXFILE", |
150 |
> > > "MISCFILE", "SRCURI"]: |
151 |
> > > + raise |
152 |
> > > portage_exception.PortageException("unknown Manifest entry type: |
153 |
> > > %s" % entrytype) |
154 |
> > break that out into a class constant, and please kill off the capps |
155 |
> > (they kill kittens, or something). |
156 |
> |
157 |
> Ok on the constant, but I like the caps here ;) |
158 |
|
159 |
Constants kill kittens. |
160 |
|
161 |
You don't want to kill kittens, do you? ;) |
162 |
|
163 |
> > > + def generate(self, pkgdir, genOldStyle=True): |
164 |
> > > + if self.type == "EBUILD" or self.type == |
165 |
> > > "MISCFILE": |
166 |
> > > + filename = pkgdir+"/"+self.name |
167 |
> > > + elif self.type == "AUXFILE": |
168 |
> > > + filename = pkgdir+"/files/"+self.name |
169 |
> > > + elif self.type == "SRCURI": |
170 |
> > > + filename = |
171 |
> > > self.config["DISTDIR"]+"/"+self.name |
172 |
> > > + if not os.path.exists(filename): |
173 |
> > > + raise |
174 |
> > > portage_exception.PortageException("file for Manifest entry doesn't |
175 |
> > > exist: %s" % filename) |
176 |
> > > + |
177 |
> > > + self.size = os.stat(filename).st_size |
178 |
> > > + self.checksums = |
179 |
> > > portage_checksum.perform_multiple_checksums(filename, self.methods) |
180 |
> > See comment about making size into a pseudo chf func- would eliminate |
181 |
> > this case. |
182 |
> > Chunking this up into type handlers might be wise- would make it |
183 |
> > easier to extend (modify a class dict rather then N funcs). You've |
184 |
> > already got the seperated beast, just throw it into a mapping and |
185 |
> > it'll be nice and pretty. |
186 |
> |
187 |
> Yeah, though somehow having a function just for filename composition |
188 |
> seems a bit overkill. Don't mind either way. |
189 |
Eliminates a special case though; see comment above about alternative |
190 |
chf funcs having to return a list instead of the chksum, and why this |
191 |
should be done for the long haul should be clear :) |
192 |
|
193 |
> > > + cs_list = [x+" "+self.checksums[x] for x in |
194 |
> > > self.checksums.keys()] |
195 |
> > > + self.entry = " ".join([self.type, self.name, |
196 |
> > > str(self.size)]+cs_list) |
197 |
> > > + if genOldStyle and self.type != "SRCURI": |
198 |
> > > + if "MD5" in self.methods: |
199 |
> > > + oldentry = |
200 |
> > > Manifest2Entry._makeOldEntry(pkgdir, filename, size=self.size, |
201 |
> > > md5sum=self.checksums["MD5"]) |
202 |
> > > + else: |
203 |
> > > + oldentry = |
204 |
> > > Manifest2Entry._makeOldEntry(pkgdir, filename, size=self.size) |
205 |
> > Kinda iffy on this approach also, seems unclean (quite subjective I |
206 |
> > realize). |
207 |
> |
208 |
> Better ideas welcome. |
209 |
Seperated class potentially, although kind of overkill. |
210 |
|
211 |
Could potentially make it easier to seperate new and old though, so |
212 |
that old can be thrown out easier down the line (instead of an |
213 |
invasive removal of code from the combined class). |
214 |
|
215 |
Haven't looked at the code since the first review, so the paragraph |
216 |
above could be mildly insane. :) |
217 |
|
218 |
> > > + |
219 |
> > > + if debug: |
220 |
> > > + print "pkgdir:", pkgdir |
221 |
> > > + |
222 |
> > > + cpv_list = [pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] |
223 |
> > > for x in portage.listdir(pkgdir) if x.endswith(".ebuild")] |
224 |
> > os.path.sep... |
225 |
> |
226 |
> Cosmetics. |
227 |
Readability/maintainability ;) |
228 |
|
229 |
> > > + |
230 |
> > > + return (0, None) |
231 |
> > Eh? |
232 |
> > Exceptions seem like a better route here then magic constants and |
233 |
> > string error msgs. |
234 |
> |
235 |
> Eh, I don't return error messages but the name of the object that |
236 |
> caused the failure. Wether to use exceptions for this comes down on who |
237 |
> is going to call this, don't really see the point in using exceptions |
238 |
> here as I expect only two or three callers (namely repoman and ebuild) |
239 |
> which have to act on the return code directly. |
240 |
|
241 |
Those two or three callers are going to have some funky voodoo in |
242 |
place that a year down the line, I'm going to scratch my head over |
243 |
(since I'm an idiot). |
244 |
|
245 |
You're passing data back and flagging an error, exception is really |
246 |
the route to go. It's extensible mainly, such that if the '2 or 3' |
247 |
becoms '5 or 6' we don't have magic constants/voodoo spreading through |
248 |
the code. |
249 |
|
250 |
> > > +def convertManifest2ToDigest(manifestline, calledAs="manifest"): |
251 |
> > > + mysplit = manifestline.split() |
252 |
> > > + if mysplit[0] not in ["EBUILD", "AUXFILE", "MISCFILE", |
253 |
> > > "SRCURI"] or ((len(mysplit) - 3) % 2) != 0: |
254 |
> > > + return (False, None, None) |
255 |
> > > + if mysplit[0] == "SRCURI" and calledAs == "manifest": |
256 |
> > > + return (False, None, None) |
257 |
> > > + if mysplit[0] in ["EBUILD", "AUXFILE", "MISCFILE"] and |
258 |
> > > calledAs == "digest": |
259 |
> > > + return (False, None, None) |
260 |
> > > + |
261 |
> > > + checksums = {} |
262 |
> > > + |
263 |
> > > + mytype = mysplit.pop(0) |
264 |
> > > + filename = mysplit.pop(0) |
265 |
> > > + if mytype == "AUXFILE": |
266 |
> > > + filename = "files/"+filename |
267 |
> > > + size = long(mysplit.pop(0)) |
268 |
> > > + checksums = dict(zip(mysplit[::2], mysplit[1::2])) |
269 |
> > > + if not checksums.has_key("size"): |
270 |
> > > + checksums["size"] = size |
271 |
> > > + |
272 |
> > > + return (True, filename, checksums) |
273 |
> > > + |
274 |
> > > +def getManifestVersion(manifest): |
275 |
> > > + rVal = 0 |
276 |
> > > + |
277 |
> > > + if not os.path.exists(manifest): |
278 |
> > > + return rVal |
279 |
> > > + myfile = open(manifest, "r") |
280 |
> > > + lines = myfile.readlines() |
281 |
> > > + for l in lines: |
282 |
> > > + mysplit = l.split() |
283 |
> > > + if mysplit[0] == "MD5" and len(mysplit) == 4 and |
284 |
> > > rVal == 0: |
285 |
> > > + rVal = 1 |
286 |
> > > + elif mysplit[0] in ["EBUILD", "AUXFILE", |
287 |
> > > "MISCFILE", "SRCURI"] and ((len(mysplit) - 3) % 2) == 0 and rVal in |
288 |
> > > [0,1]: |
289 |
> > > + rVal = 2 |
290 |
> > > + return rVal |
291 |
> > |
292 |
> > Offhand... I really think you need to do some subclassing above. |
293 |
> > Flow's mostly there, but a lot of icky stuff for long term |
294 |
> > maintenance (imo). Seems like a refactoring of the patch would bring |
295 |
> > line count down, and up readability. |
296 |
> |
297 |
> Maybe, but I'm no such an OO maniac and also suck at refactoring ;) |
298 |
Might want to see what you can raid from saviors portage.fs.* |
299 |
|
300 |
Yes it's extra, but the fs objs will wind up in a stable portage |
301 |
sooner or later for the contents benefits, and in this case the |
302 |
encapsulation might be useful for you. |
303 |
~harring |