Gentoo Archives: gentoo-portage-dev

From: Brian Harring <ferringb@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] making aux_get more usable for vardbapi
Date: Mon, 23 Jan 2006 22:09:17
Message-Id: 20060123220800.GA30517@nightcrawler.had1.or.comcast.net
In Reply to: Re: [gentoo-portage-dev] making aux_get more usable for vardbapi by Marius Mauch
1 On Mon, Jan 23, 2006 at 07:44:44PM +0100, Marius Mauch wrote:
2 > On Mon, 23 Jan 2006 03:47:11 -0800
3 > Brian Harring <ferringb@g.o> wrote:
4 > > On Mon, Jan 23, 2006 at 11:16:03AM +0100, Marius Mauch wrote:
5 > > > On Wed, 11 Jan 2006 12:39:03 -0800
6 > > > Brian Harring <ferringb@g.o> wrote:
7 > > >
8 > > > > Regex you've got there allows for pulling the wrong text- recall,
9 > > > > ebd originally was doing grep based filtering (regex). Had to
10 > > > > rewrite that in a major hurry since bash syntax (specifically
11 > > > > here ops) forces you to track state/constructs rather then just a
12 > > > > regex...
13 > > >
14 > > > Not really an issue in this case. First the code bails out if more
15 > > > than one match is found, so unless the metadata assignment is NOT
16 > > > found by it we don't get the wrong info.
17 > > > Also a mismatch in this special is so
18 > > > extremely unlikely that honestly I don't really care about it,
19 > > > especially as this is a one time conversion (might be different if
20 > > > I'd have added the on the fly extraction).
21 > >
22 > > Re-read that statement. It's a one time conversion- meaning we
23 > > better get it right the first time, else the user's data is
24 > > effectively corrupted. Forcing a full regen from the saved
25 > > environment is not a solution for fixing past corruptions either.
26 > >
27 > > If it were on the fly extraction, I wouldn't care quite as much- but
28 > > the fact this is an untracked change to the users data means we *do*
29 > > need to cover corner cases.
30 >
31 > Can't follow your thinking here. As said, the code won't corrupt any
32 > data, at worst it will tell the user that it couldn't extract some keys
33 > (and even that only if there would be such a stupid case which itself
34 > has a chance of like 10^-10 or so, or can you name a single case?).
35
36 Yeah, any extension of your code to pick up EAPI is going to false
37 positive on somewhere around a quarter of my vdb nodes- I use a
38 debug function from my bashrc for prefix testing (holdover from eapi
39 testing days). Why? Because the eapi var didn't exist (thus no env
40 assignment), the only possible match is in the middle of a here op-
41
42 debug_dump() {
43 ...
44 echo <<<HERE >&2
45 EAPI="${EAPI-unset}"
46 PF="$PF"
47 CATEGORY="$CATEGORY"
48 HERE
49 ...
50 }
51
52 Setting EAPI to unset strikes me as corruption here, because now
53 portage will *never* do anything with that vdb node- the eapi differs
54 from what it supports (it's been set to effectively a random value),
55 thus I have to fix the vdb entry myself if I ever want to get rid of
56 it.
57
58 Guessing the response on that one is "well, you're using the bashrc",
59 and yes, I am- the vdb code should be *robust*, not choking on a users
60 debug code.
61
62 > So what package contains filter-env then?
63
64 Portage- been in cvs/svn for over a year now :)
65
66 > > Aside from that, if the code is in debate (as this is), I really
67 > > don't think it should get slid into svn 2 weeks later effectively
68 > > unchanged- didn't write that original email just for the hell of it :)
69 >
70 > As said, I disagree with your assessment of the situation. If you can
71 > name a single case where the code "breaks" or filter-env hits the tree
72 > I might reconsider, but not before.
73
74 Well, if you disagreed with the original response, continue the
75 conversation prior to commiting- otherwise we see a commit, then a
76 rebuttal a few hours later. Not really how things should go for a
77 contested piece of code (at least when the only two to weigh in our
78 flat out opposed on it)- especially if the code's effect is nontrivial
79 and it hasn't had any actual peer review (only comment was on your algo).
80
81 Issues with the code.
82 1) Scans for metadata that didn't exist at the time of the ebuild's
83 installation, only possible match is unintended. Not good.
84 2) your code is breaking upon *all* newlines, regardless if it's
85 in the middle of a variable assignment. Yes, bash does use $'' for
86 assignment, but you're relying (hoping really) that the variable has
87 been filtered by python portage and the ebuild defined var is stomped.
88 This is a bad idea- description comes to mind. Your code seems to be
89 aware of this possibility also, since it's doing rather loose quote
90 filtering.
91 3) Stop using regex all over the place. split/join is a helluva lot
92 faster, and collapses that nasty regex that attempts to remove
93 whitespace down to two lines (a one time translate, and the split/join
94 to kill whitespace).
95 4) Code specifically tries to find, and remove variable chars-
96 this is *really* the wrong thing to do. If you're trying to work
97 around catching a var reference, either your parsing screwed up, or
98 you're mangling a DESCRIPTION value that you shouldn't be hosing.
99 5) hardcoded vdb.
100 6) Non root aware VDB.
101 7) Only is aware of environment.bz2- older portage versions differed
102 in this afaik (I *do* know PORT_ENV_FILE overloading for ebd I had to
103 address this case already, so it's out there).
104 8) perms on the new files?
105 9) general code comments; 'if s != ""', use 'if s:'. Catch the
106 exceptions for check mode, don't let vdbkeyhandler nuke world file
107 checking due to a potential exception. Right now, your code will TB
108 if ran by non-root for a check run rather then stating "got to be
109 root".
110
111 In general, in looking through the parsing here there is a *lot* of
112 chunks that look questionable- the scan for a var assignment for
113 example, should *not* be stripping quotes on it's own. The only
114 benefit of that stripping is allowing for false positives (hardly a
115 benefit).
116
117 Aside from code commentary, the real brass tacks here is that this
118 code if used is going to be expanded for more then the trivial vars it
119 handles now- in other words, this problematic parser will wind up
120 being used months down the line for important vars with intrinsic issues
121 still unaddressed.
122
123 And yes, what I'm pointing at is a corner case- that doesn't mean
124 we cut corners on the solution *especially* when a proper tool
125 exists for this already. No tool? I'd be a bit quieter, but the
126 issues *are* resolved already, you just need to ditch the adhoc regex
127 attempt (or write your own state aware parser). It's a 20 minutes mod
128 to fix my initial complaints on this- it might seem pointless, but the
129 potential is there as is a fix, so address it.
130
131 ~harring

Replies

Subject Author
Re: [gentoo-portage-dev] making aux_get more usable for vardbapi Marius Mauch <genone@g.o>