1 |
On Mon, Jan 23, 2006 at 07:44:44PM +0100, Marius Mauch wrote: |
2 |
> On Mon, 23 Jan 2006 03:47:11 -0800 |
3 |
> Brian Harring <ferringb@g.o> wrote: |
4 |
> > On Mon, Jan 23, 2006 at 11:16:03AM +0100, Marius Mauch wrote: |
5 |
> > > On Wed, 11 Jan 2006 12:39:03 -0800 |
6 |
> > > Brian Harring <ferringb@g.o> wrote: |
7 |
> > > |
8 |
> > > > Regex you've got there allows for pulling the wrong text- recall, |
9 |
> > > > ebd originally was doing grep based filtering (regex). Had to |
10 |
> > > > rewrite that in a major hurry since bash syntax (specifically |
11 |
> > > > here ops) forces you to track state/constructs rather then just a |
12 |
> > > > regex... |
13 |
> > > |
14 |
> > > Not really an issue in this case. First the code bails out if more |
15 |
> > > than one match is found, so unless the metadata assignment is NOT |
16 |
> > > found by it we don't get the wrong info. |
17 |
> > > Also a mismatch in this special is so |
18 |
> > > extremely unlikely that honestly I don't really care about it, |
19 |
> > > especially as this is a one time conversion (might be different if |
20 |
> > > I'd have added the on the fly extraction). |
21 |
> > |
22 |
> > Re-read that statement. It's a one time conversion- meaning we |
23 |
> > better get it right the first time, else the user's data is |
24 |
> > effectively corrupted. Forcing a full regen from the saved |
25 |
> > environment is not a solution for fixing past corruptions either. |
26 |
> > |
27 |
> > If it were on the fly extraction, I wouldn't care quite as much- but |
28 |
> > the fact this is an untracked change to the users data means we *do* |
29 |
> > need to cover corner cases. |
30 |
> |
31 |
> Can't follow your thinking here. As said, the code won't corrupt any |
32 |
> data, at worst it will tell the user that it couldn't extract some keys |
33 |
> (and even that only if there would be such a stupid case which itself |
34 |
> has a chance of like 10^-10 or so, or can you name a single case?). |
35 |
|
36 |
Yeah, any extension of your code to pick up EAPI is going to false |
37 |
positive on somewhere around a quarter of my vdb nodes- I use a |
38 |
debug function from my bashrc for prefix testing (holdover from eapi |
39 |
testing days). Why? Because the eapi var didn't exist (thus no env |
40 |
assignment), the only possible match is in the middle of a here op- |
41 |
|
42 |
debug_dump() { |
43 |
... |
44 |
echo <<<HERE >&2 |
45 |
EAPI="${EAPI-unset}" |
46 |
PF="$PF" |
47 |
CATEGORY="$CATEGORY" |
48 |
HERE |
49 |
... |
50 |
} |
51 |
|
52 |
Setting EAPI to unset strikes me as corruption here, because now |
53 |
portage will *never* do anything with that vdb node- the eapi differs |
54 |
from what it supports (it's been set to effectively a random value), |
55 |
thus I have to fix the vdb entry myself if I ever want to get rid of |
56 |
it. |
57 |
|
58 |
Guessing the response on that one is "well, you're using the bashrc", |
59 |
and yes, I am- the vdb code should be *robust*, not choking on a users |
60 |
debug code. |
61 |
|
62 |
> So what package contains filter-env then? |
63 |
|
64 |
Portage- been in cvs/svn for over a year now :) |
65 |
|
66 |
> > Aside from that, if the code is in debate (as this is), I really |
67 |
> > don't think it should get slid into svn 2 weeks later effectively |
68 |
> > unchanged- didn't write that original email just for the hell of it :) |
69 |
> |
70 |
> As said, I disagree with your assessment of the situation. If you can |
71 |
> name a single case where the code "breaks" or filter-env hits the tree |
72 |
> I might reconsider, but not before. |
73 |
|
74 |
Well, if you disagreed with the original response, continue the |
75 |
conversation prior to commiting- otherwise we see a commit, then a |
76 |
rebuttal a few hours later. Not really how things should go for a |
77 |
contested piece of code (at least when the only two to weigh in our |
78 |
flat out opposed on it)- especially if the code's effect is nontrivial |
79 |
and it hasn't had any actual peer review (only comment was on your algo). |
80 |
|
81 |
Issues with the code. |
82 |
1) Scans for metadata that didn't exist at the time of the ebuild's |
83 |
installation, only possible match is unintended. Not good. |
84 |
2) your code is breaking upon *all* newlines, regardless if it's |
85 |
in the middle of a variable assignment. Yes, bash does use $'' for |
86 |
assignment, but you're relying (hoping really) that the variable has |
87 |
been filtered by python portage and the ebuild defined var is stomped. |
88 |
This is a bad idea- description comes to mind. Your code seems to be |
89 |
aware of this possibility also, since it's doing rather loose quote |
90 |
filtering. |
91 |
3) Stop using regex all over the place. split/join is a helluva lot |
92 |
faster, and collapses that nasty regex that attempts to remove |
93 |
whitespace down to two lines (a one time translate, and the split/join |
94 |
to kill whitespace). |
95 |
4) Code specifically tries to find, and remove variable chars- |
96 |
this is *really* the wrong thing to do. If you're trying to work |
97 |
around catching a var reference, either your parsing screwed up, or |
98 |
you're mangling a DESCRIPTION value that you shouldn't be hosing. |
99 |
5) hardcoded vdb. |
100 |
6) Non root aware VDB. |
101 |
7) Only is aware of environment.bz2- older portage versions differed |
102 |
in this afaik (I *do* know PORT_ENV_FILE overloading for ebd I had to |
103 |
address this case already, so it's out there). |
104 |
8) perms on the new files? |
105 |
9) general code comments; 'if s != ""', use 'if s:'. Catch the |
106 |
exceptions for check mode, don't let vdbkeyhandler nuke world file |
107 |
checking due to a potential exception. Right now, your code will TB |
108 |
if ran by non-root for a check run rather then stating "got to be |
109 |
root". |
110 |
|
111 |
In general, in looking through the parsing here there is a *lot* of |
112 |
chunks that look questionable- the scan for a var assignment for |
113 |
example, should *not* be stripping quotes on it's own. The only |
114 |
benefit of that stripping is allowing for false positives (hardly a |
115 |
benefit). |
116 |
|
117 |
Aside from code commentary, the real brass tacks here is that this |
118 |
code if used is going to be expanded for more then the trivial vars it |
119 |
handles now- in other words, this problematic parser will wind up |
120 |
being used months down the line for important vars with intrinsic issues |
121 |
still unaddressed. |
122 |
|
123 |
And yes, what I'm pointing at is a corner case- that doesn't mean |
124 |
we cut corners on the solution *especially* when a proper tool |
125 |
exists for this already. No tool? I'd be a bit quieter, but the |
126 |
issues *are* resolved already, you just need to ditch the adhoc regex |
127 |
attempt (or write your own state aware parser). It's a 20 minutes mod |
128 |
to fix my initial complaints on this- it might seem pointless, but the |
129 |
potential is there as is a fix, so address it. |
130 |
|
131 |
~harring |