1 |
On Fri, Jun 03, 2022 at 09:14:05AM +0200, Ulrich Mueller wrote: |
2 |
> >>>>> On Tue, 31 May 2022, Ionen Wolkens wrote: |
3 |
> |
4 |
> > +# @FUNCTION: esed |
5 |
> > +# @USAGE: <sed-argument>... |
6 |
> > +# @DESCRIPTION: |
7 |
> > +# sed(1) wrapper that dies if the expression(s) did not modify any files. |
8 |
> > +# sed's -i/--in-place is forced, and so stdin/out cannot be used. |
9 |
> |
10 |
> This sounds like a simple enough task ... |
11 |
> |
12 |
> > +esed() { |
13 |
> > + local -i i |
14 |
> > + |
15 |
> > + if [[ ${esedexps@a} =~ a ]]; then |
16 |
> > + # expression must be before -- but after the rest for e.g. -E to work |
17 |
> > + local -i pos |
18 |
> > + for ((pos=1; pos<=${#}; pos++)); do |
19 |
> > + [[ ${!pos} == -- ]] && break |
20 |
> > + done |
21 |
> > + |
22 |
> > + for ((i=0; i<${#esedexps[@]}; i++)); do |
23 |
> > + [[ ${esedexps[i]} ]] && |
24 |
> > + esedexps= esed "${@:1:pos-1}" -e "${esedexps[i]}" "${@:pos}" |
25 |
> > + done |
26 |
> > + |
27 |
> > + unset esedexps |
28 |
> > + return 0 |
29 |
> > + fi |
30 |
> > + |
31 |
> > + # Roughly attempt to find files in arguments by checking if it's a |
32 |
> > + # readable file (aka s/// is not a file) and does not start with - |
33 |
> > + # (unless after --), then store contents for comparing after sed. |
34 |
> > + local contents=() endopts files=() |
35 |
> > + for ((i=1; i<=${#}; i++)); do |
36 |
> > + if [[ ${!i} == -- && ! -v endopts ]]; then |
37 |
> > + endopts=1 |
38 |
> > + elif [[ ${!i} =~ ^(-i|--in-place)$ && ! -v endopts ]]; then |
39 |
> > + # detect rushed sed -i -> esed -i, -i also silently breaks enewsed |
40 |
> > + die "passing ${!i} to ${FUNCNAME[0]} is invalid" |
41 |
> > + elif [[ ${!i} =~ ^(-f|--file)$ && ! -v endopts ]]; then |
42 |
> > + i+=1 # ignore script files |
43 |
> > + elif [[ ( ${!i} != -* || -v endopts ) && -f ${!i} && -r ${!i} ]]; then |
44 |
> > + files+=( "${!i}" ) |
45 |
> > + |
46 |
> > + # eval 2>/dev/null to silence \0 warnings if sed binary files |
47 |
> > + eval 'contents+=( "$(<"${!i}")" )' 2>/dev/null \ |
48 |
> > + || die "failed to read: ${!i}" |
49 |
> > + fi |
50 |
> > + done |
51 |
> > + (( ${#files[@]} )) || die "no readable files found from '${*}' arguments" |
52 |
> > + |
53 |
> > + local verbose |
54 |
> > + [[ ${ESED_VERBOSE} ]] && type diff &>/dev/null && verbose=1 |
55 |
> > + |
56 |
> > + local changed newcontents |
57 |
> > + if [[ -v _esed_output ]]; then |
58 |
> > + [[ -v verbose ]] && |
59 |
> > + einfo "${FUNCNAME[0]}: sed ${*} > ${_esed_output} ..." |
60 |
> > + |
61 |
> > + sed "${@}" > "${_esed_output}" \ |
62 |
> > + || die "failed to run: sed ${*} > ${_esed_output}" |
63 |
> > + |
64 |
> > + eval 'newcontents=$(<${_esed_output})' 2>/dev/null \ |
65 |
> > + || die "failed to read: ${_esed_output}" |
66 |
> > + |
67 |
> > + local IFS=$'\n' # sed concats with newline even if none at EOF |
68 |
> > + contents=${contents[*]} |
69 |
> > + unset IFS |
70 |
> > + |
71 |
> > + [[ ${contents} != "${newcontents}" ]] && changed=1 |
72 |
> > + |
73 |
> > + [[ -v verbose ]] && |
74 |
> > + diff -u --color --label="${files[*]}" --label="${_esed_output}" \ |
75 |
> > + <(echo "${contents}") <(echo "${newcontents}") |
76 |
> > + else |
77 |
> > + [[ -v verbose ]] && einfo "${FUNCNAME[0]}: sed -i ${*} ..." |
78 |
> > + |
79 |
> > + sed -i "${@}" || die "failed to run: sed -i ${*}" |
80 |
> > + |
81 |
> > + for ((i=0; i<${#files[@]}; i++)); do |
82 |
> > + eval 'newcontents=$(<"${files[i]}")' 2>/dev/null \ |
83 |
> > + || die "failed to read: ${files[i]}" |
84 |
> > + |
85 |
> > + if [[ ${contents[i]} != "${newcontents}" ]]; then |
86 |
> > + changed=1 |
87 |
> > + [[ -v verbose ]] || break |
88 |
> > + fi |
89 |
> > + |
90 |
> > + [[ -v verbose ]] && |
91 |
> > + diff -u --color --label="${files[i]}"{,} \ |
92 |
> > + <(echo "${contents[i]}") <(echo "${newcontents}") |
93 |
> > + done |
94 |
> > + fi |
95 |
> > + |
96 |
> > + [[ -v changed ]] \ |
97 |
> > + || die "no-op: ${FUNCNAME[0]} ${*}${_esed_command:+ (from: ${_esed_command})}" |
98 |
> > +} |
99 |
> |
100 |
> ... but then it's almost 100 lines of shell code, including very |
101 |
> convoluted parsing of parameters. The code for detection whether a |
102 |
> parameter is actually a file is a heuristic at best and looks rather |
103 |
|
104 |
Even if it does pickup sed -e "s_im-a_real-file_", it's a semi |
105 |
non-issue given the file won't change before and after and it'll |
106 |
evaluate the others for difference instead. |
107 |
|
108 |
Where it gets a bit more complicated is enewsed given this is valid: |
109 |
sed s/// file1 file2 > file3 |
110 |
Of course, could always ban multiple files on enewsed if felt really |
111 |
needed, albeit fwiw it's a limitation for a very unlikely scenario. |
112 |
|
113 |
"--" support is probably what adds the most complexity here, but |
114 |
will have a problem if any ebuild ever needs to sed a -file |
115 |
(-- is also useful in esedfind fwiw) |
116 |
|
117 |
Also I did write tests so the minor complexity hopefully doesn't |
118 |
lead to regressions. |
119 |
|
120 |
iwdevtools' qa-sed uses getopt(long) instead, but even that is not |
121 |
perfect given sed has special rules (plus wouldn't want to depend |
122 |
on || ( getopt util-linux ) here). |
123 |
|
124 |
> brittle. Also, don't use eval because it is evil. |
125 |
|
126 |
This is static evals, they're just evaluating a flat string and it's |
127 |
no different than.. well just running it as-is except it works around |
128 |
a noise issue (the 2>/dev/null doesn't register without it). |
129 |
|
130 |
eval is mostly evil when it's: |
131 |
eval "${expanded_variable}=${what_is_this_even}" |
132 |
|
133 |
Seeing eval "var=\${not_expanded}" as "evil" makes no sense to me, |
134 |
it's a harmless warning silencer. I feel this is just overreaction |
135 |
to the eval keyword (also in other dev ML post). |
136 |
|
137 |
To reproduce the warning: |
138 |
$ var=$(printf "\0") # var=$(<file-with-null-bytes) |
139 |
bash: warning: command substitution: ignored null byte in input |
140 |
|
141 |
doesn't work: var=$(printf "\0") 2>/dev/null |
142 |
works: eval 'var=$(printf "\0")' 2>/dev/null |
143 |
|
144 |
> |
145 |
> So IMHO this isn't the right approach to the problem. If anything, make |
146 |
> it a simple function with well defined arguments, e.g. exactly one |
147 |
> expression and exactly one file, and call "sed -i" on it. |
148 |
|
149 |
I was hoping it wouldn't be a limitation to using sed, aka just add |
150 |
checks and take very little away. Would've even liked to do 100% compat |
151 |
like qa-sed (transparent stdin/out support), but that one is more |
152 |
complicated and drew the line. |
153 |
|
154 |
Lacking multiple files on esed (not enewsed) would be kinda annoying |
155 |
I think. esedfind also makes use of the fact it can take multiple |
156 |
files for when people need things like: |
157 |
|
158 |
find . -name '*.c' -exec sed -i s/__AVX__/__AVX2__/ {} + || die |
159 |
(esedfind would die if not one macro was replaced) |
160 |
|
161 |
I tried to use esed in several ebuilds and see what felt needed or |
162 |
is annoying without hoping people wouldn't mind using it over sed. |
163 |
|
164 |
Goal here is not to replace sed nor encourage using it, but just add |
165 |
a guard to ensure changes happened if it does get used. I've seen |
166 |
plenty of ebuilds with a sed that's been broken for >5-10 years. |
167 |
|
168 |
> |
169 |
> Then again, we had something similar in the past ("dosed" as a package |
170 |
> manager command) but later banned it for good reason. |
171 |
|
172 |
Well, dosed /used/ to be kinda useful to avoid needing: |
173 |
|
174 |
mv file "${T}"/tmpfile || die |
175 |
sed s/// "${T}"/tmpfile > file || die |
176 |
|
177 |
It's only later that dosed started using -i itself and mostly didn't |
178 |
have a purpose anymore given ebuilds could now use -i too, see: |
179 |
|
180 |
https://bugs.gentoo.org/152017 |
181 |
|
182 |
> |
183 |
> Detection whether the file contents changed also seems complicated. |
184 |
> Why not compute a hash of the file before and after sed operated on it? |
185 |
> md5sum or even cksum should be good enough, since security isn't an |
186 |
> issue here. |
187 |
|
188 |
I don't think this would simplify beside now calling more external |
189 |
commands twice then comparing the outputs (capturing command output vs |
190 |
capturing file is not particularly different). May be faster on large |
191 |
amount of files but probably not meaningful for single/few files. |
192 |
|
193 |
This may make sed concatenation more complicated to handle too (cat |
194 |
doesn't concatenate the same way sed does wrt newlines at EOF). Not |
195 |
that couldn't ban that on enewsed, although it's more limitations |
196 |
over not much. |
197 |
|
198 |
> |
199 |
|
200 |
All this aside, I'm not adamant about adding this. If not enough |
201 |
interest then I'll just drop the idea. May try to implement multiple |
202 |
expression support in iwdevtools' qa-sed instead, albeit being a |
203 |
separate QA thing that doesn't know the maintainer's intend (aka |
204 |
which seds are deterministic or not) and may also be missed, it |
205 |
doesn't really ensure ebuilds' seds are kept up to date. |
206 |
|
207 |
-- |
208 |
ionen |