On Fri, Jun 03, 2022 at 07:36:46AM -0400, Ionen Wolkens wrote:
> ... snip ...
>
> +	# Roughly attempt to find files in arguments by checking if it's a
> +	# readable file (aka s/// is not a file) and does not start with -
> +	# (unless after --), then store contents for comparing after sed.
> +	local contents=() endopts files=()
> +	for ((i=1; i<=${#}; i++)); do
> +		if [[ ${!i} == -- && ! -v endopts ]]; then
> +			endopts=1
> +		elif [[ ${!i} =~ ^(-i|--in-place)$ && ! -v endopts ]]; then
> +			# detect rushed sed -i -> esed -i, -i also silently breaks enewsed
> +			die "passing ${!i} to ${FUNCNAME[0]} is invalid"
> +		elif [[ ${!i} =~ ^(-f|--file)$ && ! -v endopts ]]; then
> +			i+=1 # ignore script files
> +		elif [[ ( ${!i} != -* || -v endopts ) && -f ${!i} && -r ${!i} ]]; then
> +			files+=( "${!i}" )
> +
> +			# 2>/dev/null to silence null byte warnings if sed binary files
> +			{ contents+=( "$(<"${!i}")" ); } 2>/dev/null \
> +				|| die "failed to read: ${!i}"
> +		fi
> +	done
> +	(( ${#files[@]} )) || die "no readable files found from '${*}' arguments"
> +
> +	local verbose
> +	[[ ${ESED_VERBOSE} ]] && type diff &>/dev/null && verbose=1
> +
> +	local changed newcontents
> +	if [[ -v _esed_output ]]; then
> +		[[ -v verbose ]] &&
> +			einfo "${FUNCNAME[0]}: sed ${*} > ${_esed_output} ..."
> +
> +		sed "${@}" > "${_esed_output}" \
> +			|| die "failed to run: sed ${*} > ${_esed_output}"
> +
> +		{ newcontents=$(<"${_esed_output}"); } 2>/dev/null \
> +			|| die "failed to read: ${_esed_output}"
> +
> +		local IFS=$'\n' # sed concats with newline even if none at EOF
> +		contents=${contents[*]}
> +		unset IFS
> +
> +		if [[ ${contents} != "${newcontents}" ]]; then
> +			changed=1
> +
> +			[[ -v verbose ]] &&
> +				diff -u --color --label="${files[*]}" --label="${_esed_output}" \
> +					<(echo "${contents}") <(echo "${newcontents}")
> +		fi
>
> ... snip ...
|
I'm not 100% convinced that comparing contents this way will give you
anything meaningful. The warning about ignored null bytes is not so much
noise as it is bash warning you that you're probably doing something
wrong. In this case, you're not capturing _all_ the contents of the
file:
|
[ /tmp ]
oskari@dj3ntoo λ printf "ab\0cd" >test.dat
[ /tmp ]
oskari@dj3ntoo λ hd test.dat
00000000 61 62 00 63 64 |ab.cd|
00000005
[ /tmp ]
oskari@dj3ntoo λ var=$(< test.dat)
bash: warning: command substitution: ignored null byte in input
[ /tmp ]
oskari@dj3ntoo λ printf "$var" | hd
00000000 61 62 63 64 |abcd|
00000004
|
If it's a binary file, there's a decent chance the null bytes are
significant. Now, consider the following hypothetical example where we
want to remove them:
|
[ /tmp ]
oskari@dj3ntoo λ printf "ab\0cd" | sed -e 's/\x00//' | hd
00000000 61 62 63 64 |abcd|
00000004
|
Testing for (in)equality between pre- and post-sed contents is
reasonable enough in most cases. This time, though, it would fail to
detect that anything has changed, since the pre-sed contents have their
null bytes unintentionally stripped, whereas the post-sed contents have
them intentionally stripped.
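
To make that false negative concrete, here is a minimal sketch. It is not
part of the patch: the temporary files and the *_verdict variable names
are made up for illustration, and it assumes bash, GNU sed (for \x00)
and cmp from diffutils.

```shell
#!/usr/bin/env bash
# Hypothetical demo, not from the patch: string comparison vs. byte comparison.
tmpdir=$(mktemp -d)
printf 'ab\0cd' > "${tmpdir}/test.dat"

# Pre-sed snapshot taken the same way as in the patch; bash drops the
# null byte here (the warning is silenced, just like in the patch).
{ before=$(<"${tmpdir}/test.dat"); } 2>/dev/null

# sed really does change the data: 5 bytes in, 4 bytes out.
sed -e 's/\x00//' "${tmpdir}/test.dat" > "${tmpdir}/out.dat"
{ after=$(<"${tmpdir}/out.dat"); } 2>/dev/null

# Comparison on the shell strings, as the patch does.
if [[ ${before} != "${after}" ]]; then
	string_verdict=changed
else
	string_verdict=unchanged
fi

# Comparison on the raw bytes of both files.
if cmp -s "${tmpdir}/test.dat" "${tmpdir}/out.dat"; then
	byte_verdict=unchanged
else
	byte_verdict=changed
fi

echo "string comparison: ${string_verdict}"
echo "byte comparison:   ${byte_verdict}"
rm -r "${tmpdir}"
```

With GNU sed, the string comparison should report "unchanged" while the
byte comparison reports "changed".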
|
While I personally don't think that running sed on binary files is a
good idea in the first place, it's still relevant, since the end result
would be an incorrect answer to the question "Did sed actually do
anything?".
|
On the other hand, saving a set of pre- and post-sed hashes, as Ulrich
suggested, would give the expected result.
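
A rough sketch of what the hash-based check could look like follows. The
helper name _hash_files and the choice of sha256sum are assumptions made
for illustration, not something taken from Ulrich's mail or the patch;
it also assumes GNU sed for -i and \x00.

```shell
#!/usr/bin/env bash
# Hypothetical helper: hash the raw bytes of the given files, null bytes
# included, without ever storing file contents in a shell variable.
_hash_files() {
	cat -- "${@}" | sha256sum
}

tmpdir=$(mktemp -d)
printf 'ab\0cd' > "${tmpdir}/test.dat"

# Only the fixed-size hash lines are kept, before and after running sed.
before=$(_hash_files "${tmpdir}/test.dat")
sed -i -e 's/\x00//' "${tmpdir}/test.dat"
after=$(_hash_files "${tmpdir}/test.dat")

if [[ ${before} != "${after}" ]]; then
	hash_verdict=changed
else
	hash_verdict=unchanged
fi

echo "hash comparison: ${hash_verdict}"
rm -r "${tmpdir}"
```

Since the hash is computed over the bytes on disk, removing the null byte
shows up as a change, and no null byte warning needs to be silenced.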
|
- Oskari |