Gentoo Archives: gentoo-user

From: Michael Orlitzky <mjo@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] script help - removing newlines
Date: Wed, 08 Dec 2021 12:51:48
Message-Id: YbCqUSP1XNRzHaN4@stitch
In Reply to: [gentoo-user] script help - removing newlines by Adam Carter
On 2021-12-08 17:15:43, Adam Carter wrote:
>
> but sometimes there are newline characters in the comment field;
>
> property "something"
>
> comment "something
>
> something else
>
> a third thing"
>
> I want to replace any newlines between 'comment "' and the next '"' with
> spaces so the whole comment is on a single line. How can it be done?

It depends on how complicated the format of your file can be. For the
one example shown, you could write a Python script that looks for

  comment "

at the beginning of a line, and then scans forward one character at a
time, looking for the ending quotation mark, but deleting any newlines
it finds along the way. Things like this work great until they meet
the real world:

1. Are you sure there's always exactly one space between the word
   'comment' and the quotation mark?

2. Can there be space before the word 'comment'?

3. Are you sure nobody is using single quotes instead of double quotes?
   How about the fancy non-ASCII quotes that you get sometimes when
   copy/pasting from a webpage or a Word document?

4. What happens if a comment contains double-quotes?

etc. If you're the one creating the data or if you're sure that the
format will be exactly what you say it is, then you can get away with
a simple script. (Otherwise, the answer is basically "write a parser.")
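
For what it's worth, the simple script described above might look roughly
like the sketch below. It is only a sketch, and it assumes the strict format:
the marker is exactly comment " (one space, plain ASCII double quote) at the
very start of a line, and comments never contain a double quote. The name
join_comment_lines is made up for illustration. It also substitutes a single
space for each run of newlines, as the original question asked, rather than
deleting them outright.

```python
MARKER = 'comment "'  # assumption: exactly one space, plain ASCII double quote


def join_comment_lines(text: str) -> str:
    """Replace newlines inside comment "..." fields with single spaces."""
    out = []
    i = 0
    in_comment = False
    while i < len(text):
        at_line_start = i == 0 or text[i - 1] == "\n"
        if not in_comment and at_line_start and text.startswith(MARKER, i):
            # Found the opening marker at the beginning of a line.
            out.append(MARKER)
            i += len(MARKER)
            in_comment = True
            continue
        ch = text[i]
        if in_comment:
            if ch == '"':
                # Assumption: the next double quote always ends the comment.
                in_comment = False
            elif ch == "\n":
                # Collapse a run of newlines into one space.
                while i < len(text) and text[i] == "\n":
                    i += 1
                out.append(" ")
                continue
        out.append(ch)
        i += 1
    return "".join(out)
```

To use it on a file you would read the whole file into a string, pass it
through join_comment_lines, and write the result back out. Every question in
the list above is a way to break it: an extra space, leading indentation,
a different kind of quote, or a quote inside the comment will each make it
either miss the comment or stop too early.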