Gentoo Archives: gentoo-user

From: Alan McKinnon <alan@××××××××××××××××.za>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: [OT] Question about duplicate lines in file
Date: Mon, 12 Jun 2006 19:06:35
Message-Id: 200606122039.20722.alan@linuxholdings.co.za
In Reply to: [gentoo-user] Re: [OT] Question about duplicate lines in file by Christer Ekholm
On Monday 12 June 2006 19:55, Christer Ekholm wrote:
> Teresa and Dale <teendale@×××××××××××××.com> writes:
> > Thanks, read the man page, it was short so it didn't take long.
> > I tried this:
> >
> > uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort
> >
> > It doesn't look like it did anything but copy the same thing
> > over. There are only 2 lines missing. Do spaces count? Some
> > put in a lot of spaces between the localhost and the web address.
> > Maybe that has an effect??
>
> The problem with uniq is that it (according to the manpage),
>
> "Discard all but one of successive identical lines"
>
> You need to have a sorted file for uniq to do what you want, or
> sort it with the -u option
>
> sort -u hosts > hostsort
>
> If you don't want to ruin your original order you have to do
> something else. This is one way of doing it with perl.
>
> perl -ne 'print unless exists $h{$_}; $h{$_} = 1' hosts > hostsort

Almost there :-)

If /etc/hosts has these lines:
127.0.0.1 localhost
127.0.0.1      localhost
uniq will see these as different even though they are actually the
same entry. So he needs something like tr to squash spaces. This will
do it (as root):

cat /etc/hosts | tr -s ' ' | sort | uniq -i > /etc/hosts.new

If the new file is OK, use it to overwrite /etc/hosts
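
A minimal sketch of that check-and-overwrite step, run on scratch copies so nothing under /etc is touched. The temporary directory and the hosts.bak backup name are placeholders for illustration, not something from this thread; on the real file the equivalent commands would be run as root against /etc/hosts and /etc/hosts.new.

```shell
# Scratch copies stand in for /etc/hosts and /etc/hosts.new (placeholder paths).
workdir=$(mktemp -d)
printf '127.0.0.1 localhost\n127.0.0.1      localhost\n' > "$workdir/hosts"
tr -s ' ' < "$workdir/hosts" | sort | uniq -i > "$workdir/hosts.new"

# Sanity check: compare line counts before and after.
before=$(wc -l < "$workdir/hosts")
after=$(wc -l < "$workdir/hosts.new")
echo "lines before: $((before)), after: $((after))"

# Keep a backup of the original, then put the cleaned file in its place.
cp "$workdir/hosts" "$workdir/hosts.bak"
mv "$workdir/hosts.new" "$workdir/hosts"
```

Keeping the backup means a bad result can be undone by copying hosts.bak back.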

Explanation so Dale knows what I'm asking him to do:
cat sends the file to tr
tr finds all cases of two or more consecutive spaces and replaces them
with one space
sort sorts the lines
uniq finds consecutive lines that are the same and throws away the
extra ones. The -i is there just in case two entries differ in case
only (as FQDNs are strictly speaking case insensitive). As mentioned
by others, uniq only matches consecutive dupes, so the list must be
sorted first
"> /etc/hosts.new" writes the final output to the named disk file
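
To see what each stage contributes, the same pipeline can be run one step at a time on a small sample. This is only a sketch: the addresses, host names and scratch file are made up, and tr reads the file directly instead of going through cat, which gives the same result. The last command is a variant that is not part of the pipeline above: it swaps sort and uniq for awk so that the first copy of each line stays in its original position, similar in spirit to the perl one-liner quoted earlier.

```shell
# Hypothetical sample data in a scratch file; the real target would be /etc/hosts.
sample=$(mktemp)
printf '%s\n' \
  '127.0.0.1 localhost' \
  '10.0.0.2    Example.com' \
  '127.0.0.1     localhost' \
  '10.0.0.2 example.com' > "$sample"

# Stage 1: squeeze every run of spaces down to a single space.
squeezed=$(tr -s ' ' < "$sample")

# Stage 2: sort, so that identical lines end up next to each other.
sorted=$(printf '%s\n' "$squeezed" | sort)

# Stage 3: drop adjacent duplicates, ignoring upper/lower case.
result=$(printf '%s\n' "$sorted" | uniq -i)
printf '%s\n' "$result"

# Variant (an assumption, not from the thread): keep the original order.
# awk prints a line only the first time its lower-cased form is seen.
ordered=$(tr -s ' ' < "$sample" | awk '!seen[tolower($0)]++')
printf '%s\n' "$ordered"

rm -f "$sample"
```

Both approaches leave two lines here. Note that tr -s ' ' only squeezes spaces; if entries are separated by tabs, something like tr -s '\t ' '  ' (turn tabs into spaces, then squeeze) would be needed instead.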

Cheers,
alan

p.s. Those 15,000 entries in your hosts file are, um, a lot :-)


--
If only me, you and dead people understand hex,
how many people understand hex?

Alan McKinnon
alan at linuxholdings dot co dot za
+27 82, double three seven, one nine three five
