Apparently, though unproven, at 01:10 on Tuesday 17 May 2011, Felix Miata did
opine thusly:

> After attempting to install for the first time last week, I started 3
> different threads here looking for help. I'm pleased with the nature of the
> responses, and being able to succeed eventually using a mix of those
> responses and my own efforts digging into Google, gentoo.org and cranial
> cobwebs. So, thanks to all who replied, and even to those who showed
> interest without replying.
>
> For http://fm.no-ip.com/Tmp/Linux/G/, newly created to use with those three
> threads, 'cat /var/log/apache2/access_log | grep "GET /Tmp/Linux/G" | grep
> -v <myip> | sort > outfile' generated 117 lines. That's a lot more hits
> than I can ever remember getting before when asking for help from a
> mailing list (even if it did take 5 days to accumulate so many).
>
> I'm curious if anyone here would like to offer a better variant of my local
> query that would limit the hit count so that no more than one hit per IP is
> represented in the output? My skill with such things is very limited. I
> can't think of the name of a command to cut the IP off the front of
> each line, much less how to compare if it's a non-first instance to be
> discarded. Or, maybe there's an Apache utility for doing this that I just
> don't know about?

There are always a million ways to skin a cat like this. At a high-volume site
you would of course not try to deal with this directly from the Apache logs.
You would send them to syslog, which would parse them and write them to a
database where you could run sophisticated SQL.

There are also Apache log analyser apps out there; Google will find them.

But I think all that is overkill for what you want. Your command works fine
except for needing to discard duplicate IPs. You don't seem to need to know
the details of the GET, so just grab the first field with awk and sort | uniq
the result. It will run a tad quicker (and reveal less n00bness to your
audience) if you grep the file directly instead of cat | grep:

grep "GET /Tmp/Linux/G" /var/log/apache2/access_log | grep -v <myip> | \
awk '{print $1}' | sort | uniq | wc

In true grand Unix tradition you cannot get quicker, dirtier or more effective
than that.

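If you would rather keep the log lines themselves, with no more than one hit
per IP, something along these lines might do it. Treat it as a sketch: the
function name first_hit_per_ip and its three arguments are made up for
illustration, and it assumes the client IP is the first whitespace-separated
field of each line, as in the usual Apache common/combined log formats. Unlike
grep -v <myip>, it drops only lines whose first field is exactly your IP.

```shell
#!/bin/sh
# Sketch only: print the first matching log line for each client IP.
# Hypothetical helper, not an Apache utility.
#   $1 = log file, $2 = URL path prefix, $3 = your own IP (to be excluded)
# Assumes the client IP is field 1 of every log line.
first_hit_per_ip() {
    # -F treats the pattern as a fixed string, so dots are not wildcards
    grep -F "GET $2" "$1" |
        # keep a line only if it is not from "me" and its IP is new
        awk -v me="$3" '$1 != me && !seen[$1]++'
}
```

Called as first_hit_per_ip /var/log/apache2/access_log /Tmp/Linux/G <myip>
and redirected to outfile, it should leave one line per visiting IP in the
order they first appear in the log, so no sort is needed. The awk idiom
!seen[$1]++ is true only the first time a given value of field 1 turns up.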

-- 
alan dot mckinnon at gmail dot com