> The best tool for this is the pf packet filter, but it runs on FreeBSD.

It's too bad this isn't still around:

https://wiki.gentoo.org/wiki/Gentoo_FreeBSD

On Wed, Oct 4, 2017 at 11:21 AM, Alan McKinnon <alan.mckinnon@×××××.com>
wrote:

> On 04/10/2017 07:28, Walter Dnes wrote:
> > I have some doubts about massive "hosts" files for adblocking. I
> > downloaded one that listed 13,148 sites. I fed them through a script
> > that called "host" for each entry, and saved the output to a text file.
> > The result was 1,059 addresses. Note that some adservers have multiple
> > IP address entries for the same name. A back-of-the-envelope analysis
> > is that close to 95% of the entries in the large host file are invalid,
> > and return "not found: 3(NXDOMAIN)".
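
For anyone who wants to repeat that measurement, here is a minimal sketch of how the saved "host" output could be tallied. It is not Walter's script. The function name summarize_host_output is made up for illustration, and it assumes the output lines look like "NAME has address A.B.C.D" or "Host NAME not found: 3(NXDOMAIN)"; every other record type is ignored.

```shell
# Hypothetical helper: read saved `host` output on stdin and print one
# summary line. Assumes "NAME has address A.B.C.D" for hits and
# "Host NAME not found: 3(NXDOMAIN)" for dead names.
summarize_host_output() {
    awk '
        / has address / {
            resolved[$1] = 1     # name that resolved at least once
            addrs[$NF] = 1       # distinct IPv4 address seen
        }
        index($0, "not found: 3(NXDOMAIN)") { dead++ }
        END {
            for (n in resolved) names++
            for (a in addrs) unique++
            printf "resolved_names=%d unique_addresses=%d nxdomain=%d\n",
                   names, unique, dead
        }'
}

# Possible usage (names.txt is a hypothetical file, one hostname per line):
#   while read -r name; do host "$name"; done < names.txt | summarize_host_output
```
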
> >
> > I'm not here to trash the people compiling the lists; the problem is
> > that hosts files are the wrong tool for the job. Advertisers know about
> > hosts files and deliberately generate random subdomain names with short
> > lifetimes to invalidate the hosts files. Every week the sites are
> > probably mostly renamed. Further analysis of the 1,059 addresses shows
> > 810 unique entries, i.e. 249 duplicates. It gets even better. 44
> > addresses show up in 52.84.146.xxx; I should probably block the entire
> > /24 with one entry. There are multiple similar occurrences, which could
> > be aggregated into small CIDRs. So the number of blocking rules is
> > greatly reduced.
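
The de-duplication and per-/24 grouping described above could be sketched roughly like this. The function name count_per_24 is invented for the example, and it assumes the input is one IPv4 address per line. It only counts; deciding which /24s are safe to block wholesale is still a judgment call.

```shell
# Hypothetical helper: read one IPv4 address per line on stdin, drop
# duplicates, and print "<count> <a.b.c>.0/24", busiest /24 first.
count_per_24() {
    sort -u |
    awk -F. 'NF == 4 { print $1 "." $2 "." $3 ".0/24" }' |
    sort | uniq -c | sort -k1,1nr -k2,2 |
    awk '{ print $1, $2 }'   # normalise the padding that uniq -c adds
}

# Possible usage (addresses.txt is a hypothetical file):
#   count_per_24 < addresses.txt | head
```
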
> >
> > I'm not a deep networking expert. My question is whether I'm better
> > off adding iptables reject/drop rules or "reject routes", e.g.:
> >
> >   route add -net 10.0.0.0 netmask 255.0.0.0 metric 1024 reject
> >
> > (an example from the "route" man page). iptables rules have to be
> > duplicated coming and going to catch inbound and outbound traffic. A
> > reject route only needs to be entered once. This exercise is intended
> > to block web adservers, so another question is how web browsers react to
> > route versus iptables blocking.
> >
> > While I'm at it (I did say I'm not an expert) is there another way to
> > handle this? E.g. redirect "blocked CIDRs" via iptables or route to a
> > local pixel image? Will that produce an immediate response by the web
> > browser, versus timing out with "regular blocking"?
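
To make the two options concrete, here is a hedged sketch that only prints the commands for a list of CIDRs so they can be reviewed before anything is run as root. The function name emit_block_rules and its "route"/"iptables" modes are made up for this example. It assumes iproute2's "unreachable" route type as the counterpart of the net-tools "reject" route, and it assumes the box is a workstation, so a single OUTPUT rule is enough to stop the connection from ever starting (a router would need FORWARD instead). As far as I know, both forms fail the connection right away, whereas a plain DROP leaves the browser waiting for a timeout.

```shell
# Hypothetical helper: read one CIDR per line on stdin and PRINT (not run)
# the blocking command for the chosen mode: "route" or "iptables".
emit_block_rules() {
    mode=$1
    while read -r cidr; do
        [ -n "$cidr" ] || continue
        case $mode in
            route)
                # unreachable route: local connects should fail at once
                echo "ip route add unreachable $cidr" ;;
            iptables)
                # a TCP reset is an explicit refusal rather than silence
                echo "iptables -A OUTPUT -d $cidr -p tcp -j REJECT --reject-with tcp-reset" ;;
            *)
                echo "unknown mode: $mode" >&2
                return 1 ;;
        esac
    done
}

# Possible usage (cidrs.txt is a hypothetical file):
#   emit_block_rules route < cidrs.txt
```
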
> >
>
> This is a complex problem with no cut-and-dried solution. It's real life
> and as you know real life is murky.
>
> Let's define the real problem you are wanting to solve: there's a bunch
> of ad servers out there, and you want them to disappear. Or more
> accurately, you want their traffic to disappear from *your* wires.
>
> There are really 3 approaches as you know:
> - redefine the hostname to be a blackhole (e.g. 127.0.0.1)
> - find the addresses or subnets and drop/reject the packets with iptables
> - find the subnets (sometimes the individual hosts) and route them into a
>   blackhole
>
> Each has its strengths and weaknesses:
> - packet filters work best at the TCP/UDP/ICMP layer where you have an
>   address and often a port
> - routing works best at the IP layer where you have whole chunks of
>   subnets and tell the router what to do with all traffic from that entire
>   subnet
> - hosts files work best at the name layer where you have DNS names
>
> Your problem seems to slot in somewhere between a firewall and a routing
> solution, explaining why you can't decide. Hosts files suck major big
> eggs for this, as you know; people still use them because they seem legit
> (but aren't) and they understand them, whereas they don't understand the
> other 2.
>
> Ad providers are well aware of this. I was surprised to see
> 52.84.146.0/24 show up in your mail, as that is Amazon's AWS range. Yes,
> you could null-route that subnet, but it's Amazon and maybe there are
> hosts in there that you DO want to use.
>
> I'd suggest you use a packet filter, but not on Linux and certainly not
> iptables. That thing is a god-awful mess looking like it was built by
> unsupervised schoolkids masquerading as interns. The best tool for this
> is the pf packet filter, but it runs on FreeBSD. Get yourself a spare
> machine, load pfSense on it (it's an appliance like wrt) and drop the
> traffic from all offensive addresses. Drop, not reject.
>
> You could in theory do the same thing with iptables, but the ruleset
> will quickly drive you nuts. Perhaps the ipset plugin would help; I've
> been meaning to check it out for ages and never got around to it.
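
On the ipset idea, a minimal sketch of what that might look like is below. Again it only prints commands for review. The function name emit_ipset_rules and the set name "adblock" are arbitrary choices for this example. The point of the design is that the whole CIDR list lives in one set, so iptables needs a single rule no matter how long the list grows, and it uses DROP as suggested above.

```shell
# Hypothetical helper: read one CIDR per line on stdin and PRINT an
# ipset-based ruleset. The set name defaults to "adblock" (arbitrary).
emit_ipset_rules() {
    set_name=${1:-adblock}
    echo "ipset create $set_name hash:net"
    while read -r cidr; do
        if [ -n "$cidr" ]; then
            echo "ipset add $set_name $cidr"
        fi
    done
    # one rule matches every destination stored in the set
    echo "iptables -I OUTPUT -m set --match-set $set_name dst -j DROP"
}

# Possible usage (cidrs.txt is a hypothetical file):
#   emit_ipset_rules < cidrs.txt
```
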
>
> --
> Alan McKinnon
> alan.mckinnon@×××××.com

--
Regards,

*Lucas Ramage* / Software Engineer
ramage.lucas@×××××××××××.org / (941) 404-6794

*PGP Fingerprint* / Learn More <https://emailselfdefense.fsf.org/en/>
EAE7 45DF 818D 4948 DDA7 0F44 F52A 5A96 7B9B 6FB7
<https://pgp.mit.edu/pks/lookup?op=get&search=0xF52A5A967B9B6FB7>

*Visit online journal* <https://lramage94.github.io/>
Github <https://github.com/lramage94>
Linkedin <https://www.linkedin.com/in/lramage94>