Re: [gentoo-dev] Anti-spam changes: proposal to drop spammy mail - gentoo-dev

From:	Niels Dettenbach <nd@××××××××.com>
To:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] Anti-spam changes: proposal to drop spammy mail
Date:	Tue, 12 May 2015 07:18:34
Message-Id:	`2025350.DvjFDXzWaF@gongo`
In Reply to:	Re: [gentoo-dev] Anti-spam changes: proposal to drop spammy mail by "Robin H. Johnson"

On Monday, 11 May 2015, 20:36:18, Robin H. Johnson wrote:
> There are people that still accept mail that violates standards?

Yes,
and there are mail sites and/or mail clients sending standards-violating email.

But the fuller truth is that many points within the standards are interpreted differently by different people and groups (even by mailer software developers), and there is no really clear, hard "border" between what is a violation (and "could be dropped") and what is not, at least if you do not want to lose ham traffic for your users.

The email ecosystem does not depend on a single RFC today. More and more basic parts of existing Internet mail and its features are defined in further RFCs or follow from one another.

Two very typical examples:

1.) The sender domain has no MX or no abuse contact (e.g. RFC 2142)
Many professional mass mailers do not have a "working" abuse contact, and there are still many smaller sites out there that don't have one either (because of limited DNS access or lack of knowledge). Dropping mail from such sites will make you lose mail (even if it "just" hits one in a thousand ham mails).

2.) BCC header
Most mailers today filter out BCC recipient headers at some point, although this is not defined in the RFCs, and it is still hotly debated how far deleting BCC headers breaks the standards and may result in lost email. See e.g. Philip Hazel's (EXIM's) statements on the net.

> My above statement is for mail that we ACCEPTED. If it violates
> standards, it's already denied at SMTP time.

Hmmm,
you mean the more or very basic points of the standards.

> smtpd_restriction_classes = restrictive,permissive
> restrictive =
>     reject_invalid_hostname
>     reject_non_fqdn_hostname
>     reject_non_fqdn_recipient
>     reject_non_fqdn_sender
>     reject_unknown_sender_domain
>     reject_unknown_recipient_domain
>     check_sender_mx_access cidr:/etc/postfix/bogus_mx_records
>     check_sender_access pcre:/etc/postfix/sender_access_control.pcre
>     check_sender_access pcre:/etc/postfix/sender_access_control-aliases.pcre
>     check_helo_access pcre:/etc/postfix/helo_checks
>     reject_unverified_sender
>     check_client_access cidr:/etc/postfix/filter.cidr
>     permit
> permissive =
>     permit

If it helps you a bit further, I can explain the basics of our setup. It has been developed over nearly 20 years, handles just a few hundred thousand SMTP sessions per day and has NO spam folder or similar (which would not save the email user any time in the end). It could easily be adapted, e.g. via SIEVE, to file less clear-cut, more "unclear" spam into folders (e.g. the mail we currently greylist).

Because sendmail and postfix were sometimes too resource-inefficient for us in the early stages, we decided to go with EXIM (Philip Hazel), using our own build optimized for our needs, which today even includes some of our own modifications.

We avoided running SA from Amavis because of inefficiency.

To this day our incoming path is:

 - EXIM with EXIM SA at SMTP time

This means we use SpamAssassin directly at SMTP time, which allows us to dynamically "react" to, or actively investigate further, an incoming SMTP session if required. By the way, SA is only invoked for non-authenticated mail arriving over the network.

Before EXIM contacts SpamAssassin at this stage, we run a bunch of checks in EXIM similar to yours above, plus some more (see below), which either drop the connection or write data into the headers for further processing. If the connection is still alive, we run a handful of RDNSBL checks which "could reject" the session, and then a handful which just write warnings into the headers, plus data for further processing steps.

If the session is still "alive", EXIM contacts SpamAssassin over a socket and

Here we have 3 "routes", plus a blackhole for extreme scores:

	- low spam -> mail goes through DIRECTLY
	- more than SA 2.3/3.0 - possible spam -> greylisting (3 times TEMP reject)
	- more than SA 5.2 - spam -> REJECT
	- more than SA 33.0 - blackhole

-> This kind of REJECT hits around 5 - 10% of spam connections; all other spam is usually caught earlier, before the full email / mail body has been received.
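
As a rough illustration only (a Python sketch, not the actual EXIM / SA-Exim configuration), the score thresholds above could map to actions like this. The function name `route_for_score` is hypothetical, the greylisting threshold is assumed to be 3.0 (the text gives "2.3/3.0"), and each boundary is read as strictly "more than":

```python
def route_for_score(score, greylist_at=3.0, reject_at=5.2, blackhole_at=33.0):
    """Map a SpamAssassin score to one of the actions listed above."""
    # Check the highest threshold first so each score gets exactly one action.
    if score > blackhole_at:
        return "blackhole"   # silently discard
    if score > reject_at:
        return "reject"      # permanent reject at SMTP time
    if score > greylist_at:
        return "greylist"    # temporary reject, sender has to retry
    return "accept"          # low spam: goes through directly


if __name__ == "__main__":
    for s in (0.5, 4.1, 7.0, 40.0):
        print(s, route_for_score(s))
```

Checking the thresholds from highest to lowest keeps the ranges from overlapping, so a score above 33.0 is never merely rejected.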

Greylisting "remembers" each contact<->contact pair and "quasi-whitelists" the sender address once it has passed greylisting, to avoid further delays in the future. This helps a lot with mail sites and/or clients that use mail systems with a bad reputation while otherwise "working OK".
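
The greylisting behaviour described above could be sketched like this. It is a toy in-memory Python model, not the real implementation: the class name `Greylist`, the count-based deferral (taken from the "3 times TEMP reject" route) and the choice to whitelist the sender address rather than the pair are all assumptions made for the example.

```python
class Greylist:
    """Toy in-memory greylist keyed on (sender, recipient) pairs."""

    def __init__(self, temp_rejects=3):
        self.temp_rejects = temp_rejects  # assumed: "3 times TEMP reject"
        self.attempts = {}                # (sender, recipient) -> deferrals so far
        self.whitelist = set()            # senders that already passed once

    def check(self, sender, recipient):
        """Return "accept" or "defer" for one delivery attempt."""
        sender = sender.lower()
        if sender in self.whitelist:
            return "accept"               # no further delay for known senders
        key = (sender, recipient.lower())
        seen = self.attempts.get(key, 0)
        if seen < self.temp_rejects:
            self.attempts[key] = seen + 1
            return "defer"                # temporary reject, sender must retry
        # The sender kept retrying: let it through and remember it.
        self.whitelist.add(sender)
        del self.attempts[key]
        return "accept"


if __name__ == "__main__":
    g = Greylist()
    print([g.check("a@example.org", "b@example.net") for _ in range(4)])
```

A production greylist would normally also look at the sending IP and at the time between retries; those parts are left out here.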

SA EXIM is also able to do teergrubing (tarpitting), but we do not use it in most situations, except partly for dictionary attacks.

At this point, part of the mail traffic goes to an AMAVIS-NG for virus filtering only (users decide this for themselves), with no SA or RDNSBL checks again at this stage.

EXIM SA is no longer officially maintained, so we maintain it ourselves against current EXIM source trees. It would be nice to get it into Gentoo's EXIM ebuild, e.g. via a USE flag, and it would help if someone is interested. If someone knows of a newer solution that is at least as efficient, it would be nice to know.

Overview of EXIM checks of incoming SMTP sessions (parts of this are implemented in your postfix rules too):

--- snip ---

= HELO/EHLO required by SMTP RFC  See http://www.syndicat.com/faq/email/no_helo/
= Forged IP detected in HELO (it's mine) - $sender_helo_name  See http://www.syndicat.com/faq/email/forged_ip/
= Forged IP detected in HELO: $sender_helo_name
= Forged IP detected in HELO - $sender_helo_name != $sender_host_address  See http://www.syndicat.com/faq/email/forged_ip/
= Forged hostname detected in HELO - you are not $sender_helo_name See http://www.syndicat.com/faq/email/forged_ip/
= HELO is our IP
= $sender_helo_name is a silly HELO.
= RFC 1918 IP address in HELO ( See http://www.syndicat.com/faq/email/rfc1918-helo/ )
= $sender_address_domain is a silly domain. (i.e. localhost)
= HELO should be hostname but is $sender_helo_name . ( See http://www.syndicat.com/faq/email/helo_nohostname/ )
= HELO should be Fully Qualified Domain Name Host.Domain.Tld ( See RFC821 or http://www.syndicat.com/faq/email/helo_nofqdn/ )
= Forged hostname detected in HELO - $sender_helo_name is one of our domains
= Only one recipient accepted for NULL sender
= (DROP) too many unknown users (${eval:$rcpt_fail_count+1} failed recipients)
= Dictionary attack (${eval:$rcpt_fail_count+1} failed recipients).
=> Teergrube: dictionary attack (${eval:$rcpt_fail_count+1} failed recipients)
= unknown user
= X-Broken-Reverse-DNS: no DNS for IP address $sender_host_address
= acl_mail: (WARN-ONLY) Cannot reverse DNS $sender_host_address
= X-Broken-Reverse-DNS: no DNS for IP address $sender_host_address
= Content Policy Restriction: Mails to undisclosed recipients are not permitted.
= No contact MX - rfc-ignorant host $sender_host_name $sender_host_address . ( See http://www.syndicat.com/faq/email/rfc_ignorant/ )
= (WARN-ONLY, no reliable check possible) No MX abuse contact - rfc-ignorant host $sender_host_name $sender_host_address . ( See http://www.syndicat.com/faq/email/rfc_ignorant/ )
--- snap ---
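
A few of the HELO checks listed above can be sketched in Python as follows. This is only an approximation of the idea, not the EXIM ACLs themselves: `check_helo`, `OUR_DOMAINS` and `OUR_IPS` are hypothetical names, the addresses are documentation ranges, the order of the checks is an assumption, and IPv6 literals of the form `[IPv6:...]` are not handled.

```python
import ipaddress

OUR_DOMAINS = {"example.org"}   # hypothetical local domains
OUR_IPS = {"192.0.2.25"}        # hypothetical local address (documentation range)
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def check_helo(helo, client_ip):
    """Return a rejection reason for a HELO/EHLO argument, or None if it passes."""
    if not helo:
        return "HELO/EHLO required"
    # Accept both "1.2.3.4" and the bracketed literal "[1.2.3.4]".
    name = helo.strip("[]").rstrip(".").lower()
    try:
        addr = ipaddress.ip_address(name)
    except ValueError:
        addr = None                       # not an IP, treat it as a hostname
    if addr is not None:
        if str(addr) in OUR_IPS:
            return "HELO is our IP"
        if any(addr in net for net in RFC1918):
            return "RFC 1918 IP address in HELO"
        if addr != ipaddress.ip_address(client_ip):
            return "Forged IP detected in HELO"
        return "HELO should be hostname"
    if name in OUR_DOMAINS:
        return "Forged hostname detected in HELO"
    if "." not in name:
        return "HELO should be Fully Qualified Domain Name"
    return None


if __name__ == "__main__":
    for helo in ("", "[10.1.2.3]", "localhost", "mail.example.net"):
        print(repr(helo), "->", check_helo(helo, "203.0.113.5"))
```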

then

RDNSBL:

deny = sbl-xbl.spamhaus.org : cbl.abuseat.org : zen.spamhaus.org : b.barracudacentral.org : psbl.surriel.com : ix.dnsbl.manitu.net
warn = dnsbl-3.uceprotect.net : ubl.unsubscore.com : dnsbl-1.uceprotect.net : dnsbl.sorbs.net
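
For readers unfamiliar with how such lists are queried, here is a minimal Python sketch of the deny/warn split above (IPv4 only). The function names are hypothetical and the zone lists are just a subset of the ones given. As a simplification, any DNS answer is treated as "listed", whereas real lists return specific 127.0.0.x codes that should be inspected.

```python
import ipaddress
import socket

DENY_ZONES = ["zen.spamhaus.org", "b.barracudacentral.org"]  # subset of the deny list
WARN_ZONES = ["dnsbl.sorbs.net"]                             # subset of the warn list


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """True if the zone answers for the address (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False          # no record (or lookup failure): not listed


def rdnsbl_action(ip, listed=is_listed):
    """Return "deny", "warn" or "ok", checking the deny zones first."""
    if any(listed(ip, zone) for zone in DENY_ZONES):
        return "deny"
    if any(listed(ip, zone) for zone in WARN_ZONES):
        return "warn"         # only add a header, do not reject
    return "ok"


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.99", "zen.spamhaus.org"))
```

The `listed` parameter exists only so the decision logic can be exercised without real DNS lookups.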

hth a bit.


cheerioh,


Niels.
--
---
 Niels Dettenbach
 Syndicat IT & Internet
 http://www.syndicat.com
 PGP: https://syndicat.com/pub_key.asc
---
