I thought I’d take some of the things posted in the forum threads here and here and put them into one blog post.
There were actually two separate policies implemented at the same time: greylisting and “address enumeration detection” (we’ll call it AED). Greylisting is designed to stop spam being accepted from the large number of zombie computers that are connected to the internet. AED is designed to stop other people from discovering which addresses at FastMail are valid email addresses.
One of the main concerns with greylisting is that naive implementations will often delay all email. In our implementation we’ve gone to great lengths to ensure that this doesn’t happen.
- We only greylist hosts that appear to be dialup/dsl hosts of some sort, or hosts that don’t have any valid reverse DNS. This ensures that the vast majority of email servers are never subject to greylisting, and their email is not delayed.
- If a host has been greylisted, and it successfully passes greylisting twice in a 24-hour period (e.g. it correctly attempts to re-deliver a piece of email twice in 24 hours), then that host is whitelisted and not subject to greylisting for the next 24 hours. If it continues to deliver emails, each new delivery will extend the whitelist period. This means that any real email servers connected via dialup/dsl will quickly be whitelisted and not subject to email delays, because a real email server will always retry.
- If a host opens an SMTP session with a HELO that is not an IP address and is not the same as the reverse DNS of its connecting IP, but the forward DNS of the HELO name does resolve to the connecting IP, then that host is not subject to greylisting. (As suggested by hadaso on the forum.) An example: The machine at IP 206.223.169.73 connects to us. The reverse DNS for 206.223.169.73 is 206-223-169-73.beanfield.net, which looks like a common dialup/dsl IP name, and would be a candidate for greylisting. However, the machine advertises itself to us with a “HELO mx3.hub.org” line. Doing a forward lookup of mx3.hub.org gives the IP 206.223.169.73, which is the same as the connecting IP, so we exclude it from greylisting.
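To make the host-selection rules above concrete, here is a minimal Python sketch. It is not the actual FastMail implementation: the function names, the “reverse name embeds the IP's octets” heuristic for spotting dialup/dsl names, and the IPv4-only handling are all assumptions made for illustration. The DNS lookup is passed in as a parameter so the logic can be tried without touching the network.

```python
import ipaddress
import re
import socket


def looks_dynamic(ip, rdns):
    """Assumed heuristic: a reverse name containing all four octets of the
    IPv4 address (e.g. 192-0-2-73.dsl.example.net) looks like a dialup/dsl name."""
    numbers = set(re.findall(r"\d+", rdns))
    return all(octet in numbers for octet in ip.split("."))


def is_ip_literal(helo):
    """True if the HELO argument is an IP address, with or without [brackets]."""
    try:
        ipaddress.ip_address(helo.strip("[]"))
        return True
    except ValueError:
        return False


def forward_lookup(name):
    """Return the set of IPv4 addresses a name resolves to (empty on failure)."""
    try:
        return set(socket.gethostbyname_ex(name)[2])
    except OSError:
        return set()


def subject_to_greylisting(ip, rdns, helo, resolve=forward_lookup):
    """rdns is the reverse DNS name of the connecting IP, or None if it has none."""
    # Only dialup/dsl-looking hosts, or hosts without reverse DNS, are candidates.
    if rdns is not None and not looks_dynamic(ip, rdns):
        return False
    # HELO exemption: a real name, different from the reverse DNS name,
    # whose forward lookup points back at the connecting IP.
    helo_name = helo.rstrip(".").lower()
    differs_from_rdns = rdns is None or helo_name != rdns.rstrip(".").lower()
    if not is_ip_literal(helo) and differs_from_rdns and ip in resolve(helo_name):
        return False
    return True


# Usage with a stand-in resolver (documentation addresses, no network access).
fake_dns = {"mail.example.org": {"192.0.2.73"}}
lookup = lambda name: fake_dns.get(name, set())
exempt = not subject_to_greylisting(
    "192.0.2.73", "192-0-2-73.dsl.example.net", "mail.example.org", resolve=lookup
)
```

In the usage lines, the host has a dsl-style reverse name, but its HELO name resolves back to the connecting IP, so `exempt` ends up true.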
When combined, these features provide an excellent balance of greylisting hosts which should not be sending email, and allowing those hosts which should be sending email to get their email straight through.
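The 24-hour whitelist rule can be sketched in a few lines as well. Again this is only an illustration under assumptions: the class and method names are hypothetical, timestamps are plain seconds, and “extend the whitelist period” is taken to mean “24 hours from the most recent delivery”, which the description above does not spell out.

```python
WINDOW = 24 * 60 * 60  # 24 hours, in seconds


class HostWhitelist:
    def __init__(self):
        self.passes = {}   # ip -> timestamps of recent successful greylist passes
        self.expires = {}  # ip -> time at which the whitelist entry lapses

    def is_whitelisted(self, ip, now):
        return self.expires.get(ip, 0) > now

    def record_pass(self, ip, now):
        """The host retried correctly and a message passed greylisting."""
        recent = [t for t in self.passes.get(ip, []) if now - t < WINDOW]
        recent.append(now)
        self.passes[ip] = recent
        # Two passes inside one 24-hour window earn a 24-hour whitelist entry.
        if len(recent) >= 2:
            self.expires[ip] = now + WINDOW

    def record_delivery(self, ip, now):
        """A whitelisted host delivered mail: push the expiry out again (assumed)."""
        if self.is_whitelisted(ip, now):
            self.expires[ip] = now + WINDOW


# Usage: two passes an hour apart whitelist the host.
wl = HostWhitelist()
wl.record_pass("192.0.2.73", now=0)
after_first = wl.is_whitelisted("192.0.2.73", now=1)
wl.record_pass("192.0.2.73", now=3600)
after_second = wl.is_whitelisted("192.0.2.73", now=3601)
```

A host whose two passes are more than 24 hours apart never accumulates two entries in `recent`, so it stays subject to greylisting.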
Additionally, to help with tracking down any problems, once a message passes greylisting and is accepted, a new “X-Spam-greylist” header is added. This header tells you how many seconds the email was delayed and whether that host has been whitelisted for 24 hours. (Technical: the delay figure is actually how long the last delay for the ip/sender/recipient combination was, so when several emails go from the same person, to the same person, from the same machine in a short time period, the figure will be a bit messy and won’t necessarily match the delay of each individual message.)
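To see why the delay figure gets messy, here is a small Python sketch of per-combination delay tracking. The data structure and method names are assumptions for illustration only. It tracks one delay per ip/sender/recipient combination and does not attempt to reproduce the actual header format, which isn't described here.

```python
class TripletDelays:
    def __init__(self):
        self.pending = {}     # (ip, sender, recipient) -> time of first deferred attempt
        self.last_delay = {}  # (ip, sender, recipient) -> last recorded delay in seconds

    def defer(self, triplet, now):
        """A delivery attempt was temporarily rejected; remember when the wait began."""
        self.pending.setdefault(triplet, now)

    def accept(self, triplet, now):
        """A retry was accepted; return the delay figure that would be reported."""
        if triplet in self.pending:
            self.last_delay[triplet] = now - self.pending.pop(triplet)
        return self.last_delay.get(triplet, 0)


# Usage: two messages share one ip/sender/recipient combination.
delays = TripletDelays()
key = ("192.0.2.73", "alice@example.org", "bob@example.com")
delays.defer(key, now=0)      # message A is deferred
delays.defer(key, now=200)    # message B is deferred; same combination, same entry
figure_a = delays.accept(key, now=300)  # A is retried and accepted
figure_b = delays.accept(key, now=310)  # B is retried and accepted
```

Message B only waited 110 seconds, yet in this sketch it reports the same figure as message A, because the figure belongs to the combination and not to the individual message.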
All up, we now have 4 lines of defense against spam:
- RBLs (dsbl/xbl) – all users
- Greylisting – all users
- SpamAssassin – full/enhanced users
- Backscatter detection – full/enhanced users
The combination of these 4 things provides an extremely strong defense against spam with absolutely no user interaction at this point. We also hope to add per-user Bayes databases later this year, which will allow per-user training of a statistical database to catch the last few spams that make it through all these filters.
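As a quick summary of which of the four current layers apply to which accounts, here is a tiny Python sketch. The table layout and the `layers_for` helper are hypothetical. The only service levels named are the “full” and “enhanced” ones mentioned in the list above, and any other level is assumed to get just the all-user layers.

```python
# (layer name, True if only full/enhanced users get it), in the order listed above.
LAYERS = [
    ("RBLs (dsbl/xbl)", False),
    ("Greylisting", False),
    ("SpamAssassin", True),
    ("Backscatter detection", True),
]

FULL_OR_ENHANCED = {"full", "enhanced"}


def layers_for(service_level):
    """Return the names of the spam defenses applied to an account of this level."""
    has_all = service_level.lower() in FULL_OR_ENHANCED
    return [name for name, restricted in LAYERS if has_all or not restricted]


# Usage
enhanced_layers = layers_for("enhanced")  # all four layers
other_layers = layers_for("other")        # only RBLs and greylisting
```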