Hotmail vs Live accounts

For many years, we’ve allowed retrieving email from Hotmail accounts via the Pop Links screen. To retrieve the email we use the same mechanism as Outlook Express (the httpmail protocol, based on WebDAV). A couple of years ago, Microsoft announced that they would be disabling external Outlook Express access. It appears their process of actually disabling external access to accounts has been slow and haphazard. Most existing accounts created before the change in policy still work, but most newly created accounts do not.

On top of this, Microsoft now seems to be moving to yet another system. Windows Live Mail Desktop Beta appears to use a new protocol and service endpoint (http://mail.services.live.com/DeltaSync_v1.0.0/sync.aspx). It seems all new Live mail accounts and Hotmail accounts can be accessed via this new protocol, but all access to these accounts via the old protocol has been disabled.

From what I can tell, this new protocol isn’t actually documented anywhere, so I’ve spent some time trying to reverse engineer it so that we can retrieve email from these accounts. I came quite close: I was able to authenticate, retrieve a list of folders and messages, and even retrieve each message. Unfortunately the final message data is compressed in some format that I can’t find any documentation or example of. What this means is that I’ve currently reached a dead end and can’t see how we can retrieve Hotmail messages using the new protocol :(

For others interested in helping work out what’s needed, here’s a summary of where I’ve got to:

1. You first have to get an authentication token by sending a request to https://login.live.com/RST.srf. An example of this is available here: http://msnpiki.msnfanatic.com/index.php/MSNP13:SOAPTweener. You use the endpoint mail.services.live.com.

2. Once you’ve got the authentication ticket, you use the DeltaSync endpoint. Using the retrieved ticket, you send a POST request to “http://mail.services.live.com/DeltaSync_v1.0.0/Sync.aspx?$ticket” to get a new set of sync tokens.

<?xml version="1.0" encoding="utf-8"?><Sync xmlns="AirSync:" xmlns:A="EMAIL:" xmlns:B="HMMAIL:" xmlns:C="HMFOLDER:" xmlns:D="HMSYNC:"><Collections><Collection><Class>Email</Class><SyncKey>0</SyncKey></Collection><Collection><Class>Folder</Class><SyncKey>0</SyncKey></Collection></Collections></Sync>

The lack of spaces and newlines seems to be important. You may also get a redirect; if you do, follow that first. The returned result will have two SyncKey values, one for the Email class and one for the Folder class. Just extract those sync keys.

3. Repeat the request in 2, but replace the “0” values with the SyncKeys returned from the request. You should get a complete list of emails and folders back.

4. You can then retrieve a particular email by sending a POST to “/DeltaSync_v1.0.0/ItemOperations.aspx” (use whatever host you were redirected to) with the following request.

<?xml version="1.0" encoding="utf-8"?><ItemOperations xmlns="ItemOperations:" xmlns:A="HMMAIL:"><Fetch><Class>Email</Class><A:ServerId>$ServerId</A:ServerId><A:Compression>hm-compression</A:Compression></Fetch></ItemOperations>

Replace $ServerId with one of the IDs returned in the list from step 3.

5. The returned result will be a DIME encoded result. Using a DIME parser, you find the payload with an id of “uuid:$ServerId” and get the content data.
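The request bodies from steps 2 to 4 can be sketched in Python as below. This is only a minimal sketch: the function names are my own, and it covers just building the XML with no whitespace between tags (which, as noted above, seems to matter). Sending the POST and following any redirect is left out.

```python
from xml.sax.saxutils import escape

# Templates copied from the requests shown above, kept on one logical line
# each because the lack of spaces and newlines seems to be important.
_SYNC_TEMPLATE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<Sync xmlns="AirSync:" xmlns:A="EMAIL:" xmlns:B="HMMAIL:"'
    ' xmlns:C="HMFOLDER:" xmlns:D="HMSYNC:">'
    '<Collections>'
    '<Collection><Class>Email</Class><SyncKey>{email_key}</SyncKey></Collection>'
    '<Collection><Class>Folder</Class><SyncKey>{folder_key}</SyncKey></Collection>'
    '</Collections></Sync>'
)

_FETCH_TEMPLATE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<ItemOperations xmlns="ItemOperations:" xmlns:A="HMMAIL:">'
    '<Fetch><Class>Email</Class>'
    '<A:ServerId>{server_id}</A:ServerId>'
    '<A:Compression>hm-compression</A:Compression>'
    '</Fetch></ItemOperations>'
)


def build_sync_body(email_key="0", folder_key="0"):
    """Body for Sync.aspx: "0" keys for step 2, the returned keys for step 3."""
    return _SYNC_TEMPLATE.format(
        email_key=escape(email_key), folder_key=escape(folder_key)
    )


def build_fetch_body(server_id):
    """Body for ItemOperations.aspx (step 4) for one message id."""
    return _FETCH_TEMPLATE.format(server_id=escape(server_id))
```

Calling `build_sync_body()` with no arguments reproduces the initial request from step 2; passing the two returned keys gives the request for step 3.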

At this point, it seems we have the email content, but it appears to be compressed in some way that I haven’t been able to identify. In the request where we pass “<A:Compression>hm-compression</A:Compression>”, I’ve tried a number of different options: removing the tags, leaving the text between them empty, and using the text none or raw. In every case I just get a fault code in response.
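For anyone who wants to poke at the compressed data, the DIME envelope from step 5 has to be unpacked first. Here is a rough sketch of a DIME record parser, assuming the record layout from the DIME draft specification (a 12 byte big-endian header, then options, id, type and data, each padded to a multiple of 4 bytes). The function names are my own, and chunked records are not reassembled.

```python
import struct


def _padded(length):
    """Round a field length up to the next multiple of 4 bytes."""
    return (length + 3) & ~3


def parse_dime(message):
    """Split a DIME message (bytes) into a list of record dicts."""
    records = []
    offset = 0
    while offset + 12 <= len(message):
        flags, _type_byte, opt_len, id_len, type_len, data_len = struct.unpack_from(
            ">BBHHHI", message, offset
        )
        offset += 12
        if flags >> 3 != 1:  # top 5 bits of the first byte hold the version
            raise ValueError("unsupported DIME version")
        fields = []
        for length in (opt_len, id_len, type_len, data_len):
            fields.append(message[offset:offset + length])
            offset += _padded(length)
        _options, record_id, record_type, data = fields
        records.append({
            "id": record_id.decode("ascii", "replace"),
            "type": record_type.decode("ascii", "replace"),
            "data": data,
        })
        if flags & 0x02:  # ME (message end) bit: this was the last record
            break
    return records


def find_payload(message, server_id):
    """Return the data of the record whose id is uuid:<server_id>, or None."""
    wanted = "uuid:" + server_id
    for record in parse_dime(message):
        if record["id"] == wanted:
            return record["data"]
    return None
```

The bytes returned by `find_payload` are the still-compressed message content described above.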

If anyone wants to do some experimentation and work out what’s going on here, email me at robm@fastmail.fm with what you find out.

Of course there are other solutions to accessing Hotmail and Yahoo accounts that involve screen scraping, but screen scraping techniques are notoriously fragile. Every time Microsoft/Yahoo/etc change their web interface slightly, it can break the screen scraping software. There may also be language differences between users’ interfaces that the scraper doesn’t handle, so it might work for some users but not others. So it’s a constant battle to keep it up to date and deal with the large number of support problems it can generate when something does break. For these reasons, we don’t intend to use screen scraping solutions.

Posted in Technical. Comments Off

Login problems/Google search result change on "fastmail"

A few days ago, a number of people started reporting problems logging into FastMail. The symptom was always the same: they could get to the FastMail login screen, but entering their username and password always returned a “The user name or password you entered was incorrect. Please try again.” error. We did some checking, and all the accounts reported as affected appeared to be fine.

Well, we finally tracked down the problem. It appears that in Firefox, if you just type “fastmail” into the address bar, since it’s not a valid domain name, Firefox does a Google search for “fastmail”, takes the first result, and goes to that website. Normally that’s been fine, because http://www.fastmail.fm has been the first result for that search. Additionally, if you try to log in to the web interface using just the first part of your username (eg. joeblogs) rather than your whole username (eg. joeblogs@fastmail.fm), then by default we’ll append the domain of the webpage you went to. So it seems a number of people with @fastmail.fm accounts just type “fastmail” into the address bar and then log in using only the first part of their username (the bit before the @), which all worked fine.

Now for some reason a few days ago, Google changed their search results. If you do a Google search for “fastmail”, they now return http://www.fmail.co.uk as the first result. That’s one of our many domains, so what you see is still the same login screen. However, if you try to log in with just the first part of your username, instead of assuming it’s an @fastmail.fm account, it’ll assume it’s an @fmail.co.uk account. Of course in most cases that account won’t actually exist, and you’ll get the “The user name or password you entered was incorrect” error.

Now we know that Google generally suppresses multiple pages that have the same content, which is common in the case of our sites because http://www.fastmail.fm, http://www.eml.cc, http://www.myfastmail.com, etc all point to our login page. We have no idea why Google suddenly decided that http://www.fmail.co.uk was the preferred domain over http://www.fastmail.fm, and no idea how to convince them to change back either.

In the meantime, we’ll make it so that when a login fails, it now shows:

“The user name (joeblogs@fastmail.fm) or password you entered was incorrect. Please try again.”

Which should make it quickly obvious when there’s a domain mismatch problem.
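To make the domain mismatch concrete, here is a small sketch of the defaulting behaviour described above. It is not our actual login code; the function names and the stripping of a leading “www.” are assumptions made for illustration.

```python
def qualify_username(username, host):
    """Append the domain of the visited site when only the local part is given.

    `host` is the hostname the login page was loaded from (hypothetical input).
    """
    if "@" in username:
        return username  # already a full username, leave it alone
    domain = host.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    return f"{username}@{domain}"


def login_error(username, host):
    """Failure message that shows the full username that was actually tried."""
    full_name = qualify_username(username, host)
    return (
        f"The user name ({full_name}) or password you entered was incorrect. "
        "Please try again."
    )
```

With this, `qualify_username("joeblogs", "www.fmail.co.uk")` gives `joeblogs@fmail.co.uk`, which is exactly the mismatch the new error message is meant to expose.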

For people who are used to typing “fastmail” in the address bar to log in, we recommend just typing “fastmail.fm” instead. It’s only 3 extra characters, and will correctly take you to http://www.fastmail.fm where you can still use just the first part of your login name.

Posted in Technical. Comments Off

X-Spam-hits header has spam scores added

FastMail has for many years added an “X-Spam-hits” header to show which SpamAssassin rules were triggered by an email. Unfortunately, until now, finding the score of each of those hits involved looking it up in a table on the SpamAssassin website. Now those scores have been added directly to the X-Spam-hits header immediately after each hit. So a header like this:

X-Spam-hits: BAYES_99 3.5, EXTRA_MPART_TYPE 1.091, HTML_MESSAGE 0.001, SPAMMY_XMAILER 1
X-Spam-score: 5.5

shows that BAYES_99 had a score of 3.5, EXTRA_MPART_TYPE a score of 1.091, etc. Adding these all up gives the final score of 5.5 (always rounded to 1 decimal place).
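If you want to process the new header in a script, something like the following sketch should work. It assumes the value is a comma separated list of “RULE_NAME score” pairs, as in the example above; the function name is my own. It returns the unrounded scores, so the total may differ slightly from the X-Spam-score header depending on how that value is rounded.

```python
from decimal import Decimal


def parse_spam_hits(header_value):
    """Turn the value of an X-Spam-hits header into {rule_name: score}."""
    hits = {}
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        # The score is the last whitespace-separated token of each item.
        name, score = item.rsplit(None, 1)
        hits[name] = Decimal(score)
    return hits


hits = parse_spam_hits(
    "BAYES_99 3.5, EXTRA_MPART_TYPE 1.091, HTML_MESSAGE 0.001, SPAMMY_XMAILER 1"
)
total = sum(hits.values())  # unrounded sum of the individual rule scores
```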

Posted in Technical. Comments Off

FTP Server limited to 1 connection per user

We’ve had a couple of users doing reasonable things which just happen to hit pathological cases in our filesystem implementation. To stop this happening, we’ve had to restrict the FTP server to one connection per user. Hopefully we can remove this limitation again soon, but for now it’s needed to ensure that service keeps working smoothly for everyone else.

Posted in Technical. Comments Off

PDF XSS attack protection

I’ve just rolled out some checks to help protect our users from a particular family of XSS attacks via links to PDF files. If you’re viewing an HTML message that contains one of these links via the web interface, then the Phishing Protection will disable the link with a warning. URLs of this form that appear in a text message will not be converted to a clickable link.
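The exact URL pattern being checked isn’t spelled out here. Purely as an illustration, and assuming the links in question are ones that point at a .pdf file and carry a fragment (the part after the #), a check along these lines would flag them. The function name and the rule itself are assumptions, not the actual Phishing Protection code.

```python
from urllib.parse import urlsplit


def is_suspicious_pdf_link(url):
    """Flag links to a .pdf path that also carry a fragment (assumed rule)."""
    parts = urlsplit(url)
    points_at_pdf = parts.path.lower().endswith(".pdf")
    return points_at_pdf and bool(parts.fragment)
```

A plain link to a PDF with no fragment is not flagged by this sketch, and neither is a fragment on a non-PDF page.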

This should reduce the likelihood of users being compromised by such links sent to them in email messages.

For more information, see this forum thread.

Posted in Technical. Comments Off

Web/IMAP/POP frontend proxies changed to nginx

A while back, we changed our frontend IMAP/POP proxy from perdition to nginx. Perdition uses a traditional unix “one process per connection” model to manage the proxying of IMAP/POP requests. Because of the long lived nature of IMAP connections, perdition was using over 8,000 processes on each of our frontend machines. Even with Linux 2.6 and the O(1) scheduler, the machines were beginning to struggle with the large number of processes, creating a sluggish feeling to IMAP connections.

Instead of a process per connection, nginx uses a small fixed process pool and non-blocking code with epoll (on Linux) to provide much higher scalability. At the time we first looked at nginx, it only supported HTTP proxying, but we realised the underlying architecture would be a good one for IMAP/POP proxying as well. With that in mind, we contacted the author of nginx (Igor Sysoev, who we were already familiar with due to mod_accel) to implement an IMAP/POP proxy in nginx. We agreed to pay him for this, and to allow the code to be included in the regular nginx distribution. Over the next couple of months he implemented it, and after some testing and bug fixing, we rolled it out to our frontend proxy servers in Sep 2005. The results for us were dramatic: load on our frontend servers dropped sharply, and IMAP/POP responsiveness improved noticeably.
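To give a feel for the difference, here is a toy illustration of the event-driven model in Python. It is not how nginx is implemented (nginx is written in C); it just shows one process servicing many connections through a readiness selector, which uses epoll on Linux. The function names are made up, and echoing the data back stands in for forwarding it to a backend server.

```python
import selectors
import socket


def service_ready(selector, timeout=1.0):
    """One pass of the event loop: handle every connection that has data."""
    handled = 0
    for key, _events in selector.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            conn.sendall(data)  # echo stands in for proxying to a backend
        else:
            selector.unregister(conn)  # peer closed the connection
            conn.close()
        handled += 1
    return handled


def demo(connections=50):
    """Serve many connections from a single process, with no thread per client."""
    selector = selectors.DefaultSelector()
    clients = []
    for _ in range(connections):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        selector.register(server_side, selectors.EVENT_READ)
        client.settimeout(2.0)
        clients.append(client)
    for number, client in enumerate(clients):
        client.sendall(b"ping %d" % number)
    handled = 0
    while handled < connections:
        count = service_ready(selector)
        if count == 0:
            break  # nothing became ready within the timeout
        handled += count
    replies = [client.recv(4096) for client in clients]
    for client in clients:
        client.close()
    for key in list(selector.get_map().values()):
        selector.unregister(key.fileobj)
        key.fileobj.close()
    selector.close()
    return replies
```

The point of the design is that an idle connection costs only a registered file descriptor, rather than a whole process sitting in the scheduler.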

Because web connections aren’t as long lived as IMAP connections, we stayed with Apache for our frontends for a while longer. However, we’ve now switched over to using nginx for our frontend web proxy as well. This has also allowed us to increase the keep-alive timeout for HTTP connections to 5 minutes, which should result in a small perceptible improvement when moving between pages.

The net result of all this is that each frontend proxy server currently maintains over 10,000 simultaneous IMAP, POP, Web & SMTP connections (including many SSL ones) using only about 10% of the available CPU on 3.20GHz Netburst Xeon based CPUs.

Posted in Technical. Comments Off

Actions performed on intersection of searched & view & selected

As mentioned in this forum post, I’ve made a change on the beta server in the way messages are actioned on the mailbox screen.

On the regular server, when you select an action on the mailbox screen and click “Do”, it’s performed on ALL selected messages in the folder, even if some selected messages aren’t currently visible because of the current view or search.

Currently on the beta server, when you select an action on the mailbox screen and click “Do”, it’s performed on the intersection of selected messages, viewed messages (eg. the All / Unread / Read / Flagged / Selected view), and any search criteria currently in operation.
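In set terms, the new behaviour looks roughly like this sketch. The function and parameter names are invented for illustration, and message ids are plain integers here.

```python
def messages_to_act_on(selected, in_view, search_matches=None):
    """Return the ids an action applies to under the beta behaviour.

    selected:       ids ticked by the user anywhere in the folder
    in_view:        ids shown by the current view (All / Unread / ...)
    search_matches: ids matching the active search, or None if no search
    """
    targets = set(selected) & set(in_view)
    if search_matches is not None:
        targets &= set(search_matches)
    return targets


# Message 3 is selected but hidden by the Unread view, and message 2 does not
# match the search, so only message 1 is actioned.
example = messages_to_act_on(selected={1, 2, 3}, in_view={1, 2, 4}, search_matches={1, 4})
```

Under the old behaviour the action would have applied to all of `selected`, including the messages that were not visible.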

This has been discussed previously, and I think this is quite a good idea, especially because the current approach seems to result in unexpected messages being deleted.

There are one or two slightly odd side effects of this change as mentioned in the forum post that I’m looking for some feedback on.

Update: This change has now been rolled out to all production servers.

Posted in Technical. Comments Off