This is a technical post. Regular Fastmail users subscribed to receive email updates from the Fastmail blog can just ignore this post.
Over the last couple of weeks we noticed that our new IMAP servers with 48G of RAM weren't performing as well as expected, and were showing some odd behaviour. Two things stood out:
After some searching, we found this thread on the Linux kernel mailing list.
It appears that the patch never went anywhere, and zone_reclaim_mode still defaults to 1 on our fairly standard file/email/web server type machines running a NUMA kernel.
After changing it to 0, we saw an immediate and massive change in caching behaviour: cache is now ~27G, buffers ~7G and unused memory ~0.2G, and IO reads from the SSD dropped from 2000/s to 100/s.
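One way to watch figures like these is to read the Cached, Buffers and MemFree fields from /proc/meminfo. The following is only a minimal sketch, not the tooling we used; the function names `parse_meminfo` and `cache_summary` are made up for illustration, and it assumes a Linux system where /proc/meminfo is readable.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts with no unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary(fields):
    """Return page cache, buffers and free memory converted from kB to GiB."""
    kb_per_gib = 1024 * 1024
    return {
        key: fields.get(key, 0) / kb_per_gib
        for key in ("Cached", "Buffers", "MemFree")
    }


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            summary = cache_summary(parse_meminfo(fh.read()))
    except OSError as exc:
        # Not on Linux, or /proc is not mounted.
        print(f"could not read /proc/meminfo: {exc}")
    else:
        for key, value in summary.items():
            print(f"{key}: {value:.1f} GiB")
```

Running this before and after changing the setting should show whether memory that was sitting in MemFree is now being used for Cached and Buffers.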
So if you’re using newer AMD/Intel processors with a NUMA kernel in a web server/file server/email server setup, you should make sure you set /proc/sys/vm/zone_reclaim_mode to 0. I’ve posted to the LKML about this, but haven’t heard anything, so I have no idea if anyone regards this default value as a bug or not.
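As a rough illustration of checking and changing the setting, here is a small sketch that reads /proc/sys/vm/zone_reclaim_mode and, if asked, writes 0 to it. The helper names `read_zone_reclaim_mode` and `disable_zone_reclaim` are hypothetical; writing to the file requires root, and the `path` parameter exists only so the helpers can be pointed at a scratch file for testing.

```python
ZONE_RECLAIM_PATH = "/proc/sys/vm/zone_reclaim_mode"


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode value as an integer."""
    with open(path) as fh:
        return int(fh.read().strip())


def disable_zone_reclaim(path=ZONE_RECLAIM_PATH):
    """Set zone_reclaim_mode to 0 and return the previous value.

    Needs root on a real system. Does nothing if the value is already 0.
    """
    previous = read_zone_reclaim_mode(path)
    if previous != 0:
        with open(path, "w") as fh:
            fh.write("0\n")
    return previous


if __name__ == "__main__":
    # Read-only check; call disable_zone_reclaim() explicitly to change it.
    try:
        mode = read_zone_reclaim_mode()
    except (OSError, ValueError) as exc:
        print(f"could not read {ZONE_RECLAIM_PATH}: {exc}")
    else:
        print(f"zone_reclaim_mode = {mode}")
        if mode != 0:
            print("zone reclaim is enabled; consider setting it to 0")
```

A value written this way does not survive a reboot. To make it permanent, the usual approach is to add `vm.zone_reclaim_mode = 0` to /etc/sysctl.conf (or your distribution's equivalent).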