Tagdef.com is a website for looking up and defining hashtags. Think of it as Urban Dictionary for hashtags.
Speed is a feature
When designing and building Tagdef.com, speed has been one of the key focuses. A faster website makes users happy, and a website that can handle a decent traffic peak lets the owner sleep well at night.
Both Google and Amazon have done studies indicating that user experience is directly linked to page speed:
After a bit of looking, Marissa explained that they found an uncontrolled variable. The page with 10 results took .4 seconds to generate. The page with 30 results took .9 seconds.
Half a second delay caused a 20% drop in traffic. Half a second delay killed user satisfaction.
Even though studies show that most of the time is spent rendering the page in the user's browser, a snappy server helps. And if the server spends less time processing a request, it can process more requests per second on the same hardware (all else being equal).
There are several elements on a typical Tagdef page that go stale quickly:
- The relative time of the definition (“added 13 minutes ago”)
- The list of recent tweets for this hashtag
- Related tags
- The actual content of the definitions
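The first item on that list is a good example of content that can never be cached for long: a timestamp rendered as "13 minutes ago" is wrong 60 seconds later. A minimal sketch of that kind of rendering (in Python here, standing in for the site's PHP; the function name is illustrative):

```python
from datetime import datetime


def relative_time(created_at: datetime, now: datetime) -> str:
    """Render a timestamp as a rough relative age, e.g. "added 13 minutes ago"."""
    seconds = int((now - created_at).total_seconds())
    if seconds < 60:
        return "added moments ago"
    minutes = seconds // 60
    if minutes < 60:
        return f"added {minutes} minute{'s' if minutes != 1 else ''} ago"
    hours = minutes // 60
    if hours < 24:
        return f"added {hours} hour{'s' if hours != 1 else ''} ago"
    days = hours // 24
    return f"added {days} day{'s' if days != 1 else ''} ago"
```

Any page cached with such a string baked in drifts out of date, which is one reason the cache TTL below is kept short.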
Because of this, I’ve chosen to have a short time-to-live for dynamic pages in Varnish, typically 10-20 minutes. As a result, the long tail of seldom visited pages will expire from the Varnish cache. But that’s OK. The main purpose of Varnish in this setup is to cache the most popular pages. If the same page is hit many times in a short time-period, most of the requests will be taken care of by Varnish. This is a typical scenario when a hashtag is trending, or every Friday when people search for #ff.
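The post doesn't say how that 10-20 minute TTL is configured; one common way (an assumption here, not a description of Tagdef's actual setup) is for the backend to emit a Cache-Control header with s-maxage, which Varnish honors by default while browsers use the separate max-age value:

```python
def cache_headers(ttl_seconds: int = 900) -> dict:
    """Headers telling a shared cache like Varnish to keep the page briefly.

    s-maxage applies to shared caches only; max-age=0 keeps browsers
    revalidating, so an explicit purge takes effect quickly for users too.
    """
    return {"Cache-Control": f"max-age=0, s-maxage={ttl_seconds}"}
```

The alternative is to set the TTL in VCL on the Varnish side; either way the effect is the same short-lived page cache.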
Let’s say the requested definition-page was not in the Varnish cache. Varnish asks Apache for it, and good old PHP code is executed. When the PHP code needs some dynamic data (which means most of the content on Tagdef), it first asks a Memcache server for it. MySQL is only queried if the content is not found in Memcache. Some of the content in Memcache expires automatically; other content is only thrown out when it is modified.
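That is the classic read-through cache pattern. A rough sketch (Python standing in for the site's PHP; the key name, SQL, and schema are illustrative, and `mc` is assumed to have a get/set interface like the common memcached clients):

```python
import json


def get_definitions(tag, mc, query):
    """Read-through cache: try Memcache first, fall back to MySQL.

    `query` is any callable that runs the actual SQL and returns rows.
    """
    key = "def:" + tag
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)
    rows = query("SELECT body, added_at FROM definitions WHERE tag = %s", (tag,))
    # Cache for 10 minutes; some keys instead live until explicitly deleted.
    mc.set(key, json.dumps(rows), time=600)
    return rows
```

The second request for the same tag within the TTL never touches MySQL at all.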
If a user adds a definition, she expects to see that definition when returning to the page. This would not work if Varnish was simply returning the old page from memory. So when a user adds a definition, Varnish (and Memcache) is told to throw out the old content for that page.
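In sketch form, the write path triggers an explicit eviction on both layers. For Varnish this typically means an HTTP PURGE request (which requires a vcl_recv rule allowing PURGE from the web server); the function and key names below are illustrative, not Tagdef's actual code:

```python
def invalidate_page(tag, mc, purge):
    """Called after a user adds a definition: evict the stale copies so
    the next request rebuilds the page with the new content.

    `mc` is a Memcache-style client; `purge` sends an HTTP PURGE to
    Varnish for the given path and returns the response status.
    """
    mc.delete("def:" + tag)   # drop the cached database rows
    return purge("/" + tag)   # drop the cached rendered page
```

A real `purge` callable could be as small as an `http.client.HTTPConnection` issuing a `PURGE` request against the Varnish frontend.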
Refresh the cache asynchronously
Most MySQL queries are fast (less than 50ms), and the page should still feel pretty fast even if no cache is used. But for some content, the requests take more time. Content like this is requested and updated in the cache regularly by background processes, so the users should always hit the cache and get a fast response.
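A sketch of that refresh loop (Python stand-in; the key and the thread-based scheduling are assumptions, the post only says "background processes"). The important invariant is that the refresh interval is shorter than the cache TTL, so the key never expires between refreshes:

```python
import threading
import time


def refresh_once(key, compute, mc, ttl=900):
    """Run the slow query and overwrite the cached copy."""
    mc.set(key, compute(), time=ttl)


def keep_warm(key, compute, mc, ttl=900, interval=600):
    """Refresh an expensive value from a background thread.

    Because interval < ttl, user requests always find a warm cache and
    never pay for the slow query themselves.
    """
    def loop():
        while True:
            refresh_once(key, compute, mc, ttl)
            time.sleep(interval)

    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```

In production this could just as well be a cron job; the point is that the expensive work happens off the request path.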
Tagdef uses data from Twitter to show recent tweets and trends, and Google Translate to translate any non-English definitions. All communication with these external APIs is done asynchronously. This means that Tagdef will work even if Twitter is down, but some features will be missing. The data is cached in Memcache, so we don’t abuse the external APIs by asking over and over for the same information. This also makes the external content load instantly, if it is found in the cache. If not, it is loaded asynchronously in the browser (Ajax).
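The same cache-first shape, plus graceful degradation, covers the server side of this. A sketch (Python stand-in; `fetch` represents the real Twitter API call, and the key name and TTL are illustrative):

```python
import json


def recent_tweets(tag, fetch, mc, ttl=300):
    """Serve external content from Memcache when possible.

    On a cache miss, call the external API -- but never let a Twitter
    outage take the page down: come back empty and let that one
    feature be missing instead.
    """
    key = "tweets:" + tag
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)
    try:
        tweets = fetch(tag)
    except Exception:
        return []  # degrade gracefully: no tweets, but the site works
    mc.set(key, json.dumps(tweets), time=ttl)
    return tweets
```

Caching the result also keeps the site well under the external APIs' rate limits, since one fetch serves every visitor for the next few minutes.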
Statistics are fun. But if I were to log the performance of every request, the logging itself might have hurt performance. And besides, Google is much better at presenting numbers than I am. So I use Google Analytics to track the time spent generating each page.
[Chart: Time spent generating the front page, in milliseconds.]
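The server-side half of that measurement is tiny: time the handler and hand the number to the page, where client-side JavaScript can report it as an Analytics event. A hedged sketch (Python stand-in; how the number actually reaches Analytics is an assumption here):

```python
import time
from functools import wraps


def timed(handler):
    """Wrap a page handler and measure generation time in milliseconds,
    returning it alongside the body so the template can embed it in a
    client-side tracking call."""
    @wraps(handler)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        body = handler(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        return body, elapsed_ms
    return wrapper
```

The overhead is two clock reads per request, which is cheap enough to leave on for every page.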
I’ve spent some time profiling the PHP code, and removed some bottlenecks. Among other things, I discovered that I spent a significant amount of time connecting to the MySQL server, even though I didn’t need MySQL in every request. I made the code a bit lazy, so it only connects to MySQL when it needs to. As a bonus, the most frequently used parts of the website will work as normal, even if the MySQL server is down for a couple of minutes.
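The lazy-connection idea in sketch form (Python stand-in; `connect` represents e.g. a `MySQLdb.connect(...)` call, and the class is illustrative rather than Tagdef's real code):

```python
class LazyDB:
    """Defer the MySQL connection until a query actually needs it.

    Requests served entirely from cache skip the connection cost, and
    keep working even while MySQL is briefly down.
    """
    def __init__(self, connect):
        self._connect = connect
        self._conn = None

    def query(self, sql, params=()):
        if self._conn is None:           # connect on first use only
            self._conn = self._connect()
        cur = self._conn.cursor()
        cur.execute(sql, params)
        return cur.fetchall()
```

A cache-hit request constructs the object but never triggers `connect`, so it pays nothing for the database it doesn't use.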
It is popular to say that premature optimization is the root of all evil. Most of the optimization done based on the profiling might be well into premature optimization land. And could I have done without Memcache and Varnish? Probably. Certainly for 99% of the time. But the page would have been a bit slower for everyone. And Google indexing tens of thousands of pages a day might have caused a problem. Not to mention the horror of realizing that your site crashed when it finally got some major publicity.
And besides, it’s fun to see the front page generated in 3.2 milliseconds (and fully loaded in 600 milliseconds). I’m an engineer; “because I can” is a valid reason for me. And when that makes the user experience better, the server bill smaller, and the site stay online, all the better.