Dynamic Content is not your friend: Scaling & the Front-side Cache

Before I get started on the economics of ad delivery, I’d like to talk about something else… the front-end proxy and cache modules.

I’m sure you know about them: they are the components of the system that give your dynamic web site blazing speed. Examples are Varnish, Squid, Google ncache, and mod_cache. But I would submit to you, my anticipatory readers, that if you are using a front-end cache to speed up your backend (or frontend, if you have no cache) server(s), then you are simply using the wrong tool for the job.

I’ve said it, and I’ll say it again: dynamic web sites are NOT YOUR FRIEND! Sure, they are great when you have one visitor per day and she is your mother, but believe it or not, if you have thousands of visitors and you are serving up dynamic HTML content, you have missed the point. It is often said that 99% of content delivered is not dynamic: it’s images, CSS, JS and the like. But I say to you that 99.9999% of your content should be STATIC as well! And that includes the HTML pages. You should not be relying on a cache to make dynamic content static; you should have a single dynamic server (if that), and push purely static copies of your content out to your mirrors. Your “dynamic backend” should never get hit by visitors. Then your frontend cache only has to serve cached copies of already static content. Apache should not be sitting around with mod_php or mod_perl bloatware, composing dynamic pages, hitting the database, or hitting memcached, whilst trying to decide what dynamic content to keep in the cache and what to expire from it. Let the cache do its job, and don’t fight it!
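To make "push purely static copies" a bit more concrete, here is a minimal sketch in Python of rendering every page once, ahead of time, into plain .html files that any web server can hand out without thinking. Everything named here is an assumption for illustration: the `PAGE` template, the `render_site` function, and the in-memory `pages` dictionary standing in for whatever your dynamic server would normally pull from its database.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical page template; a real site would use its own layout.
PAGE = Template(
    "<html><head><title>$title</title></head><body>$body</body></html>\n"
)


def render_site(pages: dict[str, dict[str, str]], out_dir: Path) -> list[Path]:
    """Render each page once and write it to out_dir as a static .html file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, fields in pages.items():
        target = out_dir / f"{slug}.html"
        target.write_text(PAGE.substitute(fields), encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    # Stand-in content (an assumption); normally this comes from the database.
    pages = {
        "index": {"title": "Home", "body": "Welcome."},
        "about": {"title": "About", "body": "Who we are."},
    }
    for path in render_site(pages, Path(tempfile.mkdtemp())):
        print(path)
```

The rendering cost is paid once per content change instead of once per visitor, which is the whole argument: by the time a request arrives, there is nothing left to compute.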

My point is, people are using the frontside cache, along with load balancers, to prop up an already poorly engineered system. You want to serve blazingly fast content? Make hot mirrors (servers) of your content and periodically push new versions out to them. Don’t let Apache (or whatever your web server of choice is) know or care anything about what it is serving. Put the cache+loadbalancer in front of that. Or, if you are really delivering lots of content, get servers around the country and set up a CDN for your content.
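The "periodically push new versions out" step could look something like the sketch below. It is only an illustration under loud assumptions: the mirrors are plain local directories here, where a real setup would more likely use rsync or a similar tool over the network, and `push_to_mirror` and `file_digest` are hypothetical helper names. The idea it shows is simply that only new or changed files need to travel.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def push_to_mirror(source: Path, mirror: Path) -> list[str]:
    """Copy new or changed files from source to mirror.

    Returns the relative paths that were copied. The mirror is a local
    directory in this sketch (an assumption); files deleted from source
    are not removed from the mirror.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = mirror / rel
        # Skip files whose content already matches on the mirror.
        if dst.exists() and file_digest(dst) == file_digest(src):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied


if __name__ == "__main__":
    source = Path(tempfile.mkdtemp())
    mirrors = [Path(tempfile.mkdtemp()) for _ in range(2)]
    (source / "index.html").write_text("<html>hello</html>\n", encoding="utf-8")
    for mirror in mirrors:
        print(mirror, push_to_mirror(source, mirror))
```

Run it from cron (or whatever scheduler you like) after each regeneration of the static content, and the web servers on the mirrors never need to know where the files came from.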