Q&A: The adventures of scaling, Stage 1

March 18th

My initial article about the scaling journey we went through with eins.de generated quite some buzz in the Rails niche of the web. A lot more than I had anticipated.

A number of questions have popped up here and there which I’ll try to address in this article. While you’ll surely understand that I’m not going to spoil what’s coming up in the rest of the series (which we’re only a quarter of the way through!), I’ll share some additional details.

Assuming the questions keep coming at this volume, each article will be accompanied by a Q&A follow-up article a few days later.

As an aside, please understand that there is no one-size-fits-all solution for scalability problems or a walkthrough guide for your specific application needs. eins.de has certain characteristics which your application might not have.

For example, we store a lot of historic data for forums, personal messages, gallery comments, and more. On any given day, we have tens of thousands of rows being newly inserted while millions are already sitting in each table. This obviously affects query times and the need arises to temporarily store SQL results outside of the SQL service. Your application might have an entirely different concept there.

If you need help analyzing the characteristics and needs of your particular application, please drop me an email at patrick@limited-overload.de and we’ll work something out on a consulting level.



Wayne asks on the Rails weblog:

It is interesting that there are no backup capabilities mentioned in this diagram. How do you deal with backing up your data?

I already responded within the comments. For completeness’ sake, here it is again:

On the database side, backup is primarily handled through the replicated setup. Additionally, nightly dumps are created. All machines (obviously including the dumpfiles) are then backed up via rsync to a central backup repository not shown in the diagram.

Actually it’s even simpler. The only data that changes resides in the database (which is dumped as described) and on our NFS server (as shown in the diagram) which would include uploaded user photos, galleries, editorial images and so on. So the only boxes that get backed up via rsync are the NFS server and the dumps from one database server.
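
For illustration, a nightly run along these lines would cover both; the hostnames and paths here are placeholders, not our actual layout:

# on the central backup host, run from cron every night
rsync -az --delete db-1:/var/backups/mysql-dumps/ /backup/db-1/
rsync -az --delete nfs-1:/srv/shared/ /backup/nfs-1/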


Thomas asks on the Rails weblog:

In part II of your write-up, could you make it much more technical, explaining how you cluster (what software is used), how file replication is performed, and how load balancing is performed (what software is used)?

All software we use is actually mentioned in stage 1 of the write-up. There’s absolutely nothing missing.

We are using a single external IP address configured on the proxy box. lighttpd answers on port 80. If it’s a static request for a CSS file, a JS file or an image, it is served directly from the NFS repository which is mounted on each machine we have. If it’s a dynamic request, lighttpd forwards the request to one of the few dozen FastCGI listeners distributed among our 4 application servers.

That’s it. No file replication needed, no hardware or software load balancers, no cluster software.

As mentioned, we use NFS as shared storage (as shown in the diagram). The site code is held in Subversion and distributed to the relevant machines using Capistrano (the tool formerly known as SwitchTower).
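
To give you a rough idea, a Capistrano 1.x recipe boils down to declaring the repository and roles in config/deploy.rb. This is just a sketch with made-up hostnames and paths, not our actual recipe:

# config/deploy.rb -- hostnames and paths are made up
set :application, "eins"
set :repository,  "svn+ssh://svn.example.com/eins/trunk"
set :deploy_to,   "/var/www/eins"

role :web, "proxy-1.example.com"
role :app, "app-1.example.com", "app-2.example.com",
           "app-3.example.com", "app-4.example.com"
role :db,  "db-1.example.com", :primary => true

A plain cap deploy then updates the code on every listed machine and flips the current symlink over to the new release.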

Load balancing between the backend application servers is handled entirely in lighttpd (a snippet of the config showing exactly this is included in the write-up as well). It’s all a matter of a single fastcgi.server directive listing all your remote listeners with their ports.
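
Stripped down, the enclosing directive looks roughly like this. The extension and the host/port values are placeholders; the real excerpt is in the original write-up and quoted in the comments below:

fastcgi.server = ( ".fcgi" =>
  ( "http-1-01" => ( "host" => "10.10.1.10", "port" => 7000 ),
    "http-2-01" => ( "host" => "10.10.1.11", "port" => 7000 ),
    "http-1-02" => ( "host" => "10.10.1.10", "port" => 7001 ),
    "http-2-02" => ( "host" => "10.10.1.11", "port" => 7001 )
  )
)

lighttpd then spreads incoming dynamic requests across all listed backends on its own.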


Henry asks questions similar to the ones above in the poocs.net comments, including:

How did you set up MySQL to have failover?

The original writeup has a link on how to set up MySQL for multi-master replication. That’s all you have to do.

The concept behind it is simple. You use each of your servers as a replication slave of the other. By specifying a sort of namespace for auto-increment primary keys, you make sure replication doesn’t collide when a record is inserted into the same auto-incremented table on both ends simultaneously. Spacing your masters 10 apart leaves you with auto-generated IDs ending in ..1 on MySQL server 1 and ..2 on MySQL server 2. That’s all there is to it.
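
Assuming MySQL 5.0, the usual way to get this spacing is the auto_increment_increment and auto_increment_offset settings. The my.cnf entries matching the example above would look like this:

# my.cnf on MySQL server 1
auto_increment_increment = 10
auto_increment_offset    = 1

# my.cnf on MySQL server 2
auto_increment_increment = 10
auto_increment_offset    = 2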

Additionally, we use a Capistrano task to remotely switch all application servers to a specific database server in case of a hardware or software failure. Given you have a template in your repository with a placeholder {{ production_database_host }}, the task is as simple as this:


desc "Lock app servers to specific database host"
task :lock_db_host, :roles => :app do
  run <<-CMD
  ruby -pe '$_.gsub!("{{ production_database_host }}", "#{ENV["HOST"]}")' \
    #{current_path}/config/database.yml.tmpl > #{current_path}/config/database.yml
  CMD
  restart
end

From the command line you may now use: cap lock_db_host HOST=db-2

What are your application servers? Are they just web servers running lighttpd?

No, they’re not running lighttpd. They run standalone FastCGI listeners started via the Rails spinner and spawner scripts. See chapter 3 of the Capistrano manual for examples.


john asks in the poocs.net comments:

so it looks like the majority of gains were from re-architecting the back-end, and not so much from using Rails-specific features?

This is indeed mostly true. In fact, keeping our hands off Rails’ components feature was a nice performance gain in itself.

As mentioned in the article, Rails’ caching features were of little use to us, since the dynamic nature of the site, with content tailored to user preferences, simply doesn’t match them.


Sean asks in the poocs.net comments:

Could you go into more detail about the refactoring you did on your sidebar code so that it is no longer component based? [..] i.e. did you just make the sidebars partials?

No. While I don’t know how much of what I did will work for Typo (as Typo’s sidebars tend to be a lot more full featured and configurable), here’s a quick rundown.

First of all, I refactored the former controller that was rendered via components into a module, keeping the controller methods, each named after one of our sidebars. Every sidebar also has an accompanying view.

Then, in ApplicationController, I loop over the array of blocks assigned to a particular page, invoke the appropriate method in the sidebar module, and store the return value (rendered via render_to_string) for later display by a helper method.
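
A minimal sketch of that layout (module, model, and helper names here are made up for illustration):

# lib/sidebars.rb -- one method per sidebar, each with a matching view
module Sidebars
  def latest_forum_posts
    @posts = Post.find(:all, :limit => 5, :order => "created_at DESC")
  end
end

class ApplicationController < ActionController::Base
  include Sidebars

  protected

  # called with the array of blocks assigned to the current page
  def render_sidebars(blocks)
    @rendered_sidebars = blocks.map do |block|
      send(block)                                        # gather the data
      render_to_string :partial => "sidebars/#{block}"   # render the view
    end
  end
end

# a trivial helper then emits the stored strings in the layout
module ApplicationHelper
  def sidebars
    (@rendered_sidebars || []).join("\n")
  end
end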


Dick Davies asks in the poocs.net comments:

is that really only 1 lighttpd in the diagram? it’s not (http) proxying, it’s just using remote fcgis, right?

Absolutely true.


malcontent stated in the poocs.net comments:

How much of this was the fault of MySQL? Could your application handle the load better with Postgres or even Oracle?

All of this is pure speculation. We’ve optimized FastCGI-related items just as much as database handling in general, and we’ve specifically relieved the database of certain tasks.

I don’t think web applications can be treated in such binary terms; it’s not as if a given application will only work on PostgreSQL *or* MySQL, let alone that we’d have the budget for an Oracle instance.


gumi would like to know in the poocs.net comments:

What did all this cost?

There’s no way on earth to accurately answer this question. I’ve given the specs of the hardware so you can have that calculated by your local hardware guys. Development time of the whole solution was roughly 4-5 months with some initial planning.


goyaves asks in the poocs.net comments:

What app did you use for the graphics on this page?

The diagrams were created in OmniGraffle 4.1 Pro on a Mac.


Caleb cries on the Rails weblog:

Please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please please provide more information!!!

Of course. March 20 is only 2 days away.


Filed Under: Rails

16 comments

  1. roly 03.18.06 / 18PM

    Could you give us some stats as to how many requests per second your old setup handled, as well as your new setup?

  2. Russ 03.18.06 / 20PM

    Thank you so much for the follow up Q & A. I have a question.

    When I read the original posting, it said:

    “We found that in order to really have equally loaded application servers you should order your fastcgi.server directives by port and not by host, like so:”

    "http-1-01" => ( "host" => "10.10.1.10", "port" => 7000 ),
    "http-2-01" => ( "host" => "10.10.1.11", "port" => 7000 ),
    "http-3-01" => ( "host" => "10.10.1.12", "port" => 7000 ),
    "http-4-01" => ( "host" => "10.10.1.13", "port" => 7000 ),
    "http-1-02" => ( "host" => "10.10.1.10", "port" => 7001 ),
    "http-2-02" => ( "host" => "10.10.1.11", "port" => 7001 ),
    "http-3-02" => ( "host" => "10.10.1.12", "port" => 7001 ),
    "http-4-02" => ( "host" => "10.10.1.13", "port" => 7001 ),

    To me, it looks like only 2 ports are being used. Does this mean that load is being split by only 2 application servers? I’m somewhat confused because I see 4 different IP’s being used but only 2 ports. Could you explain a little more about how this works. Thanks

  3. Sean 03.18.06 / 20PM

    Question: If the primary MySQL died, how does the system “failover” to the replicated MySQL server?

    Thanks for this Q&A post!

  4. scoop 03.18.06 / 21PM

    Russ: It’s only an excerpt. I didn’t feel like posting 2 pages of mostly repeating configuration lines. As the diagram shows, there are 4 application servers with numerous dispatchers each (the number changes across the series).

    Sean: See the capistrano task in the Q&A article. There is no automated failover. We switch manually for now, which might change over time.

  5. Kevin 03.18.06 / 22PM

    [Question] You mention that NFS is used, but what is it used for? Since capistrano is being used, couldn’t the application just be pushed from Subversion to each application server instead of mounting a shared file system?

    Thanks for your posts.

  6. scoop 03.18.06 / 22PM

    Kevin: As mentioned in the reply to Wayne in the article above, we’re storing uploaded user photos, galleries, and editorial images on the NFS.

  7. Kevin 03.18.06 / 23PM

    @scoop, thanks – I didn’t see that comment but now I do. I am looking forward to the next article

  8. Doug W. 03.19.06 / 06AM

    Q: What kind of load speed optimization have you done. I know you mentioned that previously you were using memcache. What is used now?

    I ask because it is late Saturday night (presumably a low traffic time for any web sites) and eins.de/ is loading slow (approx. 6 seconds on my high speed cable modem).

  9. scoop 03.19.06 / 08AM

    Doug: Actually you’ve hit our backup window. Late Saturday night your time is 5 am our time, which is when we run backups because it’s the lowest-traffic period.

  10. Evan 03.19.06 / 13PM

    Thanks very much for posting this article and following up on the questions. This is really useful information for people who are considering moving to Rails and need some pointers on how real-world production environments are best architected.

    I have a few questions for you myself:

    1) From your diagram, it appears that memcached runs directly on the MySQL server. Is this a preferable approach to running memcached on its own box, or was this just done out of economic considerations/expediency?

    2) Was there any reason in particular that you chose multi-master replication over clustering?

    3) Is the multiple memcached problem one that others have experienced as well? This seems like a cause for concern, unless there’s work underway to fix it. One of the things that initially made me feel most comfortable about Rails was that memcached seemed to provide an elegantly scalable cache system.

    Thanks in advance.

  11. scoop 03.19.06 / 14PM

    Evan:

    1) Purely economic, although memcached is highly CPU friendly. The process hardly ever goes above 1%.

    2) Define clustering. Without 3rd-party software, MySQL’s NDB system didn’t seem well established enough at the time. If you use stock replication slaves, you have to hack Rails to use one box as a writer and the other(s) as readers.

    3) The series deals with that problem, so I won’t spoil it.

  12. Dan Kubb 03.20.06 / 09AM

    Hi Patrick,

    I just wanted to say thank you for your article. I can’t wait for the next article in the series.

    I was curious though if you made use of HTTP caching by setting the Expires, Cache-Control, Last-Modified and ETag headers? Done properly some of your responses can be cached and partially offloaded to caching proxies. At the very least web clients can avoid downloading rarely changing URIs as often.

    Also, have you given any thought to handling Conditional GET requests? This sort of goes along with caching. It’s sort of a fall-back for handling situations where the same URI is requested but the underlying state hasn’t changed since the last time the client requested it. If there is no change, you skip rendering the view and just return a “blank” page with a 304 Not Modified status.

    In many Rails apps a large percentage of the time can be spent in rendering, so skipping that step with Conditional GET handling has always made a pretty good improvement in the benchmarks I’ve done. Plus there’s a pretty nice improvement on the client side because almost no information was transferred, and it could just bring up the page from its cache.

    Anyway it might be something worth looking at, and I’d be really curious to see if it makes a difference for you.

  13. scoop 03.20.06 / 09AM

    Dan: The next article’s out already :)

    Yes, we indeed set custom expiration headers for static files via lighttpd:

    $HTTP["url"] =~ "(gif|jpg|png|css|js)$" {
      expire.url = ( "/" => "access 8 hours" )
    }

    This, however, is totally handled within lighty which doesn’t really care all that much if you throw a million or two requests at it. It saves bandwidth and page loading times though :)

    I haven’t personally looked into conditional GET requests. Is there any code to look at or re-use for that matter?

  14. Ian 03.20.06 / 16PM

    Scoop – Have you benchmarked your site using utilities like ‘ab’. If so, how many requests/second can your site handle?

    I have found that I can get about 25 req/second on an action that queries the DB, and 100req/sec on an action that does not hit the DB. This is on a single lighttpd/fastcgi server.

  15. scoop 03.20.06 / 16PM

    Ian: No, we haven’t. I consider these to be rather theoretical values. We’ve optimized for live usage, more of which will be explained in stage 3.

  16. gabriele renzi 03.29.06 / 18PM

    Sorry if this is a dumb question, but I wonder what your setup was back in the PHP days?

