Robot Co-op Hardware
Posted by Eric Hodel Thu, 16 Mar 2006 02:12:00 GMT
There’s been interest in the hardware that drives the sites of The Robot Co-op, which serve over 2.5 million requests/day, so here it is:
Quantity | CPU | Memory | Disks | Functions |
---|---|---|---|---|
4 | Dual 3GHz Xeon | 6GB | 70GB RAID 1 | Apache, FastCGI, MogileFS storage node, memcached, image serving |
1 | Dual 3GHz Xeon | 2GB | 70GB RAID 1 | Staging, mail, backend jobs |
1 | Dual Opteron 246 | 12GB | 5×73GB in RAID 5 | MySQL |
The four web servers are more fluke than planning; we don’t need the capacity they have just yet. We started with two web servers, a database server and a staging/mail/backend server, all dual 3GHz Xeons. We then added a third web server and after that the Opteron MySQL box. The old database server was recently repurposed as a web server.
A hardware load balancer of unknown manufacture currently spreads site traffic across all four web boxes, since each box runs all of our sites. Eventually we’ll switch to running 43 Things on a pair of machines and all other sites on the remaining machines.
Images are routed through a separate IP directly to WEBrick running a custom HTTPServlet that interacts with MogileFS to serve and resize images.
Posted in Robot Co-op | 29 comments
Comments
-
Good info. What’s the load been like?
Joe said about 1 hour later: -
And why are images routed separately?
Joe said about 3 hours later: -
Load hovers between 0.6 and 1.0. With HTT enabled, a load of 4 would have four processes on the run-queue, matching the 4 CPUs in the box, so a load of 2 would indicate that it’s time to add another box (for fail-over capacity).
Images are routed separately because they’re served via WEBrick. Using Apache’s mod_proxy adds 20% overhead.
Eric Hodel said about 3 hours later: -
And why not lighttpd?
Joe said about 3 hours later: -
Most importantly, lighttpd doesn’t have a mod_deflate equivalent (only compression of static content). Compressed pages save lots of money.
I also hate extra software so I’d have to either deal with the pain of extra software to worry about or reconfigure all our internal tools that use apache to use lighttpd.
Eric Hodel said about 3 hours later: -
what OS and filesystem for the Opteron, how much are you giving to InnoDB’s buffer pool, and what does iostat look like? I assume since you’ve got the monster hunk of RAM that you’re doing some decent qps on it.
question: Are you disk I/O bound at all?
In my experience, RAID5 can give some really awful performance, and RAID10 is about the best you’ll get. I’d be interested to hear the actual numbers.
john said about 4 hours later: -
What’s with all the “Why this and not that?” questions?
If it works for them, it is a perfectly valid setup.
Observer said about 5 hours later: -
john: All FreeBSD 6 on UFS2 as I prefer my safety belts. The Opteron is in amd64 mode. The database sits all in RAM so the machine is nearly idle.
I’m not into premature optimization and simplicity suggests spreading writes across as many spindles as possible. If it becomes a problem I’ll address it then.
Bob tuned innodb based on wikipedia’s configuration.
Eric Hodel said about 6 hours later: -
FYI,
There is a patch to add mod_deflate to lighttpd. It’s listed as a patch for 1.4.10. Might want to give it a shot just for kicks.
Adam said about 7 hours later: -
I apologize for the questions, and if it works for you, rock on. :) And since the DB can all fit into RAM for you, disk I/O won’t be an issue of course.
My experience has been with the db being much bigger than RAM, and very IO-bound to disk, and so when I see RAID5 I get twitchy flashbacks of some very long and tough nights. :)
Thanks for humoring my questions.
john said about 11 hours later: -
eric: i don’t think HTT gives you the extra capacity you think it does. for example (this is what i did), use ab (or similar load-generating tool) to generate a steady load of 2.0 on your box, once with HTT enabled and once with it turned off. i believe you’ll find (as i did) that you get roughly the same number of requests/second served regardless of HTT setting. that’s because you don’t actually double the number of compute cores when you enable HTT, it just looks that way to the OS.
so, if you want to keep your servers running at 50% of max capacity, you really want to be adding nodes when your load average hits 1.0, because a load of 2.0 is really 100% capacity.
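The arithmetic behind the two positions in this exchange can be sketched in a few lines of Python. This is only an illustration: the function name `load_threshold` and the 50% target are assumptions chosen to match the "keep fail-over capacity" reasoning above, and the two "effective CPU" counts simply encode each commenter's view of what HTT is worth.

```python
def load_threshold(effective_cpus: float, target_utilization: float = 0.5) -> float:
    """Load average at which another box should be added.

    effective_cpus is how many CPUs' worth of work the box can really do;
    target_utilization is the fraction of that capacity to stay under.
    """
    if effective_cpus <= 0:
        raise ValueError("effective_cpus must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return effective_cpus * target_utilization


physical_cpus = 2  # dual Xeon box from the table in the post

# View 1: HTT makes each physical CPU count as two, so 4 effective CPUs.
htt_doubles = load_threshold(physical_cpus * 2)

# View 2: HTT adds no real capacity, so only 2 effective CPUs.
htt_adds_nothing = load_threshold(physical_cpus)

print(htt_doubles, htt_adds_nothing)
```

Under the first view the add-a-box point is a load of 2.0; under the second it is 1.0, which is the disagreement in a nutshell.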
Brandt said about 11 hours later: -
just fyi: in the case of the database box, loads can still get quite high without affecting performance, mostly because 64bit boxes are monsters when it comes to context switching.
At various times, we’ve had here many (>10, up to 25) 64bit (16GB) MySQL slaves with loads >>5 that keep up with replication fine, and still serve SELECT traffic for the site.
john said about 12 hours later: -
Hi, did you need to modify MogileFS to get it to work on FreeBSD? I thought that it currently only works on Linux.
Thanks for the information.
jtoy said about 14 hours later: -
Brandt: I tested HTT performance and found about a 20% improvement.
Eric Hodel said about 15 hours later: -
jtoy: I use NFS mode with a WEBrick server to handle the trackers’ usage checks. Fortunately FreeBSD has a decent NFS implementation. IO::AIO and FreeBSD hate each other for some unknown reason so I can’t use Perlbal.
Eric Hodel said about 15 hours later: -
Why is the manufacturer of the hardware load balancer unknown? (Disclaimer: I am an engineer at a company that makes hardware load balancers)
Tom said about 20 hours later: -
Thanks Eric. It is fantastic to see these real-life figures as they are so much more useful than theoreticals.
John Röthlisberger said about 21 hours later: -
Tom: We let Rackspace handle all the hardware and the console doesn’t say what it is. I think it is a Cisco, but I can’t say for certain. (Note: F5 is in my backyard, and I practice Jujutsu with a guy who works there.)
Eric Hodel said about 22 hours later: -
Do you replicate session information with MogileFS?
Keith Veleba said about 22 hours later: -
How many FastCGI server processes per web server?
Keith Pitty said 1 day later: -
Keith V: No, we only use it for an image store.
Sessions are stored in memcached.
Eric Hodel said 1 day later: -
Keith P: I’ll post a rough outline of our software configuration after I’m feeling well again.
Eric Hodel said 1 day later: -
I use Rackspace also – great company although pricey. I didn’t know they did FreeBSD though.
Zack Chandler said 1 day later: -
Great post, thanks for the info! It’s interesting to compare setups; I’ve written about our setup here:
gabrito.com/post/website-hardware-and-our-setup
Todd Huss said 1 day later: -
Curious: are you using Apache2 + mod_fcgi or mod_fastcgi? Worker or prefork? And I doubt it, but any PHP running around on those servers?
Great to read though.
Paul said 1 day later: -
Eric, interesting that you guys don’t have a backup database server. I’m assuming that you guys are using a battery-backed RAID controller at the very least?
Chuq said 1 day later: -
regarding system load. it’s a very unrepresentative metric. you can have a box with a load of 500 which will be running almost like it has no work to do at all.
A good example is a forking ftp server. The ftp processes are lightweight and just do i/o in little spurts. High load, but FreeBSD just kicks ass for this…
or a box with a flat load of 1.00 with a single process which sits on cpu all the time and is running at 100% of the system’s capacity. any kqueueified single process daemon, such as thttpd.
i find mysql is very good at operating under load on freebsd6. I maintain a bacula backup server for our business and some of the tables used for the backup db have a lot of rows (10 million+). I can run 40 concurrent backup jobs into this db – AND run a table optimise query at the same time and the box sits at about 60% CPU, disc on 20MB/s.
Did I mention, the box is a 2GHz P4 with 2GB RAM and an ATA100 drive? :)
... and it’s my KDE desktop too.
FreeBSD++
djwm said 2 days later: -
The load balancer is most likely a Cisco CSS series device. Cisco is our preferred vendor for almost all of our network devices. (I work for Rackspace =D)
Doug D said 11 days later: -
> Most importantly, lighttpd doesn’t have a mod_deflate equivalent
It’s also weird code; everyone on our team who first admired it later regretted it for various reasons (not handling “+” in filenames or whatever architectural problems).
No one has got off nginx as a frontend/static server though, it’s very robust. Definitely worth looking at.
Michael Shigorin said 19 days later: