Segment7 - The Blog


Robot Co-op Hardware

Posted by Eric Hodel Thu, 16 Mar 2006 02:12:00 GMT

There’s been interest in the hardware behind The Robot Co-op’s sites, which serve over 2.5 million requests/day, so here it is:

Quantity | CPU              | Memory | Disks            | Functions
-------- | ---------------- | ------ | ---------------- | ---------
4        | Dual 3GHz Xeon   | 6GB    | 70GB RAID 1      | Apache, FastCGI, MogileFS storage node, memcached, image serving
1        | Dual 3GHz Xeon   | 2GB    | 70GB RAID 1      | Staging, mail, backend jobs
1        | Dual Opteron 246 | 12GB   | 5×73GB in RAID 5 | MySQL

The four web servers are more fluke than planning; we don’t need the capacity they provide just yet. We started with two web servers, a database server, and a staging/mail/backend server, all dual 3GHz Xeons. We then added a third web server, and after that the Opteron MySQL box. The old database server was recently repurposed as a web server.

Site traffic is currently spread across all four web boxes by a hardware load balancer of unknown manufacture; each box runs all of our sites. Eventually we’ll switch to running 43 Things on a pair of machines and all the other sites on the remaining machines.

Images are routed through a separate IP directly to WEBrick running a custom HTTPServlet that interacts with MogileFS to serve and resize images.
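The resize step in that servlet boils down to fit-within-a-box math. Here’s a minimal sketch of that logic only (the class and method names are hypothetical, and the actual servlet, which fetches originals from MogileFS, isn’t shown):

```ruby
# Sketch of the fit-within-box math an image-resizing servlet would use.
# Illustrative names only, not the Robot Co-op's actual code.
class ImageResizer
  # Scale (width, height) to fit inside (max_w, max_h), preserving the
  # aspect ratio and never enlarging the original.
  def self.fit(width, height, max_w, max_h)
    scale = [max_w.to_f / width, max_h.to_f / height, 1.0].min
    [(width * scale).round, (height * scale).round]
  end
end

ImageResizer.fit(1600, 1200, 640, 480) # => [640, 480]
ImageResizer.fit(100, 100, 640, 480)   # => [100, 100], original is smaller
```

In production the scaled dimensions would be handed to an image library; the servlet’s other job is just streaming the bytes in and out of MogileFS.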

Posted in Robot Co-op | 29 comments

Comments

  1. Joe said about 1 hour later:

    Good info. What’s the load been like?

  2. Joe said about 3 hours later:

    And why are images routed separately?

  3. Eric Hodel said about 3 hours later:

    Load hovers between 0.6 and 1.0. With HTT enabled, a load of 4 would mean four processes on the run queue, matching the 4 CPUs in the box, so a load of 2 would indicate that it’s time to add another box (for fail-over capacity).

    Images are routed separately because they’re served via WEBrick. Using Apache’s mod_proxy adds 20% overhead.

  4. Joe said about 3 hours later:

    And why not lighttpd?

  5. Eric Hodel said about 3 hours later:

    Most importantly, lighttpd doesn’t have a mod_deflate equivalent (only compression of static content). Compressed pages save lots of money.

    I also hate extra software, so I’d either have to deal with the pain of one more package to worry about, or reconfigure all our internal tools that use Apache to use lighttpd.

  6. john said about 4 hours later:

    What OS and filesystem for the Opteron, how much are you giving to InnoDB’s buffer pool, and what does iostat look like? I assume, since you’ve got the monster hunk of RAM, that you’re doing some decent qps on it.

    Question: are you disk I/O bound at all?

    In my experience, RAID5 can give some really awful performance, and RAID10 is about the best you’ll get. I’d be interested to hear the actual numbers.

  7. Observer said about 5 hours later:

    What’s with all the “Why this and not that?” questions?

    If it works for them, it is a perfectly valid setup.

  8. Eric Hodel said about 6 hours later:

    john: All FreeBSD 6 on UFS2 as I prefer my safety belts. The Opteron is in amd64 mode. The database sits all in RAM so the machine is nearly idle.

    I’m not into premature optimization and simplicity suggests spreading writes across as many spindles as possible. If it becomes a problem I’ll address it then.

    Bob tuned InnoDB based on Wikipedia’s configuration.

  9. Adam said about 7 hours later:

    FYI,

    There is a patch to add mod_deflate to lighttpd. It’s listed as a patch for 1.4.10. Might want to give it a shot just for kicks.

  10. john said about 11 hours later:

    I apologize for the questions, and if it works for you, rock on. :) And since the DB can all fit into RAM for you, disk I/O won’t be an issue of course.

    My experience has been with the db being much bigger than RAM, and very IO-bound to disk, and so when I see RAID5 I get twitchy flashbacks of some very long and tough nights. :)

    Thanks for humoring my questions.

  11. Brandt said about 11 hours later:

    eric: i don’t think HTT gives you the extra capacity you think it does. for example (this is what i did), use ab (or similar load-generating tool) to generate a steady load of 2.0 on your box, once with HTT enabled and once with it turned off. i believe you’ll find (as i did) that you get roughly the same number of requests/second served regardless of HTT setting. that’s because you don’t actually double the number of compute cores when you enable HTT, it just looks that way to the OS.

    so, if you want to keep your servers running at 50% of max capacity, you really want to be adding nodes when your load average hits 1.0, because a load of 2.0 is really 100% capacity.

  12. john said about 12 hours later:

    just fyi: in the case of the database box, loads can still get quite high without affecting performance, mostly because 64bit boxes are monsters when it comes to context switching.

    At various times, we’ve had many (>10, up to 25) 64-bit (16GB) MySQL slaves here with loads >>5 that keep up with replication fine, and still serve SELECT traffic for the site.

  13. jtoy said about 14 hours later:

    Hi, did you need to modify MogileFS to get it to work on FreeBSD? I thought it currently only works on Linux.

    Thanks for the information.

  14. Eric Hodel said about 15 hours later:

    Brandt: I tested HTT performance and found about a 20% improvement.

  15. Eric Hodel said about 15 hours later:

    jtoy: I use NFS mode with a WEBrick server to handle the trackers’ usage checks. Fortunately FreeBSD has a decent NFS implementation. IO::AIO and FreeBSD hate each other for some unknown reason so I can’t use Perlbal.

  16. Tom said about 20 hours later:

    Why is the manufacturer of the hardware load balancer unknown? (Disclaimer: I am an engineer at a company that makes hardware load balancers)

  17. John Röthlisberger said about 21 hours later:

    Thanks Eric. It is fantastic to see these real-life figures as they are so much more useful than theoreticals.

  18. Eric Hodel said about 22 hours later:

    Tom: We let Rackspace handle all the hardware and the console doesn’t say what it is. I think it is a Cisco, but I can’t say for certain. (Note: F5 is in my backyard, and I practice Jujutsu with a guy who works there.)

  19. Keith Veleba said about 22 hours later:

    Do you replicate session information with MogileFS?

  20. Keith Pitty said 1 day later:

    How many FastCGI server processes per web server?

  21. Eric Hodel said 1 day later:

    Keith V: No, we only use it for an image store.

    Sessions are stored in memcached.

  22. Eric Hodel said 1 day later:

    Keith P: I’ll post a rough outline of our software configuration after I’m feeling well again.

  23. Zack Chandler said 1 day later:

    I use Rackspace also – great company, although pricey. I didn’t know they did FreeBSD though.

  24. Todd Huss said 1 day later:

    Great post, thanks for the info! It’s interesting to compare setups, I’ve written about our setup here:

    gabrito.com/post/website-hardware-and-our-setup

  25. Paul said 1 day later:

    Curious are you using Apache2 + mod_fcgi or mod_fastcgi? Worker or prefork? And I doubt it, but any PHP running around on those servers?

    Great to read though.

  26. Chuq said 1 day later:

    Eric, interesting that you guys don’t have a backup database server. I’m assuming that you guys are using a battery backed RAID controller at very least?

  27. djwm said 2 days later:

    Regarding system load: it’s a very unrepresentative metric. You can have a box with a load of 500 that runs almost as if it has no work to do at all.

    A good example is a forking ftp server. The ftp processes are lightweight and just do i/o in little spurts. High load, but FreeBSD just kicks ass for this…

    Or a box with a flat load of 1.00 from a single process that sits on CPU all the time and runs at 100% of the system’s capacity: any kqueue-ified single-process daemon, such as thttpd.

    I find MySQL is very good at operating under load on FreeBSD 6. I maintain a Bacula backup server for our business, and some of the tables used for the backup db have a lot of rows (10 million+). I can run 40 concurrent backup jobs into this db – AND run a table optimise query at the same time – and the box sits at about 60% CPU, disk at 20MB/s.

    Did I mention, the box is a 2GHz P4 with 2GB RAM and a ATA100 drive? :)

    ... and it’s my KDE desktop too.

    FreeBSD++

  28. Doug D said 11 days later:

    The load balancer is most likely a Cisco CSS series device. Cisco is our preferred vendor for almost all of our network devices. (I work for Rackspace =D)

  29. Michael Shigorin said 19 days later:

    > Most importantly, lighttpd doesn’t have a mod_deflate equivalent

    It’s also weird code; everyone on our team who at first admired it later regretted it for various reasons (not handling “+” in filenames, or whatever architectural problems).

    No one has got off nginx as a frontend/static server though, it’s very robust. Definitely worth looking at.
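The capacity rules debated in comments 11 and 14 differ only in how many compute cores you count. As a rough back-of-the-envelope sketch (the helper below is illustrative, not anyone’s actual tooling, and it assumes load average approximates runnable processes, which comment 27 rightly cautions against):

```ruby
# Load average at which to add a node, if you target running at 50%
# of maximum capacity (one runnable process per compute core).
# Illustrative helper only, not the Robot Co-op's actual tooling.
def add_node_load_threshold(cores, target_utilization = 0.5)
  cores * target_utilization
end

add_node_load_threshold(4) # count 4 HTT logical CPUs: add a box at load 2.0
add_node_load_threshold(2) # count 2 physical cores:   add a box at load 1.0
```

Whether 4 or 2 is the right input is exactly the HTT disagreement: hyper-threading doubles the logical CPU count the OS reports, but adds much less than double the actual throughput.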

