So, my only real dot-com tech splurge in 2000 was a set of 3 shiny new Athlon 700 systems with 30 GB drives and 256 MB of RAM. One was a desktop, one was a web server, and one was a file server.

The desktop started gathering dust when I bought my first PowerBook in 2002. The file server survived a bit longer before it failed. The web server, however, is still in use, 9 years later. It’s been maxed out on RAM for about 6 of those years, with a whopping 768 MB. At one point, it was my router, web server, mail server, BitTorrent client, Jabber server, and Asterisk VoIP server, all at the same time.

Unfortunately, when you cram that many things onto one system, eventually the complexity comes back to haunt you. At one point I couldn’t upgrade my blog because a different service on the same box was incompatible with newer Ruby interpreters. Some days Asterisk refuses to start after a reboot. Sometimes NFS causes kernel panics. Apache never shuts down cleanly on reboot, which usually means reaching for the reset button. I can fix any one of these, but it’s like flattening bubbles in wallpaper: push one down and another pops up somewhere else. There’s a limit to how long you can keep a single Linux install running, and 9 years is way, way past this one’s sell-by date.

So, once FiOS finally arrived in my neighborhood, I decided it was time to start replacing things. I didn’t think the old Athlon could really keep up with 20 Mbps of traffic, anyway.

I started by building a new router. I considered buying one, but a reasonably fast Cisco or Juniper box with all of the licenses needed for NAT and SIP proxying turns out to be painfully expensive. So instead I installed Vyatta on a cheap PC. Vyatta is great: it’s a router-specific Linux distribution that gives you a Juniper-ish CLI for managing your router config. The hardware is vastly over-specced, with a 2.5 GHz dual-core CPU and 4 GB of RAM, but hopefully I won’t need to replace this one for another 9 years.
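To give a flavor of the CLI, here’s roughly what a minimal LAN-plus-NAT setup looks like. Treat it as a sketch rather than my actual config: the interface names and addresses are placeholders, and the exact statement names vary a bit between Vyatta releases.

```
configure

# WAN side and LAN side; interface names and addresses are placeholders
set interfaces ethernet eth0 address dhcp
set interfaces ethernet eth1 address 192.168.1.1/24

# Masquerade outbound traffic from the LAN
set service nat rule 10 type masquerade
set service nat rule 10 outbound-interface eth0
set service nat rule 10 source address 192.168.1.0/24

commit
save
```

The part I like is that nothing takes effect until you commit, so you can review pending changes before applying them, which is exactly the Juniper-style workflow I wanted.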

Once that was done, I needed a system (or systems) to run all of the other little services that had piled up over the years. I’ve always been a big believer in partitioning services onto their own servers, for more isolation and more control, but it’s silly to have a dozen physical machines sitting around the house. My past experiments with virtualization were never completely successful: I have a machine running Xen here, but upgrading Xen itself is always nerve-wracking, because there’s no good way to test an upgrade that, if it goes wrong, takes down several VMs at once. I did a bunch of research, looking for a virtualization system that would let me run N VM instances across M physical machines, with a single interface for managing them and migrating VMs between machines. The enterprise versions of most virtualization systems can do this, but I had a hard time finding anything under $3k that could. That is, until I stumbled across Ganeti.

Ganeti is an open-source Xen cluster management system, originally developed by Google. You give Ganeti a pool of servers, tell it which VMs you want, and it takes care of the details. If you want, you can set up VMs with their disks replicated between a pair of servers (using DRBD), and Ganeti will handle migrating running VMs between machines, so you can take the underlying hardware down for maintenance.
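As a rough sketch of what that looks like in practice (the node and instance names here are made up, and the exact flags depend on which Ganeti version you’re running), creating a DRBD-mirrored VM and then pushing it off a node for maintenance goes something like this:

```
# Create a VM with its disk mirrored between two physical nodes.
# Names, sizes, and the OS definition are placeholders, and the
# flags vary a bit between Ganeti versions.
gnt-instance add -t drbd -o debootstrap \
    --disk 0:size=10G -B memory=512M \
    -n node1.example.org:node2.example.org \
    web1.example.org

# Move the running VM to its secondary node, e.g. before taking
# node1 down for a kernel upgrade.
gnt-instance migrate web1.example.org
```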

So, I bought a pair of cheap servers (Phenom II X3 720, 8 GB RAM, 500 GB disk, 2 GigE interfaces, about $400 each), installed Ganeti on them, and I’ve been slowly moving services onto the new machines. I started with easy things, like recursive DNS service: I created a pair of VMs, one on each system, and set them up as basic recursive DNS servers. It’s overkill, but I’m a big fan of overkill, at least when it’s cheap.
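Getting the cluster and the DNS pair going boiled down to something like this (the hostnames and sizes are placeholders, and as before the flags may differ slightly depending on your Ganeti version):

```
# One-time setup on the first machine; the cluster name resolves to a
# floating address that follows whichever node is currently the master.
gnt-cluster init ganeti.example.org

# Add the second physical machine to the pool.
gnt-node add node2.example.org

# One small, non-replicated ("plain") VM pinned to each physical node.
gnt-instance add -t plain -o debootstrap --disk 0:size=4G \
    -B memory=256M -n node1.example.org dns1.example.org
gnt-instance add -t plain -o debootstrap --disk 0:size=4G \
    -B memory=256M -n node2.example.org dns2.example.org
```

Since there are two independent resolvers, these don’t need DRBD; if one node dies, the other VM keeps answering queries.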

Since then, I’ve moved my HTTP reverse proxy/load balancer over to Ganeti, created a local git repository VM, and moved my blog, BitTorrent, logging, and so forth. I’m up to 10 VMs on Ganeti, with about 5 more to go.

My general rule of thumb is that with 2 servers, an automated management platform is more work than it’s worth. With 5 servers, it’s probably still too much work. With 10, you’ll probably see a benefit, and beyond that you’re going to suffer if you have to manage things by hand. Since I’m looking at 10-20 VMs eventually, I figured it was worth the time to look around and play with some new tools.

I’m currently using Puppet. It’s not perfect, but it seems good enough so far. I’ve put together a set of Puppet templates that describe a basic server, and then used those to define various server types (web server, DNS server, etc.). Setting up a new VM is reasonably easy now: I ask Ganeti to build me a new Debian VM with some amount of disk and RAM, and then I tell Puppet to take over. Puppet installs an extra couple dozen packages, sets passwords, sets up the right sudoers file, and so forth. My Puppet config lives in git, so it’s revision-controlled and replicated to multiple systems. Even better, Puppet keeps pushing the same content out to my systems over time, making sure that updates show up everywhere they’re needed.
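As a very rough sketch of what I mean (the class names, packages, and file paths here are invented for illustration rather than lifted from my real config), the manifests boil down to something like this:

```
# Baseline that every VM gets.
class base_server {
  package { ['sudo', 'ntp', 'vim', 'git-core']:
    ensure => installed,
  }

  # Ships a known-good sudoers file out of the (hypothetical) module.
  file { '/etc/sudoers':
    owner  => 'root',
    group  => 'root',
    mode   => '0440',
    source => 'puppet:///modules/base_server/sudoers',
  }
}

# A server type is just the baseline plus whatever the role needs.
class dns_server {
  include base_server

  package { 'bind9':
    ensure => installed,
  }

  service { 'bind9':
    ensure  => running,
    enable  => true,
    require => Package['bind9'],
  }
}

# New VMs just get handed a role.
node 'dns1.example.org', 'dns2.example.org' {
  include dns_server
}
```

Adding another VM of an existing type is then just a matter of adding a line to the node definitions and letting Puppet do the rest.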

So that’s where things stand. I’m actually really happy with the system as a whole. Vyatta makes a very nice router, Ganeti is great for managing small clusters of virtual machines, and Puppet does an okay job of corralling the VMs. I’ll write more on the specific details later, including some of my Puppet configs, more on Vyatta, and how I’m doing load balancing.