Posted by Scott Laird
Wed, 06 Apr 2005 14:10:19 GMT
I finally have Xen working on a system at home. I hadn’t expected this to be very difficult, but apparently Xen doesn’t like my new Athlon 64 system (bought mostly for running Xen). They’ll fix it eventually, but for now I’m using an old Athlon 700 system that I had sitting around. It needed a new CPU fan (just try finding Slot A fans these days!), but I was able to scrounge up 512 MB of RAM and an 80 GB hard drive, so it’s perfectly usable.
I built a couple quick disk images and booted them under Xen, and everything worked as expected. This is always a good sign, and it suggests that I’ll be able to make progress on my little virtual-server project without a whole lot of trouble.
Posted in Xen, LWVS | Tags linux, virtualization, xen | 1 comment
Posted by Scott Laird
Tue, 08 Mar 2005 00:36:14 GMT
I’ve been watching Xen for a while now, and I’m nearly ready to take the jump and do some testing with it. I’m thinking about ordering a cheap Athlon 64 box for home to use as a testbed for the lightweight server concept that I’ve been kicking around for years. In the 18 months that have passed since I last talked about it, virtualization on the PC has advanced by leaps and bounds; at the time, I was looking at UML, which wasn’t really fast or stable enough. Xen looks to be both fast and stable, and it has a clear migration path onto the virtualization hardware offered by the next generation of PC hardware. That makes it nearly ideal for my purposes.
Posted in Linux, Xen, Computer System Administration, LWVS | Tags sysadmin, xen | no comments
Posted by Scott Laird
Wed, 10 Sep 2003 18:25:58 GMT
I try to be pragmatic, but sometimes I just can’t help it and try to pick up lost causes. I think the world would be a better place if computers were easier to maintain. Fortunately, I’m a server person, and the server side of things is actually a lot easier than the desktop side, at least for now.
Warning: most of my experience is with Linux and Solaris boxes in ISP-like settings, although I’ve done a fair bit of time in small non-computer-related businesses and software houses. I have no idea how much of this applies to Windows.
I’ve been thinking about server management for years. Sometimes I’ve been paid for it; sometimes (like now) I’m paid for other things. I still can’t stop thinking about it, though. There has to be a better way to manage servers than what we’re doing now. As I mentioned yesterday, I think I might have a solution for at least a few common cases.
Traditionally, there are two models for server deployment: the heavyweight model (deploy a small number of servers and run lots of services on each) and the lightweight model (deploy a lot of servers and run a small number of services on each). One of the problems is that, at least for small services, the heavyweight model seems cheaper. Why buy 10 servers that are going to sit 95% idle when you could buy 2 servers and have them be 75% idle? Or even one server that’ll only be 50% full?
What happens pretty much every time is that a couple of the services start conflicting with each other somehow: one needs Perl 5.6 for something, while another needs 5.8. Or they need two different versions of the JVM. Or one needs a critical security upgrade that ends up killing another service. So you keep tweaking things, and you (barely) keep everything running, largely by avoiding making changes. Except, when you avoid making small changes, you inevitably miss little security fixes and little bug fixes, and you drift further and further from the mainline of whatever OS you’re running.
So, inevitably, you reach the “server event horizon,” where things have grown so complex and unmanageable that the only thing you can do is buy 2-3 new computers to replace your one big system, and then slowly migrate services off of the old box onto the newer boxes. Except you end up with a lot of implicit assumptions lurking (that DNS and DHCP are on the same server, or that Apache and MySQL are on the same box), and it takes forever to untangle them. Even once that’s done, you’ll find out that people have hard-coded server names into applications deployed all over the company, and you’ll end up spending 3 months untangling your one heavyweight server that seemed like such a good way to save money at the time.
Conventional wisdom says that the way out of this problem is virtualization. Instead of buying 10 small computers, you buy one or two really big computers, partition them in software, and then install the software that you would have installed onto the little computers onto the partitions of the big computers. Lots of vendors love this model; IBM’s whole Linux-on-mainframes push is based on it. There are a couple of problems with it as I see it, though. First, you’ll end up paying a ton of money for virtualization hardware or software: VMware wants at least $2,500 per server for their PC-based virtualization code, and pretty much everything else is more expensive. Second, you’re still left with a bunch of small general-purpose servers that you need to manage individually, even if they do happen to physically reside within a single box. There are also reliability issues, but I’m going to ignore them for now; in my experience, even cheap PCs running Linux rarely crash, and when they do it’s usually a bad power supply, a bad hard drive, or bad RAM. Spending more money on hardware gives you multiple power supplies, better RAID, and more redundant memory. Plus buggy virtualization software, but we’ll come back to that, too.
Fortunately, the open-source world is making progress. User-mode Linux (UML) is making a lot of headway. It’s included in Linux 2.6, although it still needs a few little patches for optimum operation. It seems to have a 30% speed hit in a lot of cases; sometimes that’s a problem, sometimes it isn’t. Using it, you can build a big Linux host server, and then run a bunch of little virtualized servers on it for free. Sounds nice? Sort of. You still have to admin a ton of little general-purpose boxes, but at least you’ve mostly solved the dependency problem that killed us a few paragraphs ago.
The nice thing about Linux is that it’s so flexible. Unlike every other OS that I’m aware of, there’s no one environment that is definitively Linux. Instead, we have a herd of Linux distributions, ranging from Red Hat to Debian to Gentoo to “Linux From Scratch” all the way down to the mini Linux distribution that wireless access point vendors call “firmware.” There’s no real reason for a special-purpose DNS server to run a full Linux distribution, except that it’s usually less work that way. However, once we have a UML-based virtualization scheme in place, it can actually be easier to use specialized distributions than general-purpose ones. I mean, the hard parts of a distribution are generally the installer, the hardware handling, and the update code. With a virtualized server, none of that applies. There is no hardware, really (it all pretends to be the same); the installer is really just a script that copies a hard disk image into place on the host system; and the update system is even easier: just save the data and completely discard the old OS image. In an ideal world, the OS image would be completely read-only, with configuration settings and data kept outside of the server in a standardized format. Then software upgrades are truly trivial: kill off the old server VM and start up a new VM using the old data.
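To make the install-and-upgrade idea concrete, here is a minimal sketch in Python. It only models the file shuffling on the host: stopping and starting the VM itself is left out, and every name in it (`VirtualServer`, `install`, `upgrade`, the `root.img` and `data` paths) is made up for illustration rather than taken from any real tool.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass
class VirtualServer:
    """A disposable server: an OS image plus a separate data directory."""
    name: str
    image: Path     # OS image on the host; treated as read-only
    data_dir: Path  # configuration and data, kept outside the image


def install(name: str, vendor_image: Path, host_root: Path) -> VirtualServer:
    """'Install' a server by copying a disk image into place on the host."""
    server_dir = host_root / name
    data_dir = server_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    image = server_dir / "root.img"
    shutil.copyfile(vendor_image, image)
    image.chmod(0o444)  # mark the image read-only
    return VirtualServer(name, image, data_dir)


def upgrade(server: VirtualServer, new_vendor_image: Path) -> VirtualServer:
    """Discard the old OS image and copy in a new one; data_dir is untouched."""
    # Assumption: the old VM has already been shut down by the caller.
    server.image.chmod(0o644)  # make it removable on every platform
    server.image.unlink()
    shutil.copyfile(new_vendor_image, server.image)
    server.image.chmod(0o444)
    return server
```

Nothing in `upgrade` ever looks inside the old image, and that is the whole point: whatever state matters lives in `data_dir`, so the image can be thrown away without a second thought.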
This won’t work for everything, of course. It’d be horrible for big database servers, or frankly big servers of any type. In my experience, though, there are a lot more small servers than there are big servers.
The other nice thing about this scheme is that the virtual server images are simple to build and easy to trade. They’re not utterly trivial, but it’s easier to build a server image than to actually write the server software or maintain a full-sized OS distribution. Given a standardized interface between the host OS and the server image (things like IP address, DNS server, hostname, logging, and all of the other little details needed to make a server run), there’s no real reason that you can’t swap between server images from different “vendors”, grabbing whatever best serves your needs.
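One way such a host-to-image interface could look is a small, fixed set of settings that the host writes out and every image reads at boot. The sketch below is purely hypothetical: the field names, the `KEY=value` text format, and the `render`/`parse` helpers are all assumptions made for illustration, not an existing standard.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ServerSettings:
    """The details a host hands to any server image, whoever built it."""
    hostname: str
    ip_address: str
    dns_server: str
    log_host: str  # where the image should send its logs


def render(settings: ServerSettings) -> str:
    """Serialize the settings as KEY=value lines, one per field."""
    return "".join(
        f"{key.upper()}={value}\n" for key, value in asdict(settings).items()
    )


def parse(text: str) -> ServerSettings:
    """Read KEY=value lines back into a ServerSettings record."""
    pairs = dict(line.split("=", 1) for line in text.splitlines() if line)
    return ServerSettings(**{key.lower(): value for key, value in pairs.items()})
```

As long as two images agree on this format, the host can hand the same rendered settings to either one, which is what would make swapping between “vendors” painless.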
I’m starting to build a framework for this; more details as I have time to write them down.
Posted in Linux, LWVS | Tags management, server, servermanagement, sysadmin | 1 comment
Posted by Scott Laird
Wed, 10 Sep 2003 01:14:42 GMT
Once a sysadmin, always a sysadmin. I’m not really sure why (although I suspect the folks on alt.sysadmin.recovery would call it a character defect), but even though I’m mostly a programmer this year, visions of well-managed servers are still dancing through my head. Maybe they’ll stop someday.
Anyway, I’ve been stewing over some interesting ideas on server management recently, and I think we’ve been doing it all wrong. We’ve been trying to build strong, resilient, flexible servers that we can maintain for years, adding and removing services as needed. This is an outgrowth of the way programmers are trained to think, with maintainability prized as one of the highest virtues of software. The XP people have a slightly different take on things, but underneath it all they still prize maintainable systems. On c2.com’s wiki, the discussion of Christopher Alexander’s A Pattern Language is informative, as it shows parallels between software design and building design, concentrating on features in buildings that give them a long, useful life.
I’m not sure that we really want any of that for servers, though, or at least not for standard, cookie-cutter services like DNS. We should be aiming for disposable servers, where the only things that we care about are the configuration state, data, performance, and security. The actual system files, and even the OS, should be irrelevant, and even ignored if possible.
I’m working on a demonstration, along with a design for a larger-scale system that has some bizarrely nice properties. I don’t see why it wouldn’t work, frankly, and it’ll make a huge change in the amount of work needed to implement small services in networks.
Posted in Linux, Computer System Administration, LWVS | Tags sysadmin | no comments