Posted by Scott Laird
Tue, 23 Oct 2007 00:06:37 GMT
Interesting trivia: staggered drive spin-up is an optional part of the SATA II spec. Unlike most SCSI drives, though, it’s not controlled through a jumper. Instead, it uses one pin on the SATA power connector. If pin 11 is floating, then the drive is supposed to wait to start spinning.
Apparently the 650W power supply in my new server only provides enough current to spin up 10 drives at once, because adding an 11th drive makes it turn off on its own immediately. Sigh. I wonder if anyone makes delayed-spin SATA power dongles?
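Here's a back-of-the-envelope sketch of why staggering would help. The per-drive currents and the 12 V rail limit are made-up illustrative numbers, not measurements from this power supply or these drives; they're only chosen so that 10 drives fit and 11 don't, as happened above. The function name is likewise hypothetical.

```python
# Rough model of 12 V rail load while drives spin up.
# All three constants are illustrative assumptions, not measured values.
SPINUP_AMPS = 2.0        # assumed peak 12 V draw per drive during spin-up
IDLE_AMPS = 0.5          # assumed 12 V draw per drive once it is spinning
RAIL_LIMIT_AMPS = 21.0   # assumed usable 12 V capacity of the supply


def peak_amps(drives: int, group_size: int) -> float:
    """Worst-case 12 V draw when drives spin up `group_size` at a time."""
    if drives < 0 or group_size < 1:
        raise ValueError("drives must be >= 0 and group_size >= 1")
    peak = 0.0
    spinning = 0  # drives that have already finished spinning up
    while spinning < drives:
        starting = min(group_size, drives - spinning)
        load = spinning * IDLE_AMPS + starting * SPINUP_AMPS
        peak = max(peak, load)
        spinning += starting
    return peak


if __name__ == "__main__":
    for count in (10, 11):
        together = peak_amps(count, count)  # every drive starts at once
        staggered = peak_amps(count, 1)     # one drive at a time
        verdict = "ok" if together <= RAIL_LIMIT_AMPS else "over the limit"
        print(count, "drives:", together, "A at once", f"({verdict}),",
              staggered, "A staggered")
```

Under these assumed numbers, starting everything at once is what blows through the limit; spinning drives up one at a time keeps the peak far below it.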
Tags power, sata, server | 5 comments
Posted by Scott Laird
Sun, 21 Oct 2007 03:33:00 GMT
I now have Solaris up and running and reasonably stable-looking, after only 12 hours of work. A number of things turned out to be bigger issues than I’d anticipated, largely because it’s been years since I last used Solaris and, frankly, Solaris’s disk partitioning and formatting tools suck.
- My first problem is still unresolved: my BIOS refuses to boot from the IDE DVD drive that I installed. Once the system boots, it works just fine, so I’m not sure what’s up. Maybe a BIOS bug. Fortunately, the system’s perfectly happy booting off a USB DVD drive, and (amazingly) Solaris is happy installing from it.
- The GC-RAMDISK card that I was looking forward to testing is a complete failure so far. I don’t know if I have a bad card or if it’s simply incompatible with both SATA chips in the system, but the BIOS completely ignores it if it’s plugged into the motherboard, and Solaris fails to talk to it on either bus. If it’s plugged into the MB, then I get a “device failed initialization” error; if it’s plugged into the PCI-X SATA card then I get “device on port 5 still busy after reset.” I’ve swapped cables and RAM. I’d really like to get it to work, so I’m going to try it with an older system before RMAing it.
- Actually getting a working Solaris install took me 3 tries. The first time I installed it to the wrong drive (the first disk on the PCI-X card, not the first disk on the motherboard), and it was unable to mount the root partition after rebooting. Next, I managed to install it onto an EFI partition, and that wouldn’t boot either. Finally, I installed it onto the second drive on the right bus, and that worked.
- Since Solaris’s installer doesn’t support ZFS yet, I had to manually copy the root filesystem onto a newly created ZFS filesystem mirrored across a pair of drives. The directions are helpful, but I kept screwing things up. First, I accidentally created the ZFS pool using the entire disk, which made ZFS re-label the drives with EFI, which makes them unbootable. Then, I missed the line in the directions that says to run format -e instead of using format; that left me with a pair of nicely partitioned drives that still used EFI. The third try worked, and the system is now booting off of ZFS via GRUB without problems. Er…
- Well, one problem–I can’t change the GRUB menu.lst file for some reason. I don’t know where GRUB is looking for it, but it’s not in /boot/grub/menu.lst on my boot array. My changes are being completely ignored. I can live with this for the weekend.
- OpenSolaris doesn’t ship with drivers for the ASUS P5K WS’s onboard Ethernet chips. I had to grab them from Marvell, but it was easy enough to install.
- Creating an 8-drive ZFS filesystem is trivial. One command takes care of RAID, logical volume management, creates the filesystem, and mounts it:
zpool create -f space raidz2 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0
- ZFS performance is decent. Here’s a bonnie with and without ZFS compression, using 10 GB of data on a box with 2 GB of RAM:
           -------Sequential Output-------- ---Sequential Input-- --Random--
           -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
zfs     10 105.0 55.3 163.3 27.3 121.0 30.4 119.2 88.4 287.1 36.2   169  1.8
zfs+c   10 112.9 59.7 181.5 30.3 127.8 29.1 118.1 86.0 424.9 52.2   198  2.1
- 163 MB/sec writing and 287 MB/sec reading is good enough for me. I was expecting slightly higher numbers, but there’s nothing here to complain about. Adding compression improves writing a bit and makes a big difference reading. It’s quite a bit faster than GigE, which was my goal.
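For a quick sanity check on the pool layout and the benchmark numbers above, here's a small Python sketch. The throughput figures are copied from the bonnie table; the 125 MB/sec GigE ceiling is just 1000 Mbit/sec divided by 8, ignoring protocol overhead; and the function and dictionary names are made up for illustration.

```python
# Block throughput (MB/sec) copied from the bonnie results in this post.
BONNIE = {
    "zfs": {"block_write": 163.3, "block_read": 287.1},
    "zfs+c": {"block_write": 181.5, "block_read": 424.9},
}

# Theoretical gigabit Ethernet ceiling: 1000 Mbit/sec / 8 bits per byte.
# Real-world throughput is lower once protocol overhead is counted.
GIGE_MB_PER_SEC = 1000 / 8


def raidz2_data_drives(total_drives: int) -> int:
    """raidz2 spends two drives' worth of space on parity."""
    if total_drives < 3:
        raise ValueError("raidz2 needs at least 3 drives")
    return total_drives - 2


def compression_speedup(metric: str) -> float:
    """Ratio of the compressed run to the uncompressed run for one metric."""
    return BONNIE["zfs+c"][metric] / BONNIE["zfs"][metric]


def beats_gige(run: str) -> bool:
    """True if both block read and block write exceed the GigE ceiling."""
    return min(BONNIE[run].values()) > GIGE_MB_PER_SEC


if __name__ == "__main__":
    print("data drives in an 8-drive raidz2:", raidz2_data_drives(8))
    for metric in ("block_write", "block_read"):
        print(metric, "speedup with compression:",
              round(compression_speedup(metric), 2))
    print("uncompressed run beats GigE:", beats_gige("zfs"))
```

The speedup from compression depends heavily on how compressible the benchmark data is, so treat those ratios as specific to this bonnie run.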
Tags gs, opensolaris, ramdisk, server, zfs | 7 comments
Posted by Scott Laird
Sat, 20 Oct 2007 04:56:19 GMT
A few days ago, I mentioned that my home NAS box had failed, and that I was considering replacing it with a PC server running OpenSolaris and ZFS. I’ve read a pile of ZFS docs, and it looks like the best option available to me today, so I decided to order some suitable hardware.
At that point, pretty much everything broke down. I have a hard enough time keeping track of which hardware works with Linux this week, and OpenSolaris is completely new to me. Sun’s list of officially-supported hardware is pretty sparse, and digging through their mailing list archives gets frustrating quickly. From what I can tell, it boils down to:
- Current Intel and AMD CPUs are all fine.
- Most of Intel’s chipsets are fine.
- Most of nVidia’s AMD chipsets are fine.
- nVidia and Intel video chips are good.
- Most common Ethernet chipsets are either supported natively or have drivers available.
- The only SATA controllers that work are Intel’s ICH southbridges, Silicon Image’s PCI and PCI-E chips, Marvell’s PCI chips, and nVidia’s southbridges. It’s not clear that Marvell’s PCI-E chips work. Most motherboards with additional, non-southbridge SATA ports probably won’t work.
- Venturing too far outside of this list will probably result in problems.
I was looking for a motherboard with 8 SATA ports, and was hoping that the Intel D975XBX2 (“Bad Axe 2”) would work, but 4 of its 8 SATA ports belong to a Marvell PCI-E SATA chip that doesn’t appear to be supported. I went through every single 8-port motherboard in Newegg’s catalog before settling on the ASUS P5K WS (the ‘WS’ is important–the P5K is a different board). It only has 6 on-board SATA ports, but it includes a PCI-X slot. That’ll let me use the Supermicro AOC-SAT2-MV8, which is far and away the cheapest 8-port SATA card on the market. That’ll give me a total of 14 SATA ports, which should be enough for whatever I want to throw at it. The Marvell PCI-X chip at the heart of the Supermicro card is the same one used in Sun’s Sun Fire x4500 48-drive server, so it’s safe to assume that Sun has put a lot of effort into the driver.
Most of the rest of the system is fairly generic–a cheap nVidia 7200GS video card (the cheapest PCI-E card that NewEgg carries), a nice case and power supply, RAM, and a boatload of drives.
The one odd component that I’ve added is a Gigabyte GC-RAMDISK with 1 GB of RAM. The GC-RAMDISK is a battery-backed SATA ramdisk; it looks like a hard drive to the system and can survive up to 18 hours without power. I’ve had my eye on this thing for years, and it looks like it’ll be a perfect external log device for ZFS. I had to ask to see how ZFS will behave if the device fails, and it looks like manual intervention may be required after an 18+ hour power outage, but it should be pretty minimal. I’m planning on posting some benchmarks here once I’ve had a chance to try it out.
Assuming that I’m able to get this whole mess to work at all, I should have lots to write about here over the next week or so. I’m going to start by explaining why I want to use Solaris instead of Linux or *BSD, and why I’m building something instead of buying a pre-built NAS box.
Tags home, opensolaris, raid, server, zfs | 3 comments
Posted by Scott Laird
Thu, 17 Jun 2004 19:10:07 GMT
It looks like Tyan is getting into the server business. Their hot new 4-proc Opteron motherboard just started shipping, and they’re going down the same road as Supermicro, selling motherboards as well as bare-bones servers based on the motherboards.
By and large, this is a good thing–this class of system tends to be a bit better-integrated than the usual generic white box rackmount server. I’m still waiting for someone to put a decent managed power supply into a cheap server, though. Intel’s had specs for extending IPMI into power supplies for years, but no one really seems to care enough. Since power supplies are the flakiest component in most systems, anything that can be done to improve their reliability would be great, even if it’s just giving a few hours’ warning before the power supply dies.
Posted in Computer Hardware | Tags amd, hardware, opteron, server, tyan | no comments
Posted by Scott Laird
Wed, 10 Sep 2003 18:25:58 GMT
I try to be pragmatic, but sometimes I just can’t help it and try to pick up lost causes. I think the world would be a better place if computers were easier to maintain. Fortunately, I’m a server person, and the server side of things is actually a lot easier than the desktop side, at least for now.
Warning: most of my experience is with Linux and Solaris boxes in ISP-like settings, although I’ve done a fair bit of time in small non-computer-related businesses and software houses. I have no idea how much of this applies to Windows.
I’ve been thinking about server management for years. Sometimes I’ve been paid for it; sometimes (like now) I’m paid for other things. I still can’t stop thinking about it, though. There has to be a better way to manage servers than what we’re doing now. As I mentioned yesterday, I think I might have a solution for at least a few common cases.
Traditionally, there are two models for server deployment: the heavyweight model (deploy a small number of servers and run lots of services on each) and the lightweight model (deploy a lot of servers, and run a small number of services on each). One of the problems is that, at least for small services, the heavyweight model seems cheaper. Why buy 10 servers that are going to sit 95% idle when you could buy 2 servers and have them be 75% idle? Or even one server that’ll only be 50% full?
What happens pretty much every time is that a couple of the services start conflicting with each other somehow–one needs perl 5.6 for something, while another needs 5.8. Or they need two different versions of the JVM. Or one needs a critical security upgrade that ends up killing another service. So, you keep tweaking things, and you (barely) keep everything running, largely by avoiding making changes. Except, when you avoid making small changes, you inevitably miss little security fixes and little bug fixes, and you drift further and further from the mainline of whatever OS you’re running.
So, inevitably, you reach the “server event horizon,” where things have grown so complex and unmanageable that the only thing you can do is buy 2-3 new computers to replace your one big system, and then slowly migrate services off of the old box onto the newer boxes. Except you end up with a lot of implicit assumptions lurking, such as that DNS and DHCP are on the same server, or that Apache and MySQL are on the same box, and it takes forever to untangle them. Even once that’s done, you’ll find out that people have hard-coded server names into applications deployed all over the company, and you’ll end up spending 3 months untangling your one heavyweight server that seemed like such a good way to save money at the time.
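To make the conflict problem concrete, here's a minimal sketch of how you might spot it before piling services onto one box. The service names and the JVM versions are invented for illustration; only the perl 5.6 versus 5.8 clash comes from the example above.

```python
from collections import defaultdict

# Hypothetical services and the runtime versions each one needs.
# Only the perl 5.6 / 5.8 pair echoes the text; the rest is made up.
SERVICES = {
    "webmail": {"perl": "5.6"},
    "billing": {"perl": "5.8", "jvm": "1.4"},
    "reports": {"jvm": "1.3"},
    "dns": {},
}


def find_conflicts(services):
    """Return {package: sorted versions} for every package that the
    co-hosted services need in more than one version."""
    wanted = defaultdict(set)
    for requirements in services.values():
        for package, version in requirements.items():
            wanted[package].add(version)
    return {
        package: sorted(versions)
        for package, versions in wanted.items()
        if len(versions) > 1
    }


if __name__ == "__main__":
    for package, versions in find_conflicts(SERVICES).items():
        print(package, "is needed in versions", ", ".join(versions))
```

Every entry in the result is a reason those services can't comfortably share one OS install, which is the argument for giving each one its own (real or virtual) machine.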
Conventional wisdom says that the way out of this problem is virtualization. Instead of buying 10 small computers, you buy one or two really big computers, partition them in software, and then install the software that you would have installed onto the little computers onto the partitions of the big computers. Lots of vendors love this model; IBM’s whole Linux-on-mainframes push is based on it. There are a couple of problems with it as I see it, though. First, you’ll end up paying a ton of money for virtualization hardware or software–VMWare wants at least $2,500 per server for their PC-based virtualization code; pretty much everything else is more expensive. Second, you’re still left with a bunch of small general-purpose servers that you need to manage individually, even if they do happen to physically reside within a single box. There are also reliability issues, but I’m going to ignore them for now; in my experience, even cheap PCs running Linux rarely crash, and when they do it’s usually a bad power supply, a bad hard drive, or bad RAM. Spending more money on hardware gives you multiple power supplies, better RAID, and more redundant memory. Plus buggy virtualization software, but we’ll come back to that, too.
Fortunately, the open-source world is making progress. User-mode Linux (UML) is making a lot of headway. It’s included in Linux 2.6, although it still needs a few little patches for optimum operation. It seems to have a 30% speed hit in a lot of cases; sometimes that’s a problem, sometimes it isn’t. Using it, you can build a big Linux host server, and then run a bunch of little virtualized servers on it for free. Sounds nice? Sort of–you still have to admin a ton of little general-purpose boxes, but at least you’ve mostly solved the dependency problem that killed us a few paragraphs ago.
The nice thing about Linux is that it’s so flexible. Unlike every other OS that I’m aware of, there’s no one environment that is definitively Linux. Instead, we have a herd of Linux distributions, ranging from Red Hat to Debian to Gentoo to “Linux From Scratch” all the way down to the mini Linux distribution that wireless access point vendors call “firmware.” There’s no real reason for a special-purpose DNS server to run a full Linux distribution, except that it’s usually less work that way. However, once we have a UML-based virtualization scheme in place, it can actually be easier to use specialized distributions than general-purpose ones. I mean, the hard parts of a distribution are generally the installer, the hardware handling, and the update code. With a virtualized server, none of that applies. There is no hardware, really (it all pretends to be the same), the installer is really just a script that copies a hard disk image into place on the host system, and the update system is even easier–just save the data and completely discard the old OS image. In an ideal world, the OS image would be completely read-only, with configuration settings and data kept outside of the server in a standardized format. Then, software upgrades are truly trivial–kill off the old server VM and start up a new VM using the old data.
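As a rough illustration of that upgrade model, here's a tiny Python sketch. It only models the bookkeeping (which image and which data volume a virtual server uses); it doesn't start or stop anything, and the class, field names, and file paths are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VirtualServer:
    """One virtual server: a throwaway OS image plus persistent data."""
    name: str
    image: str        # read-only OS image on the host (hypothetical path)
    data_volume: str  # config and data kept outside the image


def upgrade(server: VirtualServer, new_image: str) -> VirtualServer:
    """Model an upgrade as: discard the old image, keep the data as-is."""
    return replace(server, image=new_image)


if __name__ == "__main__":
    old = VirtualServer("dns1", "images/dns-1.0.img", "data/dns1")
    new = upgrade(old, "images/dns-1.1.img")
    print(old)
    print(new)
```

The point of the frozen dataclass is that nothing gets patched in place: an upgrade produces a new server description that differs from the old one only in its image.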
This won’t work for everything, of course. It’d be horrible for big database servers, or frankly big servers of any type. In my experience, though, there are a lot more small servers than there are big servers.
The other nice thing about this scheme is that the virtual server images are simple to build and easy to trade. They’re not utterly trivial, but it’s easier to build a server image than to actually write the server software or maintain a full-sized OS distribution. Given a standardized interface between the host OS and the server image (things like IP address, DNS server, hostname, logging, and all of the other little details needed to make a server run), there’s no real reason that you can’t swap between server images from different “vendors”, grabbing whatever best serves your needs.
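One way to picture that standardized interface is as a small, fixed set of settings that the host hands to whatever image it boots. The sketch below is only an assumption about what such an interface could look like: the field names and the KEY=value format are invented here, and the addresses come from the reserved documentation range.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GuestSettings:
    """Settings the host would hand to any server image (hypothetical)."""
    hostname: str
    ip_address: str
    dns_server: str
    log_host: str


def render(settings: GuestSettings) -> str:
    """Serialize the settings as sorted KEY=value lines for the guest."""
    pairs = sorted(asdict(settings).items())
    return "\n".join(f"{key.upper()}={value}" for key, value in pairs)


if __name__ == "__main__":
    settings = GuestSettings(
        hostname="dns1",
        ip_address="192.0.2.10",
        dns_server="192.0.2.1",
        log_host="192.0.2.2",
    )
    print(render(settings))
```

As long as every image reads the same keys, swapping one vendor's DNS image for another's shouldn't require touching anything on the host side.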
I’m starting to build a framework for this; more details as I have time to write them down.
Posted in Linux, LWVS | Tags management, server, servermanagement, sysadmin | 1 comment
Posted by Scott Laird
Tue, 26 Aug 2003 03:00:21 GMT
This morning I received word that Internap, my former employer, will probably be open-sourcing the software that I spent most of my time maintaining while I was there. This is a Good Thing.
The Reference System was the software system that we used to manage the configuration of roughly 700 servers spread across the US, plus Tokyo, London, and Amsterdam. It took about 1/2 of a staff member to maintain the software on all 700 systems, including upgrades, security fixes, and so on. We could literally clone or reconstruct any server in the company within minutes, plus the time needed to restore data (but not software) from tape.
I’ve had a ridiculous number of former co-workers ask me for a copy of the source to the reference system. Once you get used to it, you don’t want to manage piles of servers without it.
Posted in Linux | Tags internap, management, referencesystem, server, servermanagement | 2 comments