Intel moving towards on-chip memory controllers (and the end of dual-CPU systems)

Posted by Scott Laird Thu, 16 Jun 2005 17:24:53 GMT

The Inquirer reports that Intel’s Tukwila chip is going to have an on-board memory controller, just like all of AMD’s newer chips. Tukwila is a multi-core Itanium, and is due sometime in 2007; the Inquirer suggests that Xeons will probably get on-board memory controllers in the same basic timeframe, simply because this will let Intel use the same controller chips for both Xeon and Itanium systems.

Assuming that the rumor is true (and considering how well AMD’s on-board controller works, I’d be surprised if it’s not), Intel will probably end up putting 4-6 FB-DIMM channels on each CPU. Since each channel’s good for around 10 GB/sec, a dual-chip system with 6 channels per CPU could potentially have 120 GB/sec in memory bandwidth. Even better, it’d be possible to build a high-capacity server with 48 DIMM sockets spread over the 12 channels; with 4 GB DIMMs, that’s 192 GB in a relatively simple box.
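For anyone who wants to check the arithmetic, here’s a quick back-of-the-envelope sketch. The channel counts, the 10 GB/sec per-channel figure, and the DIMM numbers are just the guesses from the paragraph above, not published specs, and the function names are made up for illustration.

```python
# Back-of-the-envelope numbers for a hypothetical dual-chip FB-DIMM system.
# All inputs are the guesses from the post, not published specifications.

def aggregate_bandwidth_gb_s(chips, channels_per_chip, gb_s_per_channel):
    """Peak memory bandwidth if every channel on every chip is saturated."""
    return chips * channels_per_chip * gb_s_per_channel

def max_capacity_gb(dimm_sockets, gb_per_dimm):
    """Total RAM with every socket populated."""
    return dimm_sockets * gb_per_dimm

CHIPS = 2
GB_S_PER_CHANNEL = 10   # "around 10 GB/sec" per FB-DIMM channel (assumed)
DIMM_SOCKETS = 48
GB_PER_DIMM = 4

for channels in (4, 6):  # the guessed range of channels per CPU
    total = aggregate_bandwidth_gb_s(CHIPS, channels, GB_S_PER_CHANNEL)
    print(f"{channels} channels/CPU: {total} GB/sec peak")

total_channels = CHIPS * 6
print(f"{DIMM_SOCKETS // total_channels} DIMMs per channel")
print(f"{max_capacity_gb(DIMM_SOCKETS, GB_PER_DIMM)} GB total")
```

These are peak numbers; a real system would land somewhere below them.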

This assumes that multi-CPU systems remain common; given the way that multi-core systems are progressing, I’m not sure that there will really be a market for commodity multiple-CPU-chip systems after 2007 or so–if you can get 8 cores on a single chip, why would you pay the complexity cost of adding more chips, except for really high-end stuff? Even today, compare the cost and performance of an Athlon 64 X2 vs a system with 2 single-core Opteron 2xx chips–the Opteron system will have a bit more memory bandwidth, but they’ll have similar performance on a lot of workloads, and an Athlon 64 X2 with a cheap motherboard will be cheaper than most dual-CPU Opteron motherboards, never mind the CPUs.

Dual-CPU systems have been the bread and butter of the PC server world for the last 5-7 years, but I doubt that they have more than another two years to go before they fade into the sunset. Personally, I’d much rather manage a handful of single-chip, 8-core, clustered, virtualized systems (where virtual environments can migrate between physical systems under explicit admin control) than a smaller number of 2-4 CPU, 16-32 core systems.


The K9 does virtualization?

Posted by Scott Laird Sat, 13 Nov 2004 01:39:11 GMT

CNet has a summary of AMD’s latest analyst meeting, including details on future chips:

On the high end, AMD will release chips with two processing cores in 2005 and then follow in 2006 with chips based around a new chip core code-named Pacifica.

The company is relatively tight-lipped about Pacifica, but said it will be a dual-core chip that also contains virtualization technology–which allows a computer to run multiple OSes–and a security technology called Presidio. Pacifica will appear in desktops, notebooks and servers in 2006. AMD says it will also come out with a new ultra low-power chip for notebooks.

The article also mentions that Intel has been talking about virtualization lately. Considering that PCs have been fighting with virtualization since the 386 was first introduced, I’m amazed that it’s taken 20 years to get full-blown virtualization support in mainstream PC chips. Anyone who’s been around PCs for a while remembers the whole mess with the 386: it could run multiple DOS programs at the same time, given a decent vm86 environment like DESQview or Windows/386, but as soon as you tried to run something that needed 286- or 386-specific features, the whole house of cards came tumbling down, because the 386 couldn’t virtualize itself. Neither could the 486, Pentium, or any of the other x86 chips that have shown up since then.

Of course, we’re better at cheating now: programs like VMware and Xen have shown that it isn’t really that hard to work around the CPU’s virtualization problems, but you end up paying a price. With VMware, it’s performance; with Xen, it’s patching the guest OSes to not use specific CPU instructions.
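To make the “specific CPU instructions” problem a bit more concrete, here’s a toy sketch of the classic test for whether a CPU can be virtualized by simple trap-and-emulate: every instruction that touches privileged machine state has to trap when it runs outside of ring 0, so that the monitor gets a chance to step in. The four-instruction table is a tiny hand-picked sample, not a full list, and the class and function names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instruction:
    name: str
    sensitive: bool           # reads or changes privileged machine state
    traps_in_user_mode: bool  # faults when executed outside ring 0

# A small illustrative sample of x86 instructions (not exhaustive).
INSTRUCTIONS = [
    Instruction("MOV reg, reg", sensitive=False, traps_in_user_mode=False),
    Instruction("LGDT", sensitive=True, traps_in_user_mode=True),
    # SGDT exposes the real descriptor table location without trapping.
    Instruction("SGDT", sensitive=True, traps_in_user_mode=False),
    # POPF silently ignores changes to the interrupt flag instead of trapping.
    Instruction("POPF", sensitive=True, traps_in_user_mode=False),
]

def unvirtualizable(instructions):
    """Sensitive instructions that a monitor never gets to intercept."""
    return [i.name for i in instructions
            if i.sensitive and not i.traps_in_user_mode]

def classically_virtualizable(instructions):
    """True only if every sensitive instruction traps in user mode."""
    return not unvirtualizable(instructions)

print(unvirtualizable(INSTRUCTIONS))
print(classically_virtualizable(INSTRUCTIONS))
```

Anything that `unvirtualizable` returns has to be handled some other way: either the monitor finds and rewrites those instructions at runtime, which costs performance, or the guest OS gets patched so that it never issues them in the first place.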

Even once the CPU’s been virtualized, the hardest part remains: virtualizing the rest of the machine. Xen’s approach is very open-source centric: they require the guest OS to be ported to Xen, including Xen-specific drivers, rather than emulating specific PC hardware in the virtual machine monitor. Long-term, that’s probably the most reasonable way to handle things, at least in the open-source world.

I’m looking forward to AMD’s new offerings. Pity we have to wait more than a year for them.


Horus details

Posted by Scott Laird Wed, 01 Sep 2004 20:36:21 GMT

There’s been a fair bit of discussion online about “Horus”, Newisys’s glue chip for building 8–32 processor Opteron systems. The Horus architecture glues 4-processor clumps of CPUs into a larger system. This lets them overcome the Opteron’s 8-processor scaling limit and build really big boxes.

Fortunately, one of Horus’s designers has posted some details to comp.arch. Here’s a quick summary:

  • Each Horus chip has 7 links: one to each of the 4 local CPUs and 3 to other Horus chips.
  • The inter-Horus link is a proprietary version of HyperTransport, modified to work better over cables.
  • While they can scale to 32 processors, going past 16 costs extra latency, because Horus only has 3 inter-Horus links. The sweet spot is 8–16 CPUs.
  • Their reference design uses 2 HT links per quad for I/O. This means that some intra-quad IPC has to go through an intermediary.
  • They designed for a NUMA factor of 3–remote memory costs 3x what local memory costs.
  • The Horus architecture can support up to 64 MB of “remote cache.” It’s still unclear if that’s 64 MB total spread across 8 Horus chips, 64 MB per chip, or 64 MB on a dedicated “Horus Cache Chip” that replaces a CPU quad in the design.
  • The inter-Horus links can be reconfigured and reset on the fly; this will allow for hot-swapping and partitioning.
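The link counts above explain why 16 CPUs is the sweet spot. Here’s a rough sketch of the reasoning: with 3 inter-Horus links, a chip can reach at most 3 peers directly, so 4 quads (16 CPUs) can be fully meshed, and anything bigger needs at least one extra hop. The 8-quad wiring below (a ring plus a link to the opposite chip) is purely a made-up example for illustration; the comp.arch post doesn’t say how Newisys actually cables larger systems.

```python
from collections import deque

LOCAL_CPU_LINKS = 4    # one link to each CPU in the quad
INTER_HORUS_LINKS = 3  # links to other Horus chips

def max_direct_quads(inter_links):
    """A chip with k peer links can sit in a full mesh of at most k + 1 chips."""
    return inter_links + 1

def worst_case_hops(adjacency):
    """Largest shortest-path distance between any two chips (BFS from each)."""
    worst = 0
    for start in adjacency:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for peer in adjacency[node]:
                if peer not in dist:
                    dist[peer] = dist[node] + 1
                    queue.append(peer)
        worst = max(worst, max(dist.values()))
    return worst

def full_mesh(n):
    """Every chip linked directly to every other chip."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def ring_with_chords(n):
    """Hypothetical wiring: both ring neighbours plus the chip opposite."""
    return {i: [(i - 1) % n, (i + 1) % n, (i + n // 2) % n] for i in range(n)}

quads = max_direct_quads(INTER_HORUS_LINKS)
print(f"{quads} quads ({quads * 4} CPUs) fit in a full mesh")
print(f"4 quads: worst case {worst_case_hops(full_mesh(4))} inter-Horus hop(s)")
print(f"8 quads: worst case {worst_case_hops(ring_with_chords(8))} inter-Horus hop(s)")
```

Whatever the real topology looks like, the degree limit is the same: past 4 quads, some remote memory accesses have to cross more than one inter-Horus link.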

We can expect to see Horus systems show up late next year. Since AMD’s dual-core Opterons are due in about the same timeframe, we should see some fascinatingly huge PCs by Christmas 2005. The basic design should be similar to Sun’s Enterprise Server [3456]xxx systems–a bunch of plug-in slots that take 4 CPUs and a bunch of RAM each. Since Horus clearly wants I/O to be local to each quad, it’s unclear exactly how networking and disk I/O will work. Will there be FC and GigE controllers on each quad, with the system routing them through to the backplane? Will each card have I/O on the front panel, even though that makes swapping CPU cards a royal pain? Will the manufacturers include an “I/O node” on the motherboard with its own Horus and a bunch of HT-to-PCI-X bridge chips? Hopefully, we’ll see a few of each design and the market can sort it out.


Tyan's getting into the server business

Posted by Scott Laird Thu, 17 Jun 2004 19:10:07 GMT

It looks like Tyan is getting into the server business. Their hot new 4-proc Opteron motherboard just started shipping, and they’re going down the same road as Supermicro, selling motherboards as well as bare-bones servers based on the motherboards.

By and large, this is a good thing–this class of system tends to be a bit better-integrated than the usual generic white box rackmount server. I’m still waiting for someone to put a decent managed power supply into a cheap server, though. Intel’s had specs for extending IPMI into power supplies for years, but no one really seems to care enough. Since power supplies are the flakiest component in most systems, anything that can be done to improve their reliability would be great, even if it’s just giving a few hours’ warning before the power supply dies.
