Linux Today: Linux News On Internet Time.

Quad Xeon Processors Running NT Are A Weak Value Proposition

May 06, 1999, 15:56

by Scott Marlowe

Recently, an independent testing laboratory found NT on a 4-way Xeon machine to be faster than Linux on the same hardware. While this tidbit is interesting for many reasons, not the least of which are the questionable testing procedures and the name calling and mud slinging that have ensued, most people have missed the really important point: quad Xeon boxes are a very weak value proposition for Web servers.

Let's do a quick case study of the price / performance ratio of a quad Xeon processor box versus a load balanced server farm suitable for either SMB file serving or Web serving. The extra details pertaining to network configuration will not be covered in this document, only the servers themselves.

First, the prices for hardware I will be quoting will be from May 6, 1999. I will use Dell as the standard supplier, although any supplier will likely give similar prices for the type of machines we are looking for.

The goal: build a solid, reliable server for enterprise class use. Uptime should be as high as possible, preferably 100%. Cost should be kept low, but performance should not suffer simply to save money.

The server(s) should be able to provide 100Mbit/sec or better throughput under load as either a file server or as a static content HTML server. It should house 25 Gigabytes of online storage in a RAID or redundant configuration.

While the quad Xeon machine can meet and exceed most of these goals, it does so at tremendous expense. A machine of this caliber costs between $25,000 and $50,000, depending on what options you choose.

That's a big chunk of change no matter how you look at it.

Let's compare that to a farm of Linux (or FreeBSD/NetBSD/OpenBSD) boxes running under load balanced switches. Since a web serving farm scales at a nearly linear rate, there is no sense in buying the fastest machines made. What we want to shoot for is the most bandwidth per buck in each unit. For a Linux farm, dual CPU machines represent a fairly good trade off of price and performance.

We basically want to maximize performance and minimize cost. Since licensing isn't an issue with Linux or FreeBSD, and the energy required for each machine is fairly low, we probably have to worry more about shelf space than anything else.

From Dell, a Dual Pentium III 450 with 256 Megs of RAM, dual 100BaseTX NICs, low end video card, and four 9 Gig Ultra Wide hard drives sells for $4636.

Four of these machines will cost us $18,544 plus shipping. We'll call it $20,000 for four machines. Note that we will have a total of eight 450 MHz CPUs, with a total of 1 Gig of RAM between them. If you don't already have a load balancing switch to use, you may need to buy another machine for $3,000 to balance the load. While this machine will not need the large RAID array of the farm servers, it will need plenty of memory (hey, if you're building a balancer, you should put squid on it too) and four or more NICs, and probably an FDDI card as well. That will bring our bill to just under the cost of the lower end Quad Xeon machines.
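For anyone who wants to check the arithmetic, here is a minimal sketch of the farm bill using only the prices quoted above. The function name and the constants are made up for illustration, and shipping is left out, which is why the raw total comes in below the rounded figures in the text.

```python
# Prices quoted in the article (Dell, May 6, 1999), in dollars.
UNIT_PRICE = 4636       # one dual Pentium III 450 server
FARM_SIZE = 4           # number of servers in the farm
BALANCER_PRICE = 3000   # optional load balancing machine

def farm_cost(units=FARM_SIZE, unit_price=UNIT_PRICE, balancer=True):
    """Return the hardware cost of the farm, before shipping."""
    total = units * unit_price
    if balancer:
        total += BALANCER_PRICE
    return total

print(f"Four servers:  ${farm_cost(balancer=False):,}")
print(f"With balancer: ${farm_cost():,}")
```

Changing `units` shows how the bill grows one machine at a time, which is the whole point of the farm approach.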

Comparison Chart

                              Quad Xeon                      Farm
Cost                          $25,000 (to $50,000)           $24,000
Operating system              Windows NT                     Linux
Number of CPUs                4 x 500MHz (1 Meg cache)       8 x 450MHz (512k cache)
RAM                           1 gig                          1 gig
Aggregate network bandwidth   400Mb/s                        800Mb/s
Drive storage (RAID 5)        27 gigs total, hardware RAID   27 gigs per machine, software RAID

Even at the lowest cost of $25,000, the Xeon machine is still running half as many CPUs, and has fairly poor fault tolerance.

Price / Performance evaluation
While the quad Xeon machine may be as fast as the Linux farm, the cost of the operating system ($799 more) and lack of redundancy make it a weak value proposition. Also, its poor options for upgrades make it a dead end system.
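A rough way to put numbers on the value proposition is to divide cost by CPU count and by aggregate bandwidth, using the comparison chart figures. The sketch below does just that. The `Setup` class is a hypothetical helper, and the per-CPU figure ignores that the Xeon CPUs are faster and carry more cache, so treat it as a ballpark rather than a benchmark.

```python
from dataclasses import dataclass

@dataclass
class Setup:
    name: str
    cost: int        # dollars, low-end figure from the comparison chart
    cpus: int        # number of CPUs
    bandwidth: int   # aggregate network bandwidth in Mb/s

    def cost_per_cpu(self):
        return self.cost / self.cpus

    def cost_per_mbit(self):
        return self.cost / self.bandwidth

xeon = Setup("Quad Xeon", 25_000, 4, 400)
farm = Setup("Farm", 24_000, 8, 800)

for s in (xeon, farm):
    print(f"{s.name}: ${s.cost_per_cpu():,.0f} per CPU, "
          f"${s.cost_per_mbit():,.2f} per Mb/s")
```

On these figures the farm costs a bit under half as much per CPU and per Mb/s, even against the cheapest Xeon configuration.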

Failure Mode Analysis
We now look at the common types of failures for servers, and speculate about how these failures would affect each type of web server environment.

Failure type: Hardware failure, soft

  Effect on quad Xeon: System slows down as a whole. Depending on the failure, you may have to take down your server and replace the failed component (e.g. NIC failure, single hard drive failure, single power supply failure). Performance loss could be anywhere from 25% if a NIC fails to 70% or more if a hard drive fails.

  Effect on farm: System slows down as a whole. One of the four servers may be significantly slowed down, and may need to be taken down for a few hours to be fixed. However, the rest of the farm stays up. A single machine failing may result in 25% performance loss maximum. Other minor failures (one NIC failing, one hard drive failing) could result in 8 to 12% performance degradation.

Failure type: Hardware failure, hard

  Effect on quad Xeon: System stops. You must repair the system to bring it back online. At best you may be able to restore partial operability after removing the failed component and restarting without it. Performance loss is 100% until the server is fixed.

  Effect on farm: System slows by 25%. You must repair the one bad server and bring it back online. If a component is not available, but the server can still operate in a degraded manner, you may be able to reinsert it into the farm until you can get the component. Note that high end workstations could be tasked to take over this job until the parts arrive. System stays up in a slightly degraded manner.

Failure type: Software failure, soft

  Effect on quad Xeon: System stops temporarily. Users must wait while the server or a service restarts.

  Effect on farm: 25% performance degradation. System is back at 100% when the failed server is restarted.

Failure type: Software failure, hard

  Effect on quad Xeon: System stops. System administrator must be called on to bring system back up. Depending on damage, this may take several hours.

  Effect on farm: 25% performance degradation. System administrator must be called on to bring system back up. Depending on damage, this may take several hours.

Failure types:

  • Hardware, soft: Single item fails, but does not shut down server. System can be reconfigured on the fly to overcome these problems. Example: Failed NIC or hard drive in a RAID
  • Hardware, hard: Single or multiple items fail. Results in shutdown of the unit affected. Example: CPU locks up memory bus, server catches on fire, power spike kills both power supplies in the Xeon, etc.
  • Software, soft: Server or a service on it crashes. Must reboot server or restart service.
  • Software, hard: Server or service software becomes heavily corrupted. Requires OS and / or service to be completely reinstalled and tested before being placed back online.
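The hard failure cases boil down to simple arithmetic: if every machine carries an equal share of the load, losing one machine out of n costs you 1/n of your capacity. Here is a minimal sketch of that model. The function name is made up for illustration, and the equal-share assumption is a simplification that ignores any slowdown in the load balancer itself.

```python
def remaining_capacity(servers, failed):
    """Fraction of capacity left when `failed` of `servers` equal machines are down."""
    if servers < 1 or not 0 <= failed <= servers:
        raise ValueError("need servers >= 1 and 0 <= failed <= servers")
    return (servers - failed) / servers

# One quad Xeon box: a hard failure takes everything down.
print(f"Quad Xeon, one box down: {remaining_capacity(1, 1):.0%} capacity left")

# Four-machine farm: a hard failure on one box costs a quarter of capacity.
print(f"Farm, one box down:      {remaining_capacity(4, 1):.0%} capacity left")
```

The same function shows why a bigger farm degrades more gracefully: with eight machines, one failure costs only an eighth of capacity.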

So, imagine you've got one of these two setups running, and they are working fine. Average CPU loads are sitting at below 50%, and the customers are happy. Then, you get a large contract or you start to advertise. Suddenly, your CPUs are averaging 80% during the day, peaking at 100%, and you are getting error_logs full of messages about time-outs.

It's time to upgrade. How do you upgrade the quad Xeon machine? You don't, really. You replace it. At $25,000 it's hard to convince the boss to replace a server that's less than one year old, when the new server will only buy you another 6 months at best.

With the farm, you just buy new machines as you need them. And, as faster machines come in, you can add them to a cluster one at a time as you need them. Since they are standard dual processor workstations, they can arrive much faster as well, days instead of weeks. As long as you don't outrun your network connection speed, you can keep increasing your farm size as needed.
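To get a feel for how many machines that means, here is a back-of-the-envelope sketch. It assumes the farm scales linearly (the article only claims "nearly linear"), that all machines are identical, and that the network is not the bottleneck. The function name is hypothetical.

```python
import math

def machines_needed(current_machines, current_load, target_load):
    """Machines required to bring average CPU load down to target_load.

    Loads are fractions between 0 and 1. Assumes perfectly linear scaling
    across identical machines, which is an idealization.
    """
    if not 0 < target_load <= 1:
        raise ValueError("target_load must be in (0, 1]")
    total_work = current_machines * current_load
    return math.ceil(total_work / target_load)

# Four machines averaging 80% load, and we want to get back under 50%.
needed = machines_needed(4, 0.80, 0.50)
print(f"Need {needed} machines, so buy {needed - 4} more")
```

With the numbers from the scenario above, that works out to three additional machines, bought as you need them rather than as one $25,000 replacement.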

Well, that's my take on it. What's yours?

Scott Marlowe is from Jacksonville, Florida and currently resides in Englewood, Colorado. Scott is a curriculum developer, Web author, and system engineer at a medium-sized, Internet-oriented company. He is the father of two, an amateur saxophonist and a Linux enthusiast.