by Joe Barr

Originally published in May, 1996

SMP
1. Symmetrical Multi-Processing. The ability to distribute workload evenly among two or more CPUs.
2. The easiest way to dramatically boost system horsepower.

When DFI asked if I was interested in evaluating one of their dual-Pentium machines, I suddenly knew there was an intelligent power governing the universe. It was like asking a crack addict if he would like to sample a few rocks. Of course, when they've called in the days since then to ask when I'll be finished with my evaluation, I come down with a bad case of chronic vague. There are just so many possibilities to 'evaluate' with this machine that it's hard to say when I'll have had enough time to try them all.

The thing that sets this machine apart from your typical 133-MHz Pentium box is, of course, the fact that it has two 133-MHz Pentiums. To take advantage of both processors, however, you need an operating system written to the Intel SMP specification. At this writing there are several choices for desktop and workstation machines: OS/2, Windows NT, and a couple of flavors of Unix, including the suddenly very popular Linux. At the NOS level, you can add Novell, although it seems to be a hybrid mix of both symmetric and asymmetric design.

In asymmetric multiprocessing, certain tasks are assigned to specific processors. One CPU might be dedicated to handling I/O for the entire system, and another exclusively reserved for something else. Symmetrical multiprocessing is a democratic architecture: work is assigned based only on availability, not by playing favorites.

Advanced Programmable Interrupt Controllers (APICs) help level the workload. One APIC, the I/O APIC, sits between the CPUs and the mainboard. It handles all interrupts from the bus. Each CPU also has its own on-board APIC, the local APIC, to handle interrupts between it and the other CPU(s).
Cyrix and AMD Pentium equivalents do not have the local APIC, and that's why you can't use them in an SMP machine built to the Intel spec.

The I/O APIC acts like a cop directing traffic. Traffic in this case means interrupts from devices needing attention. Your modem, for example, generates an interrupt request when it has received a byte of data from your download of a file from alt.binary.furry.critters. We're talking IRQs here. You know, those things that drive you crazy when you assign the same one to more than one device. Whether it's the modem with a byte of data or the printer or the keyboard, each interrupt means that one of them needs the CPU to complete a task.

As CPU resources are needed by your application programs, they are dispatched in a similar manner. It gets a little more complicated for the operating system, but the applications don't care. Physical memory is shared by all processors and they all address it the same way. This means that when CPU 1 is told to fetch a value from an address it gets the same value that CPU 2 would get, or CPU 3, or any other.

Simple, isn't it? Actually, handling updates to cache memory and dispatching tasks by priority, even interrupting some to run others, add enough complexity to keep things interesting for systems programmers. Application programmers don't have to care. As long as the operating system works as it should on an SMP box, the applications will see the benefit.

The DFI evaluation unit (the "Double Shot") was priced at over $5,000.00 retail when I received it. It is based on the DFI G586VPM system board and a pair of 133-MHz Pentium processors. Dropping prices for Pentium CPUs and memory have brought the price down considerably. The Double Shot is packaged in an attractive tower case with 32 meg of EDO RAM and 512 KB of shared L2 cache. The system has a PCI/ISA mainboard and has server written all over it.
In addition to the 1.6 gigabyte Western Digital IDE drive, 3.5-inch floppy, and 6x CD-ROM, it comes with five empty bays and a 250-watt power supply.

The construction reminds me of Hewlett-Packard boxes: heavy and strong. The outer shell comes off in pieces. Top and/or sides can be removed to expose whatever area you need access to. It is roomy enough inside for even a ham-handed dweeb like myself to be comfortable working in. The floppy and the hard drive sit on top of the inner casing but under the removable top, where they don't take up any of the bays inside the casing. The CD-ROM looks lonely, as it is the only storage device inside the case of the evaluation unit.

My only complaint with the physical makeup and layout is that the power switch is placed too near the floppy and CD drive. A lot of finger-fumbling takes place in front of those devices while media is being inserted and removed and when you reach down to hit the eject button. It is inviting disaster (especially on a server) to put a protruding On/Off switch in a hot zone like that.

Do you think 32 meg of RAM might not be enough? You can grow it to 512 meg and add another 512 KB of burst-mode static RAM for a total of 1 meg of L2 cache. The mainboard also incorporates a pair of PCI IDE "controllers," two 16550A-compatible UARTs, and PS/2 style mouse and keyboard connectors.

The Intel SMP specification and the hardware won't do a thing for you unless your operating system is written to take advantage of them. The DFI box came with OS/2 2.11 SMP installed. It's also available with SCO Unix and Windows NT. I hope to be able to try some other SMP-enabled operating systems on this beast before returning it: NT, Linux, and Merlin. Actually, this column was delayed a month or two in hopes that the new version of SMP OS/2 would be available for review. No such luck, though it is due to go to beta test in the very near future.
I would have tried NT already had I not heard too many horror stories of NT insisting on having its way with the C partition. I don't have an install CD or diskettes for OS/2 2.11 SMP, so if it got trashed in the process of installing NT it would simply be lost. Ditto for the SMP version of Linux. I have installed Warp Server in a second partition, however, and it seems very happy on the Double Shot even though it sees only one of the Pentiums.

When you boot OS/2 2.11 SMP for the first time on the Double Shot you see a strange message as the operating system begins to load: "Processors initialized: 2." Other than that the load is normal, except for the fact that it completes much sooner than you've ever seen before. After it's loaded and you are face to face with the interface, it looks a little clunky and old-fashioned. Warp's pretty face is missing and that's the first thing you notice.

Looking closer, you notice a new icon in the OS/2 System Productivity folder called the SMP Monitor. Double-click on it and you get a moving graph of processor utilization: processor one in green and processor two in blue. That's fun to watch, but the real utility of the SMP Monitor is the ability to change the status of each processor from ONLINE to OFFLINE. This tool allowed me to easily test the boost SMP gave OS/2 in running applications.

Although servers are mentioned most often as likely candidates for SMP platforms, they may not be the best choice for the technology. Most servers are I/O constrained, not strapped for CPU cycles. Application or database servers are more likely to benefit from the added horsepower than the typical LAN file and print server. Recent tests by PC Week (see the April 1st issue for an article called "State of The Server") show that IBM's Warp Server on a single processor outperforms Windows NT running on four CPUs.
With the growing popularity of multitasking operating systems (even Windows users can print a document and actually do something else at the same time these days), the amount of time applications spend waiting for the CPU has grown accordingly. In the good old days, when DOS and Windows didn't really try to do multitasking, adding processors wouldn't have accomplished much. By extension, you can also conclude that if your current system is not spending much time waiting for the CPU then going to an SMP system won't be worth the expense.

Warp Server outperforms Windows NT as a file and print server when both are on uniprocessor boxes. Logic dictates that on a four-processor system NT performance would surpass that of OS/2 on the single-CPU box. But that's not the case according to recent PC Week Lab tests. Adding horsepower does not automatically give you better traction.

What types of applications would benefit the most from extra CPUs? Graphics, CAD/CAM, number-crunchers, compilers and games probably top the list, with database engines following closely. As the workload leans more towards I/O than CPU cycles, the gains to be had in an SMP environment decrease. Remember the example I gave earlier of NT on four processors and OS/2 on one? That's not so much an NT or SMP problem as a workload that's I/O constrained.

There is another answer to the question of which applications benefit most from SMP. It's the short answer. All of them, and the more the better. I'm talking about the type of multitasking that power users are doing these days, where you have a LAN connection, an online connection, and a word processor or editor going while you're listening to alternative rock on the CD and compiling C++ code in the background. Multitasking eats CPU cycles. SMP provides CPU cycles.

Applications don't have to be aware of multiple processors in order to benefit, but they can be written to be aware and to get extra performance boosts because of it.
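To put rough numbers on the point that SMP gains shrink as a workload leans toward I/O, here is a back-of-the-envelope sketch. It uses Amdahl's law, which is my choice of model and not something the vendors or PC Week supplied. It assumes the CPU-bound share of the work spreads perfectly across processors and the I/O-bound share does not speed up at all. The function name `smp_speedup` and the workload percentages are made up for illustration; they are not measurements from the Double Shot.

```python
def smp_speedup(cpu_fraction: float, n_cpus: int) -> float:
    """Estimate overall speedup from adding CPUs, per Amdahl's law.

    cpu_fraction: share of run time that is CPU-bound (0.0 to 1.0).
                  The remainder is treated as I/O wait that extra
                  processors cannot shorten (a simplifying assumption).
    n_cpus:       number of processors available.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0.0 and 1.0")
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    io_fraction = 1.0 - cpu_fraction
    return 1.0 / (io_fraction + cpu_fraction / n_cpus)


if __name__ == "__main__":
    # Illustrative guesses only, not benchmark results.
    workloads = [
        ("file and print server", 0.10),
        ("database engine", 0.60),
        ("compiler or renderer", 0.95),
    ]
    for name, cpu_fraction in workloads:
        for n_cpus in (2, 4):
            gain = smp_speedup(cpu_fraction, n_cpus)
            print(f"{name:24s} {n_cpus} CPUs: {gain:.2f}x")
```

Under those assumptions, a workload that is only 10 percent CPU-bound gains about 5 percent from a second processor, while one that is 95 percent CPU-bound gains about 90 percent. Real systems add cache and dispatching overhead on top of this, so treat the output as an upper bound.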
The application that stands out above all the rest at exploiting multiple processors is ColorWorks/2. It loves those extra horses more, and makes better use of them, than anything else around except perhaps D. Wayne Lukas. I was a little puzzled by the need to select the number of CPUs and frankly concerned that it would turn into a crap-shoot trying to find the optimum number. Not to worry. That choice is not there for tuning the application for greater power; that comes from its own smart-threading. It's there to tune down the system resources the program uses so that your other applications get a few cycles now and then too. ColorWorks is kick-ass.

Is there SMP in your future? Yes, probably, whether you see it there yet or not. Yes for several reasons. One, it's easier to double system power by adding another piece of current technology than it is to develop a new piece that's twice as powerful. Two, it's an Intel specification and the most important discriminator they have going for them at present as they try to keep the likes of Cyrix, IBM, and AMD at bay. Three, it is a natural evolution of computer systems, as we have already witnessed in the mainframe world.

The quad-Pentium Pro systems will be here in a few months and they will sizzle compared to those pokey little 200-MHz Pentium Pro boxes you might be drooling over today. SMP is going to become a larger and larger slice of the desktop world.