Surfing the Web and Multi-Threaded I/O

This document describes the hard disk I/O activity generated by surfing the web and explains why a low-cost but high-performance SCSI adapter improves response time while surfing. The reader will learn why even desktop users who surf the internet can benefit from a SCSI device. AdvanSys makes the SCSI adapter affordable for everyone.

The I/O Activities in a PC

After powering on a PC running Windows 95/98 or NT, one literally waits several minutes listening to the hard disk churn away before the PC finally becomes ready. This is because the operating system must load itself from the hard disk and then go through the Startup Menu to start additional tasks from the hard disk(s). As a result, many tasks are started automatically whenever the PC is turned on. By pressing CTRL-ALT-DEL, one can easily see all the tasks Windows 95 or 98 has already started, even before the first mouse click starts anything else.

Most novice users and even some computer professionals still argue that desktop PC users do not have a multitasking environment. Nothing could be further from the truth. A video capture card watches the incoming video or sound data continuously, even if the user has not started any video capture task. When a user installs a video capture card, his Startup Menu is modified to start the watch program for video and sound data. Most users are also unaware that the operating system must wake itself up occasionally to check whether a new CD has been inserted in the CD-ROM drive. That check is yet another task not started by the user.

Each of those outstanding tasks goes to a disk to save and retrieve data. After all, the hard disk is where the data reside, and the tasks running in a PC are designed to process data. Going to the hard disk slows the PC down.
This is why all operating systems support a large memory cache that keeps the "most recently used" data so they can be retrieved from memory instead of from disk. Everyone understands that retrieving data from memory is much faster than retrieving it from disk. The cached data must be flushed back to the hard disk before the PC is turned off, which is why one cannot turn off the PC without first shutting down Windows 95 or 98 properly. This is also why buying more memory is still the best investment for making the PC run faster.

However, having more memory does not solve every I/O problem. Between the cache and the outstanding tasks, there is simply not enough memory to hold every task and its data. In addition to the memory cache, all operating systems need a swap file on the hard disk to swap inactive tasks or "least recently used" data in and out. When a user is typing away in his word processor, he can hear disk drive noise in the background even if he does not save a file. That noise is swap file activity. The swap file is the most frequently visited place on the hard disk while the PC is on. Having a fast disk is the second best investment after buying more memory. As we shall see later, having another disk is also a very good investment for improving the PC's performance.

Multitasking For a Desktop User

After turning on the PC, let's double-click the dial-up network icon to connect the modem to an Internet Service Provider (ISP), then double-click the Netscape Communicator icon to start surfing. Of course, the first thing we do is check our email by starting Netscape Messenger. One of the incoming emails has a Word file attached. By clicking on the file, we have now started Word 97. Assuming we want to research a topic on the web, we start Netscape Composer for note taking. Next, to decide where to put the file, we start Windows Explorer (the old File Manager of Windows 3.1) to browse the hard disk.
Suddenly, in addition to all the tasks started by Windows and the Startup Menu, we have started the following tasks:

1. Dial-up-network for modem
2. Netscape Communicator
3. Netscape Messenger
4. Word 97
5. Netscape Composer
6. Windows Explorer

Now I have proven that nobody runs in a single-tasking environment on a desktop PC anymore.

Some computer professionals still argue that a desktop user only interacts with one window at a time. Again, let's prove this is untrue. Of course, most engineers and programmers are very much used to running a simulation or compilation in one window, email or an editor in a second window, and a video game in a third window, especially when the boss is not watching.

To research a topic, we have a list of sites to visit from our web search engine. A DVD FAQ with a size of 200K bytes looks very interesting, so we click on the link. With our 28.8K modem and a good day on the internet, the average data rate for getting the FAQ is 3K bytes per second. The 200K file takes about 70 seconds to retrieve. This is a long time for a user with an active mind. On a bad day on the internet (a very busy ISP or telephone backbone), the average data rate can drop to 1K bytes per second or less. The time to retrieve this page is now 200 seconds or more. If we don't wish to stare at the monitor and wait, there are two choices: click the stop icon and forget this large file, or switch to another window and do something else.

Well, the FAQ is important for the research, so we move the mouse to Netscape Composer to take down notes or do some cut-and-paste. We can also go to Netscape Messenger to see if there is any new email. For those of us receiving over 100 emails a day, going to Netscape Messenger often becomes second nature. Long waits for large files, even on a good day on the internet, are especially common today, when many sites wish to impress us with fancy graphics, video, and audio data.
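The waiting times quoted above follow from simple division. As a rough sketch (the function name is ours, and the rates are the sustained averages assumed in the text, not measured values):

```python
def download_seconds(size_kbytes: float, rate_kbytes_per_s: float) -> float:
    """Time to fetch a file of the given size at a sustained average data rate."""
    if rate_kbytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_kbytes / rate_kbytes_per_s


FAQ_SIZE_KB = 200  # size of the DVD FAQ used in the text

good_day = download_seconds(FAQ_SIZE_KB, 3)  # 3K bytes/s on a good day
bad_day = download_seconds(FAQ_SIZE_KB, 1)   # 1K bytes/s on a bad day

print(f"good day: {good_day:.0f} s, bad day: {bad_day:.0f} s")
```

On a good day the result is a little under 70 seconds; any rate below 1K bytes per second pushes the wait past 200 seconds.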
No wonder everyone who can afford it wants a 56K modem or ISDN. Instead of staring at the monitor and waiting, most of us with an active mind will move to another window. This is especially true when we are doing research by surfing the internet. The wait is the best time to cut and paste web page links and write down our thoughts. It is not unusual to start a print job or scan in a picture as part of the research report.

For those who argue that ordinary folks don't multitask and therefore don't need a faster computer, the argument is self-fulfilling. People can't multitask if their PCs won't run fast enough in a second or third window. Now I have proven my case that ordinary people can multitask if they have a multitasking PC. When we move from the web browser to another window, the response time of the mouse and of the other window is just as important as the speed of retrieving data from the internet. Now we are ready to discuss why most PCs today are not designed for multitasking and how we can fix that with a SCSI adapter.

Data Bandwidth Problem on a PC

Until now, the PC industry has always brought out faster CPUs to make programs run faster. Everyone understands that workstations and mainframes have better I/O technology because they support multitasking. Why do PCs need more I/O technology? How do current PCs handle more I/O activity?

Like Windows 95/98 or NT, the web browser has its own memory and hard disk caching function. It only makes sense that, after spending 70 seconds retrieving the 200K bytes of the DVD FAQ page, the browser saves the page in memory in case the user goes back to it. Of course, when memory runs out, the web browser caches the data on disk, much as the operating system uses its swap file. It is much faster to retrieve the same page from a hard disk than from the internet again. By the way, the browser tracks many other things on the disk too.
Like the swap file, the disk cache file is another very busy place on the PC. For those of us who move to Netscape Composer while the browser retrieves a page, there is a third I/O activity: Composer needs to write to disk to save our research data. If we print a web page, the data is first copied to a spool file. The Print Manager then reads the data back from the spool file and sends it to the printer. If we start a scanner, the image data is written to a file too. Suddenly, a novice user has not only started many tasks unknown to him, he also has the following I/O activities:

1. OS swap activities
2. Dial-up modem activities
3. Web browser cache activities
4. Word processor save activities
5. Printer spool file or scanner save activities

Most PCs today come with one EIDE hard disk and one CD-ROM drive, the latter connected to a separate ATA connector. The biggest problem with an EIDE connection is that it performs only one I/O request at a time. Therefore, we put the hard disk and the CD-ROM drive on separate connections so they don't interfere with each other. We can't afford to have the hard disk sit idle while the CD-ROM drive spends several hundred milliseconds moving its access arm. Using two EIDE connectors is a poor man's way of getting minimal concurrent I/O without a multitasking SCSI adapter.

There is another big problem. While the Ultra DMA EIDE interface can in theory transmit 33 megabytes per second, most EIDE disk drives today can sustain at most 14 megabytes per second continuously, which is the maximum speed of data coming off the fastest hard disk media. In fact, the biggest lie of the PC industry is trying to make novice users believe that an EIDE hard disk can transmit 33 megabytes of data per second. To add insult to injury, when there is a lot of arm movement on the disk drive, the average data rate from the drive slows down to just one to two megabytes per second. This is exactly what happens when we have many I/O activities on a single disk.
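A back-of-the-envelope model shows how arm movement can pull a 14 megabyte per second drive down into the one to two megabyte per second range. This is only a sketch: the 16 KB request size and the 10 ms positioning delay (seek plus rotational latency) are illustrative assumptions of ours, not figures from any particular drive.

```python
def effective_rate_mb_s(request_kb: float, media_rate_mb_s: float,
                        positioning_ms: float) -> float:
    """Average data rate when every request pays a positioning delay first.

    request_kb      -- data moved per request (assumed value)
    media_rate_mb_s -- sustained rate off the media once the arm is in place
    positioning_ms  -- seek plus rotational delay per request (assumed value)
    """
    request_mb = request_kb / 1024
    transfer_s = request_mb / media_rate_mb_s
    positioning_s = positioning_ms / 1000
    return request_mb / (positioning_s + transfer_s)


MEDIA_RATE = 14.0  # MB/s, the sustained figure quoted in the text

# One task streaming sequentially: the arm never has to move.
sequential = effective_rate_mb_s(16, MEDIA_RATE, positioning_ms=0)

# Several tasks sharing the disk: the arm moves before every request.
interleaved = effective_rate_mb_s(16, MEDIA_RATE, positioning_ms=10)

print(f"sequential:  {sequential:.1f} MB/s")
print(f"interleaved: {interleaved:.1f} MB/s")
```

Under these assumptions the arm spends roughly nine times longer moving than transferring, so the effective rate falls to about a tenth of the media rate.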
If one masters a CD by reading continuously from a hard disk and writing to a CD-R, he adds two more streams of I/O activity to his computer. The EIDE disks on one cable can only do one disk I/O task at a time. In this configuration the system swap, the Netscape cache, and the CD mastering disk I/O all share the same hard disk. Everything is done in serial. No wonder all CD-R manufacturers carefully ask you to shut down all other tasks while mastering a CD, to ensure the system disk does not become too busy. If it becomes too busy, there is a good chance of underrunning the write to the CD-R, which is a very nasty thing.

The second biggest lie of the PC industry is to make users believe that an EIDE Ultra DMA drive actually performs bus master DMA, which would free the CPU to run the CD mastering software while data is read from the hard disk and written to the CD-R. The reality is that most PCs with Ultra DMA drives run in programmed I/O (PIO) mode, i.e. while reading from the hard disk and writing to the CD-R, the CPU moves the device data itself and is unavailable to run the CD mastering software. The PC manufacturer needs to install the bus master driver to enable DMA. It is not installed because of compatibility issues, i.e. not all EIDE devices run correctly under DMA.

Without bus master DMA, and with all I/O activities done in serial, most PCs on the market today are not designed to move disk drive data efficiently. Users are forced to accept slower performance when surfing the internet or mastering a CD.

Affordable SCSI Solution

There is really no debate about whether SCSI is better than EIDE, because all high-end workstations and file servers use SCSI. The only question is whether desktop users who like to surf the internet and work with several windows should use SCSI devices.
The answer is a definite yes, because a SCSI adapter allows multiple devices to run concurrently, and those who surf the internet need more than one disk drive. For the same reason that file servers and high-end workstations have multiple disk devices, a desktop user needs more than one disk drive to allow fast access to the swap file, the Netscape page cache file, and the printer spool file. The goal is to avoid the arm-stealing effect of a disk drive. Say the system swap file is on the inside of a disk and the Netscape page cache file is on the outside. When both files are accessed at the same time, the disk drive must move its arm back and forth between the inside and the outside of the disk. When the files are on different disk drives, the arm of each drive stays in the same place and no time is wasted on arm stealing. If we have three active tasks, we keep their files on three separate disks. Anyone who surfs the internet and opens multiple windows needs multiple disks to get optimum performance, putting the swap file, the Netscape cache file, and the printer spool file on separate disks.

We will complete this paper by examining the reasons, beyond the affordable price from AdvanSys, why SCSI is the better solution:

1. Multiple tasks and concurrent access on multiple SCSI devices
2. Bus master to free up the CPU for other tasks
3. Easy installation

When a hard disk and a CD-ROM drive share a SCSI connection, the hard disk is not blocked by the slow arm movement of the CD-ROM drive. This is because the CD-ROM drive stays off the SCSI bus while it is moving its arm. However many disks or CD devices are attached, the SCSI bus is busy only when a device is ready to transfer data. The average data transfer rate from a hard disk is only a few megabytes per second. You only need an Ultra SCSI adapter of 20 megabytes per second to connect two or three disk drives.
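The sizing argument can be checked with a simple bandwidth budget. The sketch below assumes an average of 3 megabytes per second per drive, which is only our reading of "a few megabytes per second", and it ignores SCSI protocol overhead.

```python
def bus_utilization(drive_rates_mb_s: list[float], bus_rate_mb_s: float) -> float:
    """Fraction of the bus bandwidth used if every drive transfers at its average rate."""
    if bus_rate_mb_s <= 0:
        raise ValueError("bus rate must be positive")
    return sum(drive_rates_mb_s) / bus_rate_mb_s


ULTRA_SCSI_MB_S = 20.0  # Ultra SCSI bus rate quoted in the text

# Three drives, each averaging an assumed 3 MB/s.
drives = [3.0, 3.0, 3.0]
utilization = bus_utilization(drives, ULTRA_SCSI_MB_S)

print(f"bus utilization: {utilization:.0%}")
```

With these assumed rates the three drives together use less than half of the bus, which is why a 20 megabyte per second adapter is enough for this kind of workload.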
When an instantaneous disk data rate of 15 megabytes per second is important to your video data application, an Ultra Wide SCSI adapter of 40 megabytes per second becomes necessary.

The bus master function of a SCSI adapter frees up the CPU to perform other functions such as decoding MPEG data, capturing JPEG data, or mastering a CD. Even with a very low-cost SCSI adapter from AdvanSys, only 8% of CPU cycles are needed to copy a large file, whereas 90% of CPU cycles are needed to copy a file from an Ultra DMA EIDE hard disk. Just remember that most PCs do not have the bus master driver for Ultra DMA EIDE disks. When the CPU is very busy, moving the mouse from one window to another becomes very sluggish, not to mention data underruns when writing to the CD-R device, scratchy noises from the sound card, or skipped images during video playback.

Last but not least, installing a SCSI device is much easier than installing an EIDE device or a parallel port device. Each EIDE cable connects up to two devices: one master and one slave. Some EIDE devices require a cable select (CS) jumper. One must turn off the computer to install an EIDE device. Most users are pleasantly surprised that they can connect a ZIP device to the external connector of a SCSI adapter and simply "refresh" Windows 95 or 98. Magically, the ZIP device becomes visible to the system. A parallel port device requires one to daisy-chain the connection, load another driver for the same parallel port, and restart the computer.

AdvanSys provides SuperSCSI, a one-step install program that makes the initial installation of the adapter a matter of just plugging in the adapter and connecting the device. The user does not have to answer any questions. Using SuperSCSI, one OEM customer reported a service call rate of only 1% and a retail channel return rate of only 3%, far below the 8% retail return rate of a similar EIDE device.