(This document is part of the PC-Clone Unix Hardware Buyer's Guide. The Guide is maintained by Eric S. Raymond; please email comments and corrections to him.)



Disk Wars: IDE vs. SCSI

Overview

Another basic decision is IDE vs. SCSI. Either kind of disk costs about the same, but the premium for a SCSI card varies all over the lot, partly because of price differences between VLB and PCI SCSI cards and especially because many motherboard vendors bundle an IDE chipset right on the system board. SCSI gives you better speed and throughput and loads the processor less, a win for larger disks and an especially significant consideration in a multi-user environment; also it's more expandable.

In terms of pure disk speed, IDE will always be faster, because both interfaces use the same underlying disks and IDE has less overhead. As fast as disks are getting today, the difference is effectively noise. The real advantage of SCSI comes from its extra brains. IDE uses polled I/O, which means that while you are accessing the disk, the CPU can't do anything else. Most SCSI systems, on the other hand, are DMA-based, freeing up the CPU to do other things at the same time. Hence, in terms of full system performance, SCSI is indeed faster if you have good hardware and an intelligent OS.

Another important win for SCSI is that it handles multiple devices much more efficiently. You can have at most two IDE devices; four for EIDE. SCSI permits up to 7 (15 for Wide SCSI).

If you have two IDE (or ST506 or ESDI) drives, only one can transfer between memory and disk at once. In fact, you have to program them at such a low level that one drive might actually be blocked from seeking while you're talking to the other drive. SCSI drives are mostly autonomous and can do everything at once; and current SCSI drives are not quite fast enough to flood more than half the SCSI bus bandwidth, so you can have at least two drives on a single bus pumping full speed without using it up. In reality, you don't keep drives running full speed all the time, so you should be able to have 3-4 drives on a bus before you really start feeling bandwidth crunch.
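
The bandwidth argument above can be put into rough numbers. The sketch below is only an illustration: the 10MB/sec bus rate, the 4.5MB/sec sustained rate per drive, and the 60% duty cycle are assumed figures picked to match the ``less than half the bus'' claim, not measurements of any real hardware.

```python
def bus_utilization(drives, drive_mb_s, bus_mb_s, duty_cycle=1.0):
    """Fraction of SCSI bus bandwidth consumed by `drives` identical disks.

    duty_cycle is the fraction of time each drive is actually transferring.
    """
    return drives * drive_mb_s * duty_cycle / bus_mb_s


# Assumed figures (not from any datasheet): a 10MB/sec bus and drives
# that sustain 4.5MB/sec, i.e. a bit under half the bus each.
BUS = 10.0
DRIVE = 4.5

for n in (1, 2, 3, 4):
    flat_out = bus_utilization(n, DRIVE, BUS)
    typical = bus_utilization(n, DRIVE, BUS, duty_cycle=0.6)
    print(f"{n} drive(s): {flat_out:.0%} flat out, {typical:.0%} at 60% duty")
```

Under these assumptions two drives running flat out use about 90% of the bus, and at a 60% duty cycle the bus saturates somewhere between three and four drives.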

Of course, IDE is cheaper. Many motherboards have IDE right on board now; if not, you'll pay maybe $15 for an IDE adapter board, as opposed to $200+ for the leading SCSI controller. Also, the cheap SCSI cabling most vendors ship can be flaky. You have to use expensive high-class cables for consistently good results. See Mark Sutton's horror story.

Enhanced IDE

Some souped-up IDE variants have recently hit the streets. These are "Enhanced IDE" (E-IDE) and "Fast AT Attachment" (usually ATA for short). ATA is Seagate's subset of E-IDE, excluding some features designed to permit chaining with CD-ROMs and tape drives using the new "ATAPI" interface (an E-IDE extension; so far only the CD-ROMs exist); in practice, ATA and E-IDE are identical.

You'll need to be careful about chaining in CD-ROMs and tape drives when using IDE/ATA. The IDE bus sends all commands to all disks; they're supposed to latch, and each drive then checks to see whether it is the intended target. The problem is that badly-written drivers for CD-ROMs and tapes can collide with the disk command set. It takes expertise to match these peripherals.

Neither ATA nor E-IDE has the sustained throughput capacity of SCSI (they're not designed to) but they are 60-90% faster than plain old IDE. E-IDE's new ``mode 3'' boosts the IDE transfer rate from IDE's 3.3MB/sec to 13.3MB/sec. The new interface supports up to 4 drives of up to 8.4 gigabytes capacity.

E-IDE and ATA are advertised as being completely compatible with old IDE. Theoretically, you can mix IDE, E-IDE and ATA drives and controllers any way you like, and the worst result you'll get is conventional IDE performance if the enhancements don't match up (the controller picks the lowest latch speed). In practice, some IDE controllers (notably the BusLogic) choke on enhanced IDE.

Accordingly, I recommend against trying to mix device types on an E-IDE/ATA bus. Unfortunately, this removes much of E-IDE/ATA's usefulness!

E-IDE on drives above 540MB does automatic block mapping to fool the BIOS about the drive geometry (avoiding limits in the BIOS type tables). They don't require special Unix drivers.
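
Some arithmetic shows where the geometry limits come from. The sketch below assumes the commonly cited figures (1024 cylinders and 63 sectors per track from the BIOS interface, 16 heads from the IDE register set, 255 heads once translation is in use) and 512-byte sectors; none of these numbers come from the text above. The `chs_to_lba` and `lba_to_chs` helpers are illustrative names for the standard mapping between a cylinder/head/sector triple and a linear block number, which is the kind of remapping a translating drive performs.

```python
SECTOR_BYTES = 512  # assumed sector size


def chs_capacity(cylinders, heads, sectors):
    """Capacity in bytes of a drive with the given CHS geometry."""
    return cylinders * heads * sectors * SECTOR_BYTES


def chs_to_lba(c, h, s, heads, sectors):
    """Map a (cylinder, head, sector) triple to a linear block number.

    Sectors are numbered from 1; cylinders and heads from 0.
    """
    return (c * heads + h) * sectors + (s - 1)


def lba_to_chs(lba, heads, sectors):
    """Inverse of chs_to_lba for the same logical geometry."""
    c, rest = divmod(lba, heads * sectors)
    h, s0 = divmod(rest, sectors)
    return c, h, s0 + 1


# Assumed limits: 1024 cylinders and 63 sectors from the BIOS interface,
# 16 heads from the IDE register set, 255 heads with translation.
plain = chs_capacity(1024, 16, 63)
translated = chs_capacity(1024, 255, 63)
print(f"untranslated limit: {plain / 1e6:.0f} MB")
print(f"translated limit:   {translated / 1e9:.1f} GB")
```

With these assumed limits the untranslated ceiling comes out near 528MB, close to the 540MB threshold mentioned above, and the translated ceiling near the 8.4 gigabyte capacity quoted for the new interface.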

Many motherboards now support ``dual EIDE'' channels, i.e. two separate [E]IDE interfaces each of which can, theoretically, support two IDE disks or ATA-style devices.

SCSI Terminology

The following, by Ashok Singhal of Sun Microsystems with additions by your humble editor, is a valiant attempt to demystify SCSI terminology.

The terms ``SCSI'', ``SCSI-2'', and ``SCSI-3'' refer to three different specifications. Each specification has a number of options. Many of these options are independent of each other. I like to think of the main options (there are others that I'll skip over because I don't know enough about them to talk about them on the net) by classifying them into five categories:

  1. Logical: SCSI-1, SCSI-2, SCSI-3

    This refers to the commands that the controllers understand. Shortly after SCSI first came out, the vendors agreed on a spec for a common command set called CCS. CCS was made a required part of the SCSI-2 standard. You should be able to use a SCSI disk with a SCSI-2 card and vice-versa as long as they both support CCS. Non-CCS SCSI devices aren't worth considering.

    ``SCSI-3'' is a superset of SCSI-2 including commands intended for CD-R and streaming multimedia devices.

  2. Electrical Interface: single-ended or differential

    This option is independent of command set, speed, and path width. Differential is less common but allows better noise immunity and longer cables. It's rare in SCSI-1 controllers.

    For a PC you will probably always see single-ended SCSI controllers but if you're shopping around for disks you might run across differential disks. They will likely be more expensive than single-ended ones and will not work on your single-ended bus.

  3. Handshake: asynchronous or synchronous

    Synchronous is faster. This mode is negotiated between controller and device; modes may be mixed on the same bus. This is independent of command set, data width, and electrical interface.

  4. Synchronous Speed (does not apply for asynchronous option)

    Normal transfer speed is 5MB/sec. The ``fast'' option (10MB/sec) is defined only in SCSI-2 and SCSI-3. Fast-20 (or ``Ultra'') is 20MB/sec; Fast-40 (or ``Ultra-2'') is 40MB/sec. The fast options basically define shorter timing parameters such as the assertion period and hold time.

    The parameters of the synchronous transfer are negotiated between each target and initiator so different speed transfers can occur over the same bus.

  5. Path width

    The standard SCSI data path is 8 bits wide. The ``wide'' option exploits a 16- or 32-bit data path (uses 68-pin rather than 50-pin data cables). You also get 4-bit rather than 3-bit device IDs, so you can have up to 16 devices. The wide option doubles or quadruples your transfer rate, so for example a fast-20/wide SCSI link using 16 bits transfers 40MB/sec.
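
Because speed and path width are independent options, peak bus bandwidth is just their product. The following is a minimal sketch of that arithmetic, using the speed figures listed above; the function names are made up for illustration, and treating each speed as a per-byte-lane rate is my framing rather than the spec's wording. The table is purely arithmetic and does not claim that every combination exists as a product.

```python
# Synchronous speed options from the list above, in MB/sec on an 8-bit path.
SYNC_RATES = {"normal": 5, "fast": 10, "fast-20": 20, "fast-40": 40}


def scsi_bandwidth(speed, width_bits=8):
    """Peak bus bandwidth in MB/sec for a speed option and path width."""
    if width_bits not in (8, 16, 32):
        raise ValueError("path width must be 8, 16, or 32 bits")
    return SYNC_RATES[speed] * width_bits // 8


def max_devices(width_bits=8):
    """Number of device IDs on the bus: 3-bit IDs narrow, 4-bit IDs wide."""
    return 8 if width_bits == 8 else 16


for speed in SYNC_RATES:
    narrow = scsi_bandwidth(speed)
    wide = scsi_bandwidth(speed, 16)
    print(f"{speed:8s} narrow {narrow:3d} MB/sec   wide-16 {wide:3d} MB/sec")
```

The fast-20 row reproduces the 40MB/sec figure given for a 16-bit fast-20/wide link.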

What are those ``LUN'' numbers you see when you boot up? Think of them as sub-addresses on the SCSI bus. Most SCSI devices have only one ``logical'' device inside them, thus they're LUN zero. Some SCSI devices can, however, present more than one separate logical unit to the bus master, with different LUNs (0 through 7). The only context in which you'll normally use LUNs is with CD-ROM jukeboxes. Some have been marketed that offer up to 7 CD-ROMs with one read head. These use the LUN to differentiate which disc to select.

(There's history behind this. Back in the days of EISA, drives were supposed to be under the control of a separate SCSI controller, which could handle up to 7 such devices (15 for wide SCSI). These drives were to be the Logical Units; hence the LUN, or Logical Unit Number. Then, up to 7 of these SCSI controllers would be run by the controller that we today consider the SCSI controller. In practice, hardware cost dropped so rapidly, and capability increased so rapidly, it became more logical to embed the controller on the drive.)

Here are a couple of rules and heuristics to follow:

Rule 1: Total SCSI cable length (both external and internal devices) must not exceed six meters. For modern Ultra SCSI (with its higher speed) cut that to three feet!

It's probably not a good idea to cable 20MB/s or faster SCSI devices externally at all. If you must, one of our informants advises using a Granite Digital ``perfect impedance'' teflon cable (or equivalent); these cables basically provide a near-perfect electrical environment for a decent price, and can be ordered in custom configurations if needed.

A common error is to forget the length of the ribbon cable used for internal devices when adding external ones (that is, devices chained to the SCSI board's external connector).
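
Rule 1 is easy to check mechanically once you write down every segment. Here is a minimal sketch, assuming you measure each cable yourself; the segment lengths in the example are hypothetical, and the two limits are simply the figures quoted in Rule 1 (six meters, or three feet for Ultra).

```python
FEET_TO_M = 0.3048

# Limits quoted in Rule 1 above.
MAX_LENGTH_M = {"standard": 6.0, "ultra": 3 * FEET_TO_M}


def cable_budget(segments_m, bus="standard"):
    """Return (total_length, remaining) in meters for a SCSI chain.

    segments_m must list every cable segment, internal ribbon included.
    A negative remainder means the chain is over the limit.
    """
    total = sum(segments_m)
    return total, MAX_LENGTH_M[bus] - total


# Hypothetical chain: 0.6 m internal ribbon plus two external cables.
total, left = cable_budget([0.6, 0.9, 0.9])
print(f"total {total:.2f} m, {left:.2f} m to spare")
```

Listing the internal ribbon as its own segment is the point of the exercise: it is the piece people forget.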

Rule 2: Both ends of the bus have to be electrically terminated.

On older devices this is done with removable resistor packs --- typically 8-pin-inline widgets, yellow or blue, that are plugged into a plastic connector somewhere near the edge of the PCB on your device. Peripherals commonly come with resistor packs plugged in; you must remove the packs on all devices except the two end ones in the physical chain.

Newer devices advertised as having "internal termination" have a jumper or switch on the PCB that enables termination. These devices are preferable, because the resistor packs are easy to lose or damage.
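
Rule 2 amounts to a simple check over the physical chain: the two end devices must be terminated and nothing in between may be. The sketch below assumes you describe the chain yourself as a list in cable order (host adapter included); the device names and the `termination_errors` helper are hypothetical.

```python
def termination_errors(chain):
    """Check a physical SCSI chain given as (name, terminated) pairs.

    The list is in physical cable order and includes the host adapter.
    Returns a list of human-readable problems; empty means the chain is OK.
    """
    problems = []
    last = len(chain) - 1
    for pos, (name, terminated) in enumerate(chain):
        at_end = pos in (0, last)
        if at_end and not terminated:
            problems.append(f"{name}: end of chain but not terminated")
        elif not at_end and terminated:
            problems.append(f"{name}: mid-chain device must not be terminated")
    return problems


# Hypothetical chain: an internal disk, the host adapter in the middle,
# and two external devices.
chain = [
    ("internal disk", True),
    ("host adapter", False),
    ("external disk", True),   # resistor packs left in by mistake
    ("scanner", True),
]
for problem in termination_errors(chain):
    print(problem)
```

Note that the host adapter is only an end device when all your peripherals hang off one side of it; with both internal and external devices it sits mid-chain and its own termination must be off.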

Rule 3: No more than seven devices per chain (fifteen for Wide SCSI).

There are eight SCSI IDs per controller (sixteen for Wide SCSI). The controller reserves ID 7 or 15, so your devices can use IDs 0 through 6 (or 0 through 14, wide). No two devices can share an ID; if this happens by accident, neither will work.

The conventional ID assignments are: Primary hard disk = ID 0, Secondary hard disk = ID 1, Tape = ID 2. Some Unixes (notably SCO) have these wired in. You select a device's ID with jumpers on the PCB or a thumbwheel.

SCSI IDs are completely independent of physical device chain position.
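
The ID rules can be checked the same way before you power anything up. This is a minimal sketch that takes the text above at its word (the controller holds the highest ID, 7 narrow or 15 wide); the `check_scsi_ids` helper and the scanner in the example are hypothetical.

```python
from collections import Counter


def check_scsi_ids(assignments, wide=False):
    """Validate a {device_name: scsi_id} mapping for one controller.

    Assumes the controller takes the highest ID (7 narrow, 15 wide),
    as described above. Returns a list of problems; empty means OK.
    """
    host_id = 15 if wide else 7
    problems = []
    for name, scsi_id in assignments.items():
        if not 0 <= scsi_id <= host_id:
            problems.append(f"{name}: ID {scsi_id} out of range 0-{host_id}")
        elif scsi_id == host_id:
            problems.append(f"{name}: ID {scsi_id} is reserved for the controller")
    counts = Counter(assignments.values())
    for scsi_id, n in sorted(counts.items()):
        if n > 1:
            names = sorted(k for k, v in assignments.items() if v == scsi_id)
            problems.append(f"ID {scsi_id} shared by {', '.join(names)}")
    return problems


# Conventional layout from the text, plus a hypothetical scanner that
# collides with the tape drive.
devices = {"primary disk": 0, "secondary disk": 1, "tape": 2, "scanner": 2}
print(check_scsi_ids(devices))
```

Since IDs are independent of chain position, this check needs no knowledge of the cabling order.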

Heuristic 1: Stick with controllers and devices that use the Centronics-style 50-pin connector. Internally these connectors are physically identical to diskette cables. Externally they use a D50 shell. This "standard" connector is common in the desktop/tower/rackmount-PC world, but you'll find lots of funky DIN and mini-DIN plugs on devices designed for Macintosh boxes and some laptops. Ask in advance and don't get burned.

Heuristic 2: For now, when buying a controller, go with an Adaptec xx42 or one of its clones such as the BusLogic 542. (I like the BusLogic 946 and 956, two particularly fast Adaptec clones well-supported under Linux.) The Adaptec is the card everybody supports and the de-facto standard. Occasional integration problems have been reported with Unix under Future Domain and UltraStor cards, apparently due to command-set incompatibilities. At least, before you buy these, make sure your OS explicitly supports them.

However: Beware the combination of an Adaptec 1542 with a PCI Mach32 video card. Older (1.1) Linux kernels handled it OK, but all current ones choke. Your editor had to replace his 1542 because of this, swearing sulphurously the while.

Heuristic 3: You'll have fewer hassles if all your cables are made by the same outfit. (This is due to impedance reflections from minor mismatches. You can get situations where cable A will work with B, cable B will work with C, but A and C aren't happy together. It's also non-commutative. The fact that `computer to A to B' works doesn't mean that `computer to B to A' will work.)

Heuristic 4: Beware Cheap SCSI Cables!

Mark Sutton tells the following instructive horror story in a note dated 5 Apr 1997:

I recently added an additional SCSI hard drive to my home machine. I bought an OEM packaged Quantum Fireball 2 gig SCSI drive (meaning, I bought a drive in shrinkwrap, without so much as mounting hardware or a manual. Thank God for Quantum's web page or I would have had no idea how to disable termination or set the SCSI ID on this sucker. Anyway, I digress...). I stuck the drive in an external mounting kit that I found in a pile of discarded computer parts at work and that my boss said I could have. (All 5 of my internal bays were full of devices.)

Anyway, I had my drive, and my external SCSI mounting kit, I needed a cable.

I went into my friendly local CompUSA in search of a SCSI cable, and side-by-side, on two hooks, were two "identical" SCSI cables. Both were 3 feet. Both had centronics to centronics connectors, both were made by the same manufacturer. They had slightly different model numbers. One was $16.00, one was $30.00. Of course, I bought the $16 cable.

Bad, I say, BAD BAD MISTAKE. I hooked this sucker up like so:

 ----------  ---------   -------------   ---------
 |Internal|--|Adaptec|---|New Quantum|---|UMAX   |
 |Devices |  |1542CF | ^ |  Disk     | ^ |Scanner|
 ----------  --------- | ------------- | ---------
                       |               |
                   New $16 cable   Cable that came
                                     with scanner.

Shortly after booting, I found that data all over my old internal hard drive was being hosed. This was happening in DOS as well as in Linux. Any disk access on either disk was hosing data on both disks, attempts to scan were resulting in corrupted scans *and* hosing files on the hard disks. By the time I finished swapping cables around, and checking terminations and settings, I had to restore both Linux and DOS from backups.

I went back to CompUSA, exchanged the $16 cable for the $30 one, hooked it up and had no more problems.

I carefully examined the cables and discovered that the $30 cable contained 24 individual twisted pairs. Each data line was twisted with a ground line. The $16 cable was 24 data wires with one overall grounded shield. Yet, both of these cables (from the same manufacturer) were being sold as SCSI cables!

You get what you pay for.

(Another correspondent guesses that the cheap cable probably said ``Macintosh'' on it. The Mac connector is missing most of its ground pins.)

Trends to Watch For

Disks of less than 2GB capacity simply aren't being manufactured anymore; there's no margin in them. Our spies tell us that all major disk makers retooled their lines a while back to produce 540MB unit platters, which are simply being stacked 2N per spindle to produce ranges of drives with roughly 1GB increments of capacity. The highest reasonably-priced drives are still 9GB (16 platters per drive), but you can get 23GB or even 45GB capacities (these are probably packing 2.4GB per platter).

Average drive latency is inversely proportional to the disk's rotational speed. For years, most disks spun at 3600 rpm; most high-performance disks now spin at 7,200 rpm, and high-end disks like the Seagate Cheetah line are moving to 10,000 rpm. These fast-spin disks run extremely hot; expect cooling to become a critical constraint in drive design.
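
To put numbers on the latency claim, here is a minimal sketch assuming the usual model in which the wanted sector is, on average, half a revolution away from the head. It covers rotational latency only; seek time is a separate matter.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_rev = 60_000 / rpm
    return ms_per_rev / 2


for rpm in (3600, 7200, 10_000):
    print(f"{rpm:6d} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Doubling the spin rate from 3600 to 7,200 rpm halves the average wait, from roughly 8.3 ms to roughly 4.2 ms; 10,000 rpm brings it down to 3 ms.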

Drive densities have reached the point at which standard inductive read/write heads are becoming a bottleneck. In newer designs, expect to start seeing magnetoresistive head assemblies with separate read and write elements.

More Resources

There's a USENET SCSI FAQ. Also see the home page of the T10 committee that writes SCSI standards.

There is a large searchable database of hard disk and controller information at the PC DISK Hardware Database.


Eric S. Raymond