RAID Tutorial

What is RAID?

RAID stands for "Redundant Array of Inexpensive Disks." The concept was first described in the paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by Patterson, Gibson and Katz at the University of California, Berkeley in 1987. Funded by Digital Equipment Corporation (the creators of OpenVMS), the research matured into an industry standard for applications requiring fast, reliable storage of large volumes of data.

Benefits of RAID

The basic idea of RAID was to combine multiple small, inexpensive disk drives into an array of disk drives to provide speed, reliability, and increased storage capacity. This array of drives appears to the computer as a single logical storage unit or drive. In general, using RAID provides the following benefits:

  • Data redundancy: Helps protect critical data from hard drive failures (not applicable to RAID-0).
  • Fault tolerance: Lets the storage system keep running when a drive fails (not applicable to RAID-0).
  • Increased capacity: Combines multiple drives into a single, larger volume.
  • Increased performance: The gain depends on the RAID level used. For applications that need raw speed, RAID is the way to go.

Types of RAID

There are six standard array architectures for RAID (levels RAID-0 through RAID-5); the list below also covers RAID 10 and several additional modes. How efficiently the total drive storage is used, and which benefits apply, depend on the level in use. A brief description of each is given below:

  1. RAID-0 or "Striping"

    Striping offers high I/O rates since read/write operations may be performed simultaneously on multiple drives. Data is split and written across drives one segment at a time, resulting in higher data throughput.

    RAID-0 is the fastest and most efficient array type since no redundant information is stored. However, it offers no fault tolerance: the failure of any disk in the array results in data loss.
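
    The way striping spreads segments across drives can be sketched as a simple address calculation. This is a minimal illustration, not the layout of any particular controller; the function name raid0_locate and the blocks_per_segment parameter are hypothetical.

```python
def raid0_locate(logical_block, num_drives, blocks_per_segment=1):
    """Map a logical block number to (drive index, block offset on that drive)."""
    segment = logical_block // blocks_per_segment   # which segment holds the block
    within = logical_block % blocks_per_segment     # position inside that segment
    drive = segment % num_drives                    # segments rotate across the drives
    stripe_row = segment // num_drives              # complete rows written before this one
    return drive, stripe_row * blocks_per_segment + within

# Eight consecutive logical blocks on a hypothetical 4-drive array.
layout = [raid0_locate(block, num_drives=4) for block in range(8)]
print(layout)
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
```

    Consecutive blocks land on different drives, which is why a large transfer can keep every drive busy at once.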

  2. RAID-1 or "Mirroring"

    RAID-1 is a good entry-level redundant system since only two drives are required. The architecture provides redundancy by writing all data to two or more drives simultaneously. If one drive fails, data can still be retrieved from the other member of the RAID set. This is an optimal choice for performance-critical, fault-tolerant environments and the only choice for fault tolerance if no more than two drives are desired.

    RAID-1 is the most expensive RAID option since one drive is used to store a duplicate of the data. Because storage requirements double, the cost per megabyte is high. On the other hand, RAID-1 offers very high reliability: if either drive fails, no data is lost. Compared to a single drive, reads are faster and writes are slower.
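
    The mirroring behavior described above can be modeled in a few lines. This is a toy in-memory sketch under the assumption that each drive is just a dictionary of blocks; the class MirrorSet and its methods are hypothetical names, not part of any RAID software.

```python
class MirrorSet:
    """Toy model of a RAID-1 set: every write goes to all working members."""

    def __init__(self, num_drives=2):
        self.drives = [dict() for _ in range(num_drives)]  # block number -> data
        self.failed = set()

    def write(self, block, data):
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data          # duplicate the data on each member

    def fail(self, index):
        self.failed.add(index)
        self.drives[index].clear()           # contents of a failed drive are gone

    def read(self, block):
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]          # any surviving member can serve the read
        raise RuntimeError("all members of the mirror set have failed")

mirror = MirrorSet()
mirror.write(0, b"critical data")
mirror.fail(0)
print(mirror.read(0))   # still readable from the surviving drive
```

    The duplicated write is where the extra cost and slower writes come from; the choice of member on reads is where a real implementation can gain read speed.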

  3. RAID-2

    RAID-2 is seldom used today since ECC is embedded in almost all modern disk drives. It uses Hamming error correction codes and is intended for use with drives which do not have built-in error detection. All SCSI drives support built-in error detection, so this level is of little use when using SCSI drives.
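
    To show what a Hamming error correction code does, here is a sketch of the textbook Hamming(7,4) code, which protects four data bits with three check bits and can correct any single flipped bit. The code width is an assumption chosen for illustration; the text above does not say which Hamming code a given RAID-2 array uses. In a RAID-2 style layout, each bit of the codeword would sit on a separate drive.

```python
def hamming74_encode(data_bits):
    """Encode four data bits as a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4    # check bit for positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4    # check bit for positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4    # check bit for positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(codeword):
    """Fix at most one flipped bit and return the four data bits."""
    word = list(codeword)
    syndrome = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= position      # XOR of the positions that hold a 1
    if syndrome:
        word[syndrome - 1] ^= 1       # a nonzero syndrome names the bad position
    return [word[2], word[4], word[5], word[6]]

codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                       # simulate one unreliable bit
print(hamming74_correct(damaged))     # [1, 0, 1, 1]
```

    Drives with built-in ECC already perform this kind of correction internally, which is why the level adds little on such drives.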

  4. RAID-3

    RAID-3 is often used in data intensive or single-user environments which access long sequential records to speed up data transfer. The architecture does not allow multiple I/O operations to be overlapped and requires synchronized-spindle drives in order to avoid performance degradation with short records. In addition, it stripes data at a byte level across several drives, with parity stored on one drive. (Byte-level striping requires hardware support for efficient use.)
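
    Byte-level striping with a dedicated parity drive can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and padding the data with zero bytes to fill the last row is an assumption of the sketch.

```python
def raid3_split(data, num_data_drives):
    """Deal bytes round-robin onto the data drives and build the parity drive."""
    # Pad with zero bytes so every drive receives the same number of bytes.
    padded = data + bytes(-len(data) % num_data_drives)
    drives = [padded[i::num_data_drives] for i in range(num_data_drives)]
    parity = bytearray(len(drives[0]))
    for drive in drives:
        for i, byte in enumerate(drive):
            parity[i] ^= byte            # parity byte = XOR of the bytes in that row
    return drives, bytes(parity)

def raid3_join(drives, length):
    """Reassemble the original byte sequence from the data drives."""
    out = bytearray()
    for row in zip(*drives):             # one byte from each drive, in order
        out.extend(row)
    return bytes(out[:length])

record = b"ABCDEFGH"
drives, parity = raid3_split(record, num_data_drives=3)
print(drives)                            # [b'ADG', b'BEH', b'CF\x00']
print(raid3_join(drives, len(record)))   # b'ABCDEFGH'
```

    Because even a short record is spread over every drive, each request occupies the whole array, which is why separate I/O operations cannot be overlapped.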

  5. RAID-4

    RAID-4 offers no advantages over RAID-5 and does not support multiple simultaneous write operations. The architecture stripes data at a block level across several drives, with parity stored on one drive. The parity information allows recovery from the failure of any single drive.

    Performance for RAID-4 is very good for reads (the same as level 0). Writes, however, especially small random writes, require that parity data be updated each time, so the single parity drive limits write throughput. Because only one drive in the array stores redundant data, the cost per megabyte of a level 4 array can be fairly low.
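
    The cost of a small write can be made concrete with XOR parity. The sketch below assumes one stripe on a hypothetical array of three data drives plus one parity drive; the helper xor_blocks and the sample byte values are illustrative only.

```python
def xor_blocks(*blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# One stripe: three data blocks and the parity block stored on the parity drive.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
parity = xor_blocks(*data)

# Small write to drive 1: read the old data and old parity, then write both back.
new_block = b"\xaa\xbb"
new_parity = xor_blocks(parity, data[1], new_block)
data[1] = new_block

# The shortcut gives the same result as recomputing parity over the whole stripe.
assert new_parity == xor_blocks(*data)
```

    Every such update reads and writes the parity drive, no matter which data drive changed, so concurrent small writes queue up behind that one drive.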

  6. RAID-5

    RAID-5 is the best choice in multi-user environments which are not write performance sensitive. However, at least three, and more typically five, drives are required for RAID-5 arrays. The architecture is similar to level 4, but distributes parity among the drives. This can speed small writes in multiprocessing systems, since no single parity disk becomes a bottleneck. RAID-5 combines striping with parity checking, which provides redundancy without the overhead of doubling disk capacity. Simply put, the parity value for a stripe is computed by combining the corresponding bits of each data block with exclusive OR (XOR), so each parity bit records whether the number of 1 bits at that position is odd or even. With this parity value, the contents of a failed disk can be determined from the surviving disks and rebuilt on a spare drive.

    Read performance tends to be considerably lower than that of a level 4 array. The cost per megabyte is the same as for level 4.
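
    Distributed parity and the rebuild of a failed disk can be sketched with bitwise XOR, the usual way parity is computed. The rotation order in parity_drive is an assumption of this sketch, since implementations differ in how they rotate parity, and all function names are hypothetical.

```python
from functools import reduce

def xor(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity_drive(stripe, num_drives):
    """Drive that holds parity for a stripe (rotation order is an assumption)."""
    return (num_drives - 1 - stripe) % num_drives

def build_stripe(stripe, data_blocks, num_drives):
    """Place num_drives - 1 data blocks and their parity block for one stripe."""
    blocks = list(data_blocks)
    blocks.insert(parity_drive(stripe, num_drives), reduce(xor, data_blocks))
    return blocks

def rebuild(blocks, failed):
    """Recover the block of the failed drive from the surviving drives."""
    survivors = [block for i, block in enumerate(blocks) if i != failed]
    return reduce(xor, survivors)

stripe0 = build_stripe(0, [b"AB", b"CD"], num_drives=3)   # parity on drive 2
stripe1 = build_stripe(1, [b"EF", b"GH"], num_drives=3)   # parity on drive 1
print(rebuild(stripe0, failed=0))                         # b'AB'
```

    Because the parity block moves from drive to drive, parity updates for different stripes go to different drives instead of queuing on one.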

  7. Mirrored-Striped (RAID 10)

    Combines the high performance of RAID 0 with the fault tolerance of RAID 1. The capacity is that of the smallest drive multiplied by two.
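
    The capacity rule and the combination of striping and mirroring can be sketched as below. The sketch assumes four drives arranged as two mirrored pairs with data striped across the pairs; the drive numbering and function names are hypothetical.

```python
def raid10_capacity(drive_sizes):
    """Usable capacity per the rule above: twice the smallest drive."""
    return 2 * min(drive_sizes)

def raid10_locate(logical_block, num_pairs=2):
    """Return the drives holding a block and its offset on those drives."""
    pair = logical_block % num_pairs           # RAID 0 step: rotate across the pairs
    offset = logical_block // num_pairs
    return (2 * pair, 2 * pair + 1), offset    # RAID 1 step: both drives of the pair

print(raid10_capacity([500, 500, 400, 500]))   # 800
print(raid10_locate(3))                        # ((2, 3), 1)
```

    Each block lives on both drives of one pair, so the array survives the loss of one drive per pair while still spreading transfers across pairs.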

  8. BIG (Concatenation)

    Concatenation combines the capacity of drives for increased storage in a single volume. Unlike RAID 0, the drives are joined "back to back", so the full capacity of each drive is used rather than only the capacity of the smallest drive. The failure of a drive in a concatenation usually results in the loss of only the data on the failed drive, but special tools would be required to recover data from the other drives. Concatenation allows a large logical drive capacity, but offers no fault tolerance and no increase in data transfer speed.
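
    Joining drives "back to back" amounts to a lookup over cumulative drive sizes. This is a minimal sketch with sizes counted in blocks; the function name big_locate and the sample sizes are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

def big_locate(logical_block, drive_sizes):
    """Map a logical block to (drive index, block offset) in a concatenation."""
    ends = list(accumulate(drive_sizes))         # cumulative end of each drive
    if not 0 <= logical_block < ends[-1]:
        raise IndexError("block is outside the volume")
    drive = bisect_right(ends, logical_block)    # first drive whose end exceeds the block
    start = ends[drive] - drive_sizes[drive]     # first logical block on that drive
    return drive, logical_block - start

sizes = [100, 250, 50]             # three drives of different capacity
print(sum(sizes))                  # 400 blocks usable in total
print(big_locate(99, sizes))       # (0, 99)
print(big_locate(100, sizes))      # (1, 0)
```

    A run of consecutive blocks usually sits on a single drive, which is why concatenation adds capacity without adding transfer speed.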

  9. SAFE50

    Creates two volumes across two drives, with half of each physical drive being allocated to each volume. One volume is in SAFE mode (RAID 1) and the other is in BIG (concatenated) mode.

  10. SAFE33

    Creates two volumes, with 1/3 of physical space being dedicated to SAFE mode and 2/3 to BIG mode.
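
    The usable size of the two volumes in the SAFE50 and SAFE33 modes can be estimated with a little arithmetic. The sketch assumes two drives of equal size and reads "1/3 of physical space" as one third of each drive; the function name and the 300 GB drive size are hypothetical.

```python
from fractions import Fraction

def safe_big_split(drive_size, safe_share):
    """Usable (SAFE, BIG) capacity for two equal drives of drive_size each."""
    safe_share = Fraction(safe_share)
    safe = drive_size * safe_share              # mirrored: only one copy is usable
    big = 2 * drive_size * (1 - safe_share)     # concatenated: both remainders add up
    return safe, big

safe50 = safe_big_split(300, Fraction(1, 2))    # 150 GB SAFE, 300 GB BIG
safe33 = safe_big_split(300, Fraction(1, 3))    # 100 GB SAFE, 400 GB BIG
print(safe50, safe33)
```

    Under these assumptions, moving from SAFE50 to SAFE33 trades protected space for a larger unprotected volume.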

  11. GUI

    The GUI setting allows RAID settings to be configured using the included software rather than the rotary switch. To use this option, the 2x1 eSATA / USB 2.0 Hardware Port Multiplier must be connected to a PM-aware host controller (such as a controller using the SiI3132 or SiI3124 chip) or through USB 2.0. The GUI allows more flexibility than the rotary switch, such as creating RAID volumes with different drive capacities on different RAID sets. When the rotary switch is set to GUI, a new option, "Configure Box", appears in the SteelVine Manager for managing RAID through the software.