RAID Improves Speed and Reliability of HDDs (Hard Disk Drives)

Introduction to RAID

As application software and data grow, HDD capacities grow with them. Our increasing dependence on these applications, and on the data we generate, also adds to the HDDs' workload, which raises the risk of HDD failure. An HDD failure not only disrupts work but can also destroy valuable data and applications. RAID technology is one way to protect and manage these digital assets. In addition, although the speeds of buses, CPUs, and video cards have increased drastically, the performance of the whole system still depends largely on its HDDs. RAID is also one of the leading technologies for faster access to HDDs.

David Patterson, a professor at the University of California, Berkeley, and his colleagues defined RAID (Redundant Arrays of Inexpensive Disks). The aim was to dramatically improve speed, security, and reliability by allowing the original data to be restored in the event of an HDD failure.

Back then, one of the main concerns of engineers was that I/O speed was not improving at the same pace as CPUs and memory. Techniques for improving HDD speed had generally come to a standstill.

Seek time (moving the head to a given position) and latency (waiting for the disk to rotate to the target sector) could not be easily improved. The fastest, most expensive HDDs were no more than twice as fast as low-end HDDs. High-end HDDs (referred to as SLEDs: Single Large Expensive Disks) rapidly increased in capacity, but their speed did not improve correspondingly.

Amid these circumstances, the disk array was an innovative idea: arranging low-end HDDs in an array enabled a drastic performance improvement. However, the arrangement introduced a problem of its own: reliability. Because an array has more parts, a failure somewhere in it is more likely. If, for example, 100 HDDs, each with an MTBF (Mean Time Between Failures) of over 30 years, were used in one array, the MTBF of the whole array could be as low as 18 weeks.
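The arithmetic behind that example can be sketched as follows. This is a rough estimate, not a reliability model: it assumes that drives fail independently at a constant rate and that the array is lost as soon as any one drive fails, so the array MTBF is simply the single-drive MTBF divided by the number of drives. The 300,000-hour figure is an assumed value chosen to match "over 30 years"; the function name is illustrative only.

```python
HOURS_PER_YEAR = 24 * 365
HOURS_PER_WEEK = 24 * 7


def array_mtbf_hours(disk_mtbf_hours: float, num_disks: int) -> float:
    """Estimate the MTBF of an array with no redundancy.

    Assumes independent drives with a constant failure rate, and that
    a single drive failure brings down the whole array.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mtbf_hours / num_disks


disk_mtbf = 300_000  # hours per drive (assumed value, a little over 34 years)
array_mtbf = array_mtbf_hours(disk_mtbf, 100)

print(round(disk_mtbf / HOURS_PER_YEAR, 1))   # 34.2 (years, single drive)
print(round(array_mtbf / HOURS_PER_WEEK, 1))  # 17.9 (weeks, 100-drive array)
```

Under these assumptions, a hundred drives that would each run for decades yield an array that can be expected to fail in about 18 weeks.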

As a countermeasure, and accepting as a fact that an HDD in an array will eventually fail, those engineers developed a method of including redundant information in the array so that the original data can be restored. Thus RAID was born.

There are five basic RAID approaches. Please note that RAID 0 or Striping was not included in the original work, but is now considered a useful approach. Therefore, there are six common RAID methods or architectures (0 through 5).

Below are some brief descriptions of these six RAID levels.

RAID 0 or "Striping"

For this level, data is striped (dispersed) and stored across multiple HDDs. Performance is improved by reading from and writing to the HDDs in the array in parallel.

With a single HDD, which does not form an array, the data blocks are stored as Block 0 (D0) through Block n (Dn) on that one HDD. When the data is retrieved, the blocks are read out one after another. (Fig. 1)

In a RAID 0 array, data is read out simultaneously from blocks on each HDD. (Fig. 2) RAID 0 can improve performance for applications that read or write relatively large amounts of data. Because there is no redundant information overhead, the full capacity of the installed HDDs is available for data. Note, however, that data cannot be restored if an HDD fails, so RAID 0 offers no improvement in data reliability or protection.
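As an illustration of how striping places blocks, here is a minimal sketch that assumes simple round-robin placement, one block per drive per stripe. Real controllers let the stripe size be configured; the function names are hypothetical.

```python
from typing import List, Tuple


def locate_block(block_index: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block number to (disk number, block offset on that disk)."""
    if block_index < 0 or num_disks < 1:
        raise ValueError("block_index must be >= 0 and num_disks >= 1")
    return block_index % num_disks, block_index // num_disks


def stripe(blocks: List[str], num_disks: int) -> List[List[str]]:
    """Distribute blocks across the disks in round-robin order."""
    disks: List[List[str]] = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disk, _offset = locate_block(index, num_disks)
        disks[disk].append(block)
    return disks


layout = stripe(["D0", "D1", "D2", "D3", "D4", "D5"], num_disks=3)
print(layout)  # [['D0', 'D3'], ['D1', 'D4'], ['D2', 'D5']]
```

Blocks D0, D1, and D2 sit on three different drives and can be transferred at the same time, which is where the speedup for large reads and writes comes from. Nothing in the layout is redundant, so losing any one drive loses part of every large file.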

RAID 1 or "Mirroring"

RAID 1 is generally known as mirroring. The RAID 1 controller divides the HDDs in the array into two groups and writes identical data to the HDDs in each group. Data is read out from either group. (Fig. 3) If an HDD fails, the data can be recovered from the corresponding functioning HDD in the other group. RAID 1 is frequently used because of its simplicity. Its shortcomings are that it halves the usable capacity of the HDD array and does not speed up writes, since the same data must be written to both groups.
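The behavior described above can be modeled in a few lines. This is a toy in-memory sketch, not a controller implementation: each "group" is a Python list, and the class and method names are invented for illustration.

```python
class MirroredArray:
    """Toy model of RAID 1: every write goes to both groups."""

    def __init__(self, num_blocks: int) -> None:
        self.groups = [[None] * num_blocks, [None] * num_blocks]
        self.failed = [False, False]

    def write(self, index: int, data: bytes) -> None:
        # The same data is written to every group that is still working.
        for g, group in enumerate(self.groups):
            if not self.failed[g]:
                group[index] = data

    def read(self, index: int) -> bytes:
        # Data can be read from either group; use the first healthy one.
        for g, group in enumerate(self.groups):
            if not self.failed[g]:
                return group[index]
        raise IOError("both mirror groups have failed")

    def fail(self, g: int) -> None:
        # Simulate losing a group: its contents are gone.
        self.failed[g] = True
        self.groups[g] = [None] * len(self.groups[g])


array = MirroredArray(num_blocks=4)
array.write(0, b"payroll")
array.fail(0)
print(array.read(0))  # b'payroll', served by the surviving group
```

The model stores every block twice, which is why only half of the raw capacity is usable.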

RAID 2 - ECC by Striping (bit) and Hamming Code

RAID 2 uses an error checking and correction (ECC) technique widely employed with random access memory (RAM). The controller generates Hamming codes and disperses the data at the bit level across multiple HDDs. Fig. 4 illustrates a RAID level 2 scheme. In this case, 8 bits of data are stored on 8 HDDs while, at the same time, 4 bits of ECC are generated and stored on 4 more HDDs. All 12 HDDs can be read from or written to in parallel. If any one of the HDDs fails, the redundancy scheme can restore the original data in real time, so that the system's operation is not interrupted. However, today's HDDs integrate strong error checking and correction functions of their own, and the ECC overhead of this method is large compared with RAID 3 through 5, described below. Consequently, commercial products rarely employ RAID 2.
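To make the 8 + 4 arrangement concrete, the sketch below implements a textbook single-error-correcting Hamming(12,8) code, with each of the 12 bits imagined as living on its own drive. It is an assumption that this matches the exact bit layout in Fig. 4; the positions and function names here follow the usual textbook convention (check bits at positions 1, 2, 4, and 8) and are for illustration only.

```python
from typing import List

PARITY_POSITIONS = (1, 2, 4, 8)                  # check bits (4 "ECC drives")
DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11, 12)     # data bits (8 "data drives")


def encode(data_bits: List[int]) -> List[int]:
    """Return the 12-bit codeword for 8 data bits."""
    if len(data_bits) != 8:
        raise ValueError("expected exactly 8 data bits")
    word = [0] * 13  # index 0 is unused so indices match bit positions
    for pos, bit in zip(DATA_POSITIONS, data_bits):
        word[pos] = bit
    for p in PARITY_POSITIONS:
        parity = 0
        # Check bit p covers every position whose binary index has bit p set.
        for pos in range(1, 13):
            if pos != p and pos & p:
                parity ^= word[pos]
        word[p] = parity
    return word[1:]


def correct(codeword: List[int]) -> List[int]:
    """Fix at most one flipped bit and return the 8 data bits."""
    word = [0] + list(codeword)
    syndrome = 0
    for pos in range(1, 13):
        if word[pos]:
            syndrome ^= pos  # for a valid codeword this XOR is zero
    if syndrome > 12:
        raise ValueError("more than one bit is in error")
    if syndrome:
        word[syndrome] ^= 1  # the syndrome is the position of the bad bit
    return [word[pos] for pos in DATA_POSITIONS]


data = [1, 0, 1, 1, 0, 0, 1, 0]
stored = encode(data)
stored[5] ^= 1                      # one "drive" returns a wrong bit
print(correct(stored) == data)      # True
```

Four check bits for every eight data bits is a 50% overhead, which illustrates why the single parity drive of RAID 3 through 5 is more economical.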

RAID 3 - Striping (byte) and Parity Drive (fixed)

RAID 3 is a simplified version of RAID 2. Instead of the multiple ECC bits found in RAID 2, a single parity bit is used. RAID 3 disperses data at the bit level, as in RAID 2. The scheme consists of an array of HDDs for data and one HDD for parity. Fig. 5 shows an example of RAID 3. The scheme generates XOR (exclusive-or) parity from bit 0 through bit 7. If any one of the HDDs fails, the original data is restored by XORing the corresponding bits on the surviving HDDs with the parity bit. With RAID 3, all HDDs operate constantly. It is not a very effective method for accessing small amounts of data, but it is suitable for specialized uses where large blocks of data need to be processed at high speed, as in supercomputers.
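The XOR parity and recovery steps can be sketched as below. The eight bits stand for one bit position on each of eight data drives; the function names are illustrative, not taken from any product.

```python
from functools import reduce
from operator import xor
from typing import List


def parity_bit(data_bits: List[int]) -> int:
    """XOR of all data bits; this is what the parity drive stores."""
    return reduce(xor, data_bits, 0)


def rebuild_bit(surviving_bits: List[int], parity: int) -> int:
    """Recover the bit of the failed drive from the survivors and the parity."""
    return reduce(xor, surviving_bits, parity)


bits = [1, 0, 1, 1, 0, 0, 1, 0]   # bit 0 through bit 7, one per data drive
p = parity_bit(bits)

failed = 2                         # pretend the drive holding bit 2 has failed
survivors = bits[:failed] + bits[failed + 1:]
print(rebuild_bit(survivors, p) == bits[failed])  # True
```

Recovery works because XORing a value with itself cancels it out: XORing the parity with every surviving bit leaves only the missing bit.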

RAID 4 - Striping (block) and Parity Drive (fixed)

In RAID 4, data is dispersed not by bits but by blocks (e.g., "n sectors"). (Fig. 6) The HDDs in the array can operate independently. Where RAID 3 constantly accesses all of the HDDs in an array, RAID 4 accesses only the HDDs it needs. Higher speed can be expected when reading data of any size. When writing, however, the old data and the old parity must first be read from the data HDD and the parity HDD and XORed with the new data to produce the new parity; the new data and the new parity are then written. The processing time is therefore longer than for RAID 3. (Fig. 7)
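The write sequence described above (read old data and old parity, XOR with the new data, write both back) can be sketched as follows. Blocks are modeled as short byte strings and the helper names are hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two blocks of equal length."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data."""
    return xor_blocks(xor_blocks(old_data, new_data), old_parity)


# Three data drives, one block each, plus a dedicated parity drive.
drives = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(xor_blocks(drives[0], drives[1]), drives[2])

# Overwrite the block on drive 1 without touching drives 0 and 2.
new_block = b"\xff\x00"
parity = updated_parity(drives[1], new_block, parity)
drives[1] = new_block

# The shortcut gives the same parity as recomputing it from every drive.
full = xor_blocks(xor_blocks(drives[0], drives[1]), drives[2])
print(parity == full)  # True
```

Only the target data drive and the parity drive are involved, but each small write costs two reads and two writes, and every write in the array goes through the same parity drive.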

RAID 5 or "Striping (block) and Parity Drive (dispersed)"

In RAID 5, the parity data is dispersed and stored on all of the HDDs in the array. This structure was devised in order to resolve the fact that a single parity HDD becomes a bottleneck for performance in RAID 4. RAID 5 is commonly used in products currently on the market.

For RAID 5, storage capacity equal to one disk is used to record parity data. This parity data is not stored on a single disk, though. Instead, it is distributed equally among all of the HDDs in the array. Even if one of the drives fails, the lost data can be reconstructed (regenerated) from the data recorded on the other drives and the distributed parity data. Parity data is scattered across all of the drives in order to avoid the performance degradation caused by intensive access to one specific parity drive (as with the dedicated parity drive in RAID 4). The effective storage capacity of this RAID level is as follows:

(Effective storage capacity) = (Capacity of one drive) x (Number of drives - 1)
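The formula and the idea of dispersed parity can be sketched together. The rotation rule used here, where the parity block moves one drive to the left on each successive stripe, is only one possible layout and is an assumption; actual products use several different rotation schemes. Function names are illustrative.

```python
from typing import List


def effective_capacity(drive_capacity: int, num_drives: int) -> int:
    """(Capacity of one drive) x (Number of drives - 1)."""
    if num_drives < 3:
        raise ValueError("RAID 5 is normally built from at least three drives")
    return drive_capacity * (num_drives - 1)


def parity_drive(stripe_index: int, num_drives: int) -> int:
    """Drive holding the parity block of a stripe (assumed rotation rule)."""
    return (num_drives - 1 - stripe_index) % num_drives


def layout(num_stripes: int, num_drives: int) -> List[List[str]]:
    """Return one row per stripe showing data (D) and parity (P) blocks."""
    rows = []
    next_block = 0
    for s in range(num_stripes):
        p = parity_drive(s, num_drives)
        row = []
        for d in range(num_drives):
            if d == p:
                row.append("P%d" % s)
            else:
                row.append("D%d" % next_block)
                next_block += 1
        rows.append(row)
    return rows


print(effective_capacity(500, 4))  # 1500 (e.g. four 500 GB drives)
for row in layout(num_stripes=4, num_drives=4):
    print(row)
# ['D0', 'D1', 'D2', 'P0']
# ['D3', 'D4', 'P1', 'D5']
# ['D6', 'P2', 'D7', 'D8']
# ['P3', 'D9', 'D10', 'D11']
```

Each drive ends up holding one parity block in every four stripes, so parity updates are spread evenly instead of all landing on one drive.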