Krishna Junnikar: RAID

Q. What is Basic Disk?

A. A standard disk with standard partitions (primary and extended).

Q. What is Dynamic Disk?

A. Disks that have dynamic mounting capability to add additional local or remote partitions or directories to a disk drive. These are called dynamic volumes. This is new with the Windows 2000 operating system and is not supported by any other operating systems. Any volume that is on more than one hard drive must be created with dynamic disks. A disk can only be converted from dynamic to basic by first deleting all the volumes in the dynamic disk.

Q. What is RAID?

A. RAID (Redundant Array of Independent Disks). A collection of disk drives that offers increased performance and fault tolerance. There are a number of different RAID levels. The three most commonly used are 0, 1, and 5: Level 0: striping without parity (spreading out blocks of each file across multiple disks). Level 1: disk mirroring or duplexing. Level 2: bit-level striping with parity Level 3: byte-level striping with dedicated parity.

Q. What is Simple Volume?

A. Simple volumes are the most common volumes and the type of volume that you will create most often. If you are using a single disk configuration, a simple volume is the only volume type that you can create.

Q. What is Spanned Volume?

A. Spanned volumes are created by combining disk space from two or more hard disks. Spanned volumes can be created by using different amounts of space from different hard disks. For example, a 10GB spanned volume can be created from 6GB of unallocated space on hard drive 0, 3GB of unallocated space on hard drive 1, and 1GB of space on hard drive 2. A spanned volume cannot be extended, and there is no fault tolerance in using a spanned volume. If any of the drives fail, the data on the volume is lost and must be restored from backup (tape). Spanned volumes can be created from two physical disks and can contain up to 32 physical disks.

Q. What is Mirrored Volume?

A. Mirrored volumes are created using two physical disks. A mirrored volume requires same amount of unallocated space on each of the physical disk used. When data is written to a mirrored volume, the data is written to disk and then synchronized on the second disk. An exact copy of the data is available on both physical disks.

Q. What is Stripped Volume?

A. A striped volume is created using a minimum of two and a maximum of 32 physical drives to create a single volume. A striped volume is created by using an equal amount of unallocated space on all the physical disks.

The data is written across all physical disks in the volume in equal parts, thereby creating a stripe pattern. When data is written to the volume, it is divided into 64KB parts and each part is written to a separate disk. Chopping the data into pieces allows each physical disk to be performing a write operation at almost exactly the same time, thereby increasing speed dramatically. When data is read, it is read in the same way, in 64KB blocks at a time. Striped volumes provide the best read and write performance of all the different types of volumes. A striped volume gets its name from how the data is read and accessed on the drive.

Q. What is Raid-0?

A. RAID Level 0 is not redundant, hence does not truly fit the "RAID" acronym. In level 0, data is split across drives, resulting in higher data throughput. Since no redundant information is stored, performance is very good, but the failure of any disk in the array results in data loss. This level is commonly referred to as striping.

Q. What is RAID-1?

A. RAID Level 1 provides redundancy by writing all data to two or more drives. The performance of a level 1 array tends to be faster on reads and slower on writes compared to a single drive, but if either drive fails, no data is lost. This is a good entry-level redundant system, since only two drives are required; however, since one drive is used to store a duplicate of the data, the cost per megabyte is high. This level is commonly referred to as mirroring.

Q. What is RAID-5?

A. RAID Level 5 is similar to level 4, but distributes parity among the drives. This can speed small writes in multiprocessing systems, since the parity disk does not become a bottleneck. Because parity data must be skipped on each drive during reads, however, the performance for reads tends to be considerably lower than a level 4 array. The cost per megabyte is the same as for level 4.

RAID stands for Redundant Array of Inexpensive Disks.

RAID is the organization of multiple disks into a large, high performance logical disk.

Disk arrays stripe data across multiple disks and access them in parallel to achieve:

* Higher data transfer rates on large data accesses and

* Higher I/O rates on small data accesses.

Data striping also results in uniform load balancing across all of the disks, eliminating hot spots that otherwise saturate a small number of disks, while the majority of disks sit idle.

But....

Large disk arrays, however are highly vulnerable to disk failures. A disk array with a hundred disks is a hundred times more likely to fail than a single disk. An MTTF (mean-time-to-failure) 500,000 hours for a single disk implies an MTTF of 500,000/100 i.e. 5000 hours for a disk array with a hundred disks.

So....

The solution to the problem of lower reliability in disk arrays is to improve the availability of the system. This can be achieved by employing redundancy in the form of error-correcting codes to tolerate disk failures. A redundant disk array can now retain data for much longer time than an unprotected single disk.

Do not confuse between reliability and availability.

Reliability is how well a system can work without any failures in its components. If there is a failure, the system was not reliable.

Availability is how well a system can work in times of a failure. If a system is able to work even in the presence of a failure of one or more system components, the system is said to be available.

Redundancy improves the availability of a system, but cannot improve the reliability. Reliability can only be increased by improving manufacturing technologies or using lesser individual components in a system.

Disadvantages due to Redundancy

Every time there is a write operation, there is a change of data. This change also, has to be reflected in the disks storing redundant information. This worsens the performance of writes in redundant disk arrays significantly compared to the performance of writes in non redundant disk arrays.

Also, keeping the redundant information consistent in the presence of concurrent I/O operation and the possibility of system crashes can be difficult.

The need for RAID can be summarized in two points given below. The two keywords are Redundant and Array.

* An array of multiple disks accessed in parallel will give greater throughput than a single disk.

* Redundant data on multiple disks provides fault tolerance.

Provided that the RAID hardware and software perform true parallel accesses on multiple drives, there will be a performance improvement over a single disk.

With a single hard disk, you cannot protect yourself against the costs of a disk failure, the time required to obtain and install a replacement disk, reinstall the operating system, restore files from backup tapes, and repeat all the data entry performed since the last backup was made.

With multiple disks and a suitable redundancy scheme, your system can stay up and running when a disk fails, and even while the replacement disk is being installed and its data restored.

To create an optimal cost-effective RAID configuration, we need to simultaneously achieve the following goals:

* Maximize the number of disks being accessed in parallel.

* Minimize the amount of disk space being used for redundant data.

* Minimize the overhead required to achieve the above goals.

There are 2 important concepts to be understood in the design and implementation of disk arrays:

1. Data striping, for improved performance.

2. Redundancy for improved reliability.

Data Striping

Data striping transparently distributes data over multiple disks to make them appear as a single fast, large disk. Striping improves aggregate I/O performance by allowing multiple I/Os to be serviced in parallel. There are 2 aspects to this parallelism.

* Multiple, independent requests can be serviced in parallel by separate disks. This decreases the queuing time seen by I/O requests.

* Single, multiple block requests can be serviced by multiple disks acting in co-ordination. This increases the effective transfer rate seen by a single request. The performance benefits increase with the number of disks in the array. Unfortunately, a large number of disks lowers the overall reliability of the disk array.

Most of the redundant disk array organizations can be distinguished based on 2 features:

1. the granularity of data interleaving and

2. the way in which the redundant data is computed and stored across the disk array.

Data interleaving can be either fine grained or coarse grained.

Fine grained disk arrays conceptually interleave data in relatively small units so that all I/O requests, regardless of their size, access all of the disks in the disk array. This results in very high data transfer rate for all I/O requests but has the disadvantages that only one logical I/O request can be in service at any given time and all disks must waste time positioning for every request.

Coarse grained disk arrays interleave data in relatively large units so that small I/O requests need access only a small number of disks while large requests can access all the disks in the disk array. This allows multiple small requests to be serviced simultaneously while still allowing large requests to see the higher transfer rates afforded by using multiple disks.

Redundancy

Since larger number of disks lower the overall reliability of the array of disks, it is important to incorporate redundancy in the array of disks to tolerate disk failures and allow for the continuous operation of the system without any loss of data.

The incorporation of redundancy in disk arrays brings up two problems:

1. Selecting the method for computing the redundant information. Most redundant disks arrays today use parity, though some use Hamming or Reed-Solomon codes.

2. Selecting a method for distribution of the redundant information across the disk array. The distribution method can be classified into 2 different schemes:

* Schemes that concentrate redundant information on a small number of disks.

* Schemes that distribute redundant information uniformly across all of the disks.

Such schemes are generally more desirable because they avoid hot spots and other load balancing problems suffered by schemes that do not uniformly distribute redundant information.

Finally, it is important to mention that selecting between the many possible data striping and redundancy schemes involves complex trade offs between reliability, performance and cost, which have been discussed in the next few sections.

There are many types of RAID and some of the important ones are introduced below:

• Level 0 -- Striped Disk Array without Fault Tolerance: Provides data striping (spreading out blocks of each file across multiple disk drives) but no redundancy. This improves performance but does not deliver fault tolerance. If one drive fails then all data in the array is lost.

• Level 1 -- Mirroring and Duplexing: Provides disk mirroring. Level 1 provides twice the read transaction rate of single disks and the same write transaction rate as single disks.

• Level 2 -- Error-Correcting Coding: Not a typical implementation and rarely used, Level 2 stripes data at the bit level rather than the block level.

• Level 3 -- Bit-Interleaved Parity: Provides byte-level striping with a dedicated parity disk. Level 3, which cannot service simultaneous multiple requests, also is rarely used.

• Level 4 -- Dedicated Parity Drive: A commonly used implementation of RAID, Level 4 provides block-level striping (like Level 0) with a parity disk. If a data disk fails, the parity data is used to create a replacement disk. A disadvantage to Level 4 is that the parity disk can create write bottlenecks.

• Level 5 -- Block Interleaved Distributed Parity: Provides data striping at the byte level and also stripe error correction information. This results in excellent performance and good fault tolerance. Level 5 is one of the most popular implementations of RAID.

• Level 6 -- Independent Data Disks with Double Parity: Provides block-level striping with parity data distributed across all disks.

• Level 0+1 – A Mirror of Stripes: Not one of the original RAID levels, two RAID 0 stripes are created, and a RAID 1 mirror is created over them. Used for both replicating and sharing data among disks.

• Level 10 – A Stripe of Mirrors: Not one of the original RAID levels, multiple RAID 1 mirrors are created, and a RAID 0 stripe is created over these.

• Level 7: A trademark of Storage Computer Corporation that adds caching to Levels 3 or 4.

• RAID S: EMC Corporation's proprietary striped parity RAID system used in its Symmetrix storage systems.

The distribution of data across multiple drives can be managed either by dedicated hardware or by software. Additionally, there are hybrid RAIDs that are partially software and hardware-based solutions.

Software-based

Software implementations are now provided by many operating systems. A software layer sits above the (generally block-based) disk device drivers and provides an abstraction layer between the logical drives (RAID arrays) and physical drives. Most common levels are RAID 0 (striping across multiple drives for increased space and performance) and RAID 1 (mirroring two drives), followed by RAID 1+0, RAID 0+1, and RAID 5 (data striping with parity).

Since the software must run on a host server attached to storage, the processor (as mentioned above) on that host must dedicate processing time to run the RAID software.

Hardware-based

A hardware implementation of RAID requires at a minimum a special-purpose RAID controller. On a desktop system, this may be a PCI expansion card, or might be a capability built into the motherboard. Any drives may be used - IDE/ATA, SATA, SCSI, SSA, Fibre Channel, sometimes even a combination thereof. In a large environment the controller and disks may be placed outside of a physical machine, in a stand alone disk enclosure. The using machine can be directly attached to the enclosure in a traditional way, or connected via SAN. The controller hardware handles the management of the drives, and performs any parity calculations required by the chosen RAID level.

Most hardware implementations provide a read/write cache which, depending on the I/O workload, will improve performance. In most systems write cache may be non-volatile (e.g. battery-protected), so pending writes are not lost on a power failure.

Hardware implementations provide guaranteed performance, add no overhead to the local CPU complex and can support many operating systems, as the controller simply presents a logical disk to the operating system.

Hardware implementations also typically support hot swapping, allowing failed drives to be replaced while the system is running.

Inexpensive RAID controllers have become popular that are simply a standard disk controller with a BIOS extension implementing RAID in software for the early part of the boot process. A special operating system driver then takes over the raid functionality when the system switches into protected mode.

Hot spares

Both hardware and software implementations may support the use of hot spare drives, a pre-installed drive which is used to immediately (and automatically) replace a drive that has failed, by rebuilding the array onto that empty drive. This reduces the mean time to repair period during which a second drive failure in the same RAID redundancy group can result in loss of data, though it doesn't eliminate it completely; array rebuilds still take time, especially on active systems. It also prevents data loss when multiple drives fail in a short period of time, as is common when all drives in an array have undergone similar use patterns, and experience wear-out failures. This can be especially troublesome when multiple drives in a RAID set are from the same manufacturer batch.

Krishna Junnikar

Monday, February 21, 2011

RAID

No comments: