The use of parity bits is a common method of detecting errors in data transmission and storage.
Before looking at the use of parity bits in RAID, let's look more generally at their use as a method of error detection.
In data transmission, if data is sent from one device to another without any form of error-detecting mechanism, the receiving device must assume the data is correct. Errors are far less frequent with digital transmission than they were when analogue transmission was prevalent. However, data transmission is rarely entirely error-free, so it is unwise to assume that the data received is exactly the same as what was transmitted. A parity bit is a very simple form of checksum that enables the target device to determine whether the data has been received correctly.
Using a parity bit is a simple way of checking for errors. A single bit is added to the end of a data block so that the total number of 1 bits in the message, including the parity bit, is always even (even parity) or always odd (odd parity). For example, if even parity is used, the receiving device knows that every correct message must contain an even number of 1 bits; otherwise, there has been an error and the source device must resend. A single parity bit can only reveal that an odd number of bits has changed; it cannot say which bit is wrong, and an even number of changed bits goes unnoticed.
In practice, RAID devices use enhanced forms of parity checking such as vertical and horizontal parity. Some RAID levels, such as RAID 4 and RAID 5, store parity information (on a dedicated parity drive in RAID 4, spread across all the drives in RAID 5) to enable them to rebuild data in case of a drive failure. When data is written to a RAID group, the matching parity is calculated and written with it, so the group always holds the correct parity. So, if a drive in the RAID group fails, the system can use the information held on the remaining disks plus the parity information to rebuild the data from the failed drive onto a spare.
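To make the single parity bit described above concrete, here is a minimal Python sketch of even parity on a short message. The function names (`even_parity_bit`, `add_parity`, `check_even_parity`) and the sample bits are made up for illustration; real links and controllers do this in hardware.

```python
def even_parity_bit(bits):
    """Return the bit that makes the total number of 1 bits even."""
    return sum(bits) % 2


def add_parity(bits):
    """Sender side: append the even-parity bit to the end of the data block."""
    return bits + [even_parity_bit(bits)]


def check_even_parity(message):
    """Receiver side: a valid message contains an even number of 1 bits."""
    return sum(message) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1]       # four 1 bits, so the parity bit is 0
message = add_parity(data)
print(check_even_parity(message))  # True: received intact

corrupted = message.copy()
corrupted[2] ^= 1                  # flip one bit in transit
print(check_even_parity(corrupted))  # False: error detected, ask for a resend

corrupted[5] ^= 1                  # flip a second bit
print(check_even_parity(corrupted))  # True: two errors cancel out and go unnoticed
```

The last case shows why a single parity bit is only a basic check: it catches an odd number of flipped bits but misses an even number.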
So, how does the RAID group rebuild the data? Let's assume it is using even parity. In that case, it can easily ascertain what was on the failed drive by XORing (adding modulo 2) the bits held in the same position on the remaining drives, including the parity information. If those bits add up to an odd number, then the bit on the failed drive must have been a 1 to maintain even parity. Alternatively, if they add up to an even number, then the bit on the failed drive must have been a 0.
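The rebuild described above can be sketched in a few lines of Python. This is an illustration only, assuming three data drives and one parity block of equal length; the helper `xor_blocks` and the sample byte values are hypothetical and do not represent any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives, each holding a two-byte block (sample values).
data_drives = [b"\x0f\xa5", b"\x33\x5a", b"\x55\xff"]

# Even parity: the parity block is the XOR of all the data blocks.
parity = xor_blocks(data_drives)

# Simulate losing drive 1, then rebuild it from the survivors plus parity.
failed = 1
survivors = [d for i, d in enumerate(data_drives) if i != failed] + [parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_drives[failed])  # True
```

XOR works here because it is addition modulo 2 on each bit position: XORing every surviving block, parity included, leaves exactly the bits that were on the missing drive.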
This was first published in January 2011