RAID 5 write hole

The RAID write hole is a known data corruption issue in older and low-end RAID arrays, caused by interrupted destaging of writes to disk.[1]

Issue

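Consider the following configuration with two data disks and a dedicated parity disk, as in RAID 3. The same reasoning applies to RAID 5, which differs in distributing the parity blocks across all disks. Each row is one stripe, and the parity bit is the XOR of the data bits in that stripe: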

Device 1   Device 2   Parity device
1          0*         1 (stripe sum is odd)
0*         0          0 (stripe sum is even)
1          1          0 (stripe sum is even)
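
As an illustration, the parity column can be reproduced by XOR-ing the data bits of each stripe. This is a minimal sketch, not the code of any particular RAID implementation; the function name `parity` and the list `stripes` are chosen here for the example only.

```python
from functools import reduce
from operator import xor


def parity(data_bits):
    """Return the XOR of all data bits in one stripe (hypothetical helper)."""
    return reduce(xor, data_bits, 0)


# Data bits (device 1, device 2) for the three stripes in the table.
stripes = [(1, 0), (0, 0), (1, 1)]

parities = [parity(s) for s in stripes]
print(parities)  # [1, 0, 0]
```

A parity of 1 corresponds to "stripe sum is odd" and a parity of 0 to "stripe sum is even".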

The array is being written to when an adverse event occurs, such as a power outage or a sudden disk failure. Suppose there are outstanding writes to the blocks marked with an asterisk (*) above. RAID requires that every data write be accompanied by a matching parity update, so the following changes would be made:

Device 1   Device 2   Parity device
1          0 -> 1     1 (stripe sum is odd) -> 0 (stripe sum is even)
0 -> 1     0          0 (stripe sum is even) -> 1 (stripe sum is odd)
1          1          0 (stripe sum is even)
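
The new parity values can be derived without reading the whole stripe, by XOR-ing the old parity with the old and new contents of the block being changed. The sketch below assumes this common read-modify-write approach; the article does not say which method a given controller uses, and `updated_parity` is a name invented for the example.

```python
def updated_parity(old_parity, old_data, new_data):
    """Read-modify-write parity update for a single changed block."""
    return old_parity ^ old_data ^ new_data


# Stripe 1: device 2 changes from 0 to 1, parity was 1.
stripe1 = updated_parity(old_parity=1, old_data=0, new_data=1)
# Stripe 2: device 1 changes from 0 to 1, parity was 0.
stripe2 = updated_parity(old_parity=0, old_data=0, new_data=1)

print(stripe1, stripe2)  # 0 1
```

In each stripe the data block and the parity block are two separate physical writes, which is why they can be separated by an interruption.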

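However, the writing process is interrupted partway through, so data and parity no longer agree. In the first stripe below the data was written but the parity was not; in the second the parity was written but the data was not. This may affect one stripe or several, depending on the implementation of the RAID driver and the underlying hardware (for example, because of out-of-order caching):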

Device 1   Device 2   Parity device
1          1          1 (stripe sum is odd - WRONG)
0          0          1 (stripe sum is odd - WRONG)
1          1          0 (stripe sum is even)
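
A consistency check that recomputes parity for every stripe would flag the two damaged stripes. The following is a simplified sketch of such a check, with a hypothetical function name and data layout; note that with a single parity bit it can only report that a stripe is inconsistent, not whether the data or the parity is the stale part.

```python
def inconsistent_stripes(array):
    """Return indices of stripes whose stored parity does not match the data.

    `array` is a list of (data_bits, stored_parity) pairs (illustrative layout).
    """
    bad = []
    for index, (data_bits, stored_parity) in enumerate(array):
        computed = 0
        for bit in data_bits:
            computed ^= bit
        if computed != stored_parity:
            bad.append(index)
    return bad


# State of the array after the interrupted writes.
array = [((1, 1), 1), ((0, 0), 1), ((1, 1), 0)]

print(inconsistent_stripes(array))  # [0, 1]
```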

The error may remain undetected indefinitely, because of application-layer redundancy or other practices. The main problem arises when a disk in the array fails. If a data disk fails, its contents in the affected stripes are reconstructed from the inconsistent parity, so the rebuilt array contains incorrect data. The flipped bits can be anywhere on the virtual RAID device: in a completely unrelated file, in the filesystem metadata, and so on. This is known as the "RAID 5 write hole".
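
The effect on a rebuild can be shown with the first stripe of the example, where device 1 holds 1, device 2 holds the newly written 1, and the parity is still the stale value 1. The sketch below assumes device 1 is the disk that fails; `rebuild_missing` is a name made up for this illustration.

```python
def rebuild_missing(surviving_bits, parity_bit):
    """Reconstruct the bit of a failed disk from the survivors and the parity."""
    value = parity_bit
    for bit in surviving_bits:
        value ^= bit
    return value


original_device1 = 1   # what device 1 held before it failed
surviving = [1]        # device 2, already updated
stale_parity = 1       # parity that was never updated

rebuilt = rebuild_missing(surviving, stale_parity)
print(rebuilt, rebuilt == original_device1)  # 0 False
```

The corrupted bit lands on device 1, which was not being written to in that stripe, which is why the damage can appear in unrelated data.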

Potential causes

  • Loss of power: the array loses power before the data write and the parity update have both completed, so the two are not updated atomically.
  • Disk failure: a disk fails partway through the operation, so the data write and the parity update do not both complete even though power is available.

Mitigation

  • A battery-backed cache on a dedicated hardware RAID controller, and battery-backed hard disks, can mitigate loss of power by allowing pending writes to be completed. This works because the parity inconsistency arises at the hardware RAID controller level rather than at the OS level.
  • A filesystem that is fully resistant to bit flips can prevent data corruption (if the write hole produces sufficiently random bit-flip errors).
  • Another obvious, related way to avoid failures caused by power outages, including the RAID 5 write hole, is to use an uninterruptible power supply (UPS).

Source: Wikipedia | The above article is available under the GNU FDL.