CprE 545: Fault-Tolerant Systems (G. Manimaran, ISU)
Basic Building Blocks
Stable Storage: RAID Architecture

Introduction
  Types of Disk Failures
    Transient failure: the disk behaves unpredictably for a short period of time.
    Bad sector: a page becomes corrupted.
    Controller failure: the disk controller fails.
    Disk failure: the entire disk becomes unreadable.

Introduction (contd.)
  Types of Disk Errors
    Read errors:
      Soft read error: page “a” is good, but a read returns bad for a short duration.
      Persistent read error: page “a” is good, but a read returns bad for a long duration.
      Undetected error: page “a” is bad, but a read returns good.
    Write errors:
      Null write: page “a” is unchanged.
      Bad write: page “a” becomes (bad, d).

Making a single disk system stable
  An ordinary single disk system can be made stable by introducing the following operations:
    Careful read: a read is repeated until it returns the status good, or until the page still cannot be read after a fixed number of attempts.
    Careful write: a write is followed by a read, repeated until the read returns the status good. This eliminates null write and bad write errors.
  These operations cannot handle decay events (bad sectors, etc.).

Making a single disk system stable (contd.)
  Decay events can be handled by taking a pair of pages and replicating the data in both pages. The pages are chosen to be decay-unrelated.
  Since the pages are not decay-related, at least one page of the pair should have the status good.
  Stable read: does a careful read from one of the paired pages and, if the result is bad, performs a careful read from the other page.
  Stable write: performs a careful write into each of the paired pages, one after the other.

Issues
  The single-disk mechanisms discussed above can handle soft read errors, null writes, bad writes, and persistent read errors.
  However, they cannot handle events such as a disk crash.
  Although the inconsistencies caused by disk crashes can be handled, the data is unavailable until the crashed node/disk is recovered.
  Therefore, multiple-disk systems are designed to provide availability and reliability.

RAID Architecture
  RAID: Redundant Array of Inexpensive Disks
    Combines multiple small, inexpensive disk drives into a group to yield performance exceeding that of one large, more expensive drive.
    Appears to the computer as a single virtual drive.
    Supports fault tolerance by storing information redundantly in various ways.
    Uses data striping to achieve better performance.

Basic Issues
  Two operations are performed on a disk:
    Read(): small or large.
    Write(): small or large.
  Access concurrency is the number of simultaneous requests that can be serviced by the disk system.
  Throughput is the number of bytes that can be read or written per unit time, as seen by one request.
  Data striping: spreading the blocks of each file across multiple disk drives.
    The stripe size is the same as the block size.

Basic Issues (contd.)
  The stripe size introduces a tradeoff between I/O throughput and access concurrency.
    A small stripe size gives high throughput but little or no access concurrency.
    A large stripe size gives better access concurrency but lower throughput for a single request.

RAID Levels: RAID-0
  No redundancy.
  No fault tolerance: if one drive fails, all data in the array is lost.
  High I/O performance.
  Parallel I/O.
  Best storage efficiency.

RAID-1
  Disk mirroring.
  Poor storage efficiency.
  Best read performance: possibly double that of a single disk.
  Poor write performance: two disks must be written.
  Good fault tolerance: as long as one disk of a pair is working, read and write operations can be performed.
  MTTF_RAID-1 = (MTTF / 2) * (MTTF / MTTR), where MTTF and MTTR are the mean time to failure and mean time to repair of a single disk.

RAID-2
  Bit-level striping.
  Uses Hamming codes, a form of error-correcting code (ECC).
  Can tolerate one disk failure.
  Number of redundant disks = O(log(total disks)).
  Better storage efficiency than mirroring.
  High throughput but no access concurrency.
  Expensive writes.
  Example: 4 data disks need 3 redundant disks to tolerate one disk failure.

RAID-3
  Byte-level striping with parity.
  No need for ECC, since the controller knows which disk is in error. Parity is therefore enough to tolerate one disk failure.
  Best throughput, but no concurrency.
  Only one redundant disk is needed.

RAID-4
  Block-level striping.
  Stripe size introduces the tradeoff between access concurrency and throughput.
  Block-interleaved parity.
  The parity disk becomes a bottleneck for small writes when multiple writes occur at the same time.
  No problems for small or large reads.

RAID-4 (contd.)
  In general, writes are very expensive.
  Read-modify-write: for a small write, the old data block and the old parity block must be read, the parity recomputed, and then the new parity written back along with the new data.
  The read-modify-write penalty is less pronounced for large writes.

RAID-5
  Block-level striping with distributed parity.
  Parity is uniformly distributed across the disks.
  Reduces the parity bottleneck.
  Best small and large reads (same as RAID-4).
  Best large writes.
  Still costly for small writes.
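
The parity mechanism behind RAID-3, RAID-4, and RAID-5 can be illustrated with a minimal sketch, assuming blocks are in-memory byte strings rather than real disk sectors. The names `xor_blocks`, `small_write`, and `parity_disk` are hypothetical, and the rotating parity placement is only one possible RAID-5 layout, not one the slides specify. The small write uses the usual read-modify-write identity: new parity = old parity XOR old data XOR new data.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus one parity block (contents are illustrative).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

# Reconstruction: the controller knows which disk failed, so the missing block
# is the XOR of the surviving data blocks and the parity block.
lost = 1
survivors = [blk for i, blk in enumerate(data) if i != lost]
recovered = xor_blocks(parity, *survivors)

def small_write(old_data, old_parity, new_data):
    """Read-modify-write: compute the new parity from old data and old parity."""
    return xor_blocks(old_parity, old_data, new_data)

# Updating one block costs two reads (old data, old parity) and two writes.
new_block = b"ZZZZ"
new_parity = small_write(data[0], parity, new_block)
data[0] = new_block

def parity_disk(stripe, n_disks):
    """Assumed RAID-5 rotation: the parity block moves one disk per stripe."""
    return (n_disks - 1 - stripe) % n_disks
```

With a dedicated parity disk (RAID-4), every small write touches that one disk. With the rotation in `parity_disk`, consecutive stripes place parity on different disks, which is how distributed parity spreads the small-write load.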

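The careful and stable operations described for a single disk can be sketched as follows. This is a toy model under stated assumptions: `Page` is a hypothetical in-memory page whose `soft_errors` counter simulates transient read failures and whose `decayed` flag simulates a bad sector. The retry limit of 3 is arbitrary, and the comparison of the data read back in `careful_write` is an added assumption so that a null write (status good, old contents) is also caught.

```python
GOOD, BAD = "good", "bad"

class Page:
    """Toy page; not a real disk interface."""
    def __init__(self, data=None, soft_errors=0, decayed=False):
        self.data = data
        self.soft_errors = soft_errors  # transient read failures still pending
        self.decayed = decayed          # True models a decay event (bad sector)

    def read(self):
        if self.decayed:
            return BAD, None
        if self.soft_errors > 0:
            self.soft_errors -= 1
            return BAD, None
        return GOOD, self.data

    def write(self, data):
        if not self.decayed:
            self.data = data

def careful_read(page, max_tries=3):
    """Repeat the read until it returns good or the attempts run out."""
    for _ in range(max_tries):
        status, data = page.read()
        if status == GOOD:
            return status, data
    return BAD, None

def careful_write(page, data, max_tries=3):
    """Write, then read back; retry until the read is good and matches."""
    for _ in range(max_tries):
        page.write(data)
        status, got = page.read()
        if status == GOOD and got == data:
            return True
    return False

def stable_read(pair):
    """Careful read from the first page; fall back to the second if bad."""
    status, data = careful_read(pair[0])
    if status == BAD:
        status, data = careful_read(pair[1])
    return status, data

def stable_write(pair, data):
    """Careful write into each page of the pair, one after the other."""
    results = []
    for page in pair:
        results.append(careful_write(page, data))
    return results
```

For a pair in which the first page has decayed and the second suffers two soft read errors, `stable_write` reports failure on the first page and success on the second, and `stable_read` still returns the data from the second page.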

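The striping tradeoff between throughput and access concurrency can be made concrete with a small sketch. It assumes simple round-robin striping and a hypothetical `stripe_unit` parameter measured in blocks; `locate` and `disks_touched` are illustrative names, not part of any RAID implementation.

```python
def locate(block, n_disks, stripe_unit=1):
    """Map a logical block number to (disk index, block offset on that disk)."""
    unit = block // stripe_unit  # stripe unit that contains this block
    disk = unit % n_disks        # units are dealt out round-robin
    offset = (unit // n_disks) * stripe_unit + block % stripe_unit
    return disk, offset

def disks_touched(first, count, n_disks, stripe_unit=1):
    """Set of disks that a request for `count` consecutive blocks must access."""
    return {locate(b, n_disks, stripe_unit)[0] for b in range(first, first + count)}

# A 4-block request on a 4-disk array:
small_stripe = disks_touched(0, 4, n_disks=4, stripe_unit=1)  # all four disks
large_stripe = disks_touched(0, 4, n_disks=4, stripe_unit=4)  # a single disk
```

With a stripe unit of one block, the request is served by all four disks in parallel, so its throughput is high but no disk is free for another request. With a stripe unit of four blocks, the same request occupies one disk and the other three can serve other requests concurrently.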

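The two numeric claims in the RAID-1 and RAID-2 slides can be checked with a few lines. The disk figures used in the usage note (MTTF of 100,000 hours, MTTR of 10 hours) are illustrative assumptions, not values from the slides. `hamming_check_disks` assumes a single-error-correcting Hamming code, for which the number of check disks r is the smallest value satisfying 2^r >= data disks + r + 1.

```python
def mttf_raid1(mttf_disk, mttr):
    """MTTF of a mirrored pair: (MTTF / 2) * (MTTF / MTTR), in the same time unit."""
    return (mttf_disk / 2) * (mttf_disk / mttr)

def hamming_check_disks(data_disks):
    """Smallest r such that 2**r >= data_disks + r + 1."""
    r = 0
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r

pair_mttf_hours = mttf_raid1(100_000, 10)  # assumed single-disk figures
check_disks = hamming_check_disks(4)       # the 4-disk example from the RAID-2 slide
```

Under these assumed figures the mirrored pair has an MTTF of 500,000,000 hours, and the 4-disk case yields 3 check disks, matching the RAID-2 example. The check-disk count grows logarithmically: 11 data disks need only 4.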