Hard drive failures analyzed

Posted on Saturday, Feb 24 2007 @ 03:45 CET by Thomas De Maesschalck
Bianca Schroeder of CMU's Parallel Data Lab published a paper about HDD failures in the real world. For her research she looked at 100,000 drives.

Some of her findings:
  • Almost no difference in replacement rates between SCSI, FC and SATA HDDs.
  • Failure rate is not constant with age. Rather than a significant infant mortality effect, we see a significant early onset of wear-out degradation.
  • Vendor MTBF reliability figures are mostly wrong. While the datasheet AFRs (annual failure rates) are between 0.58% and 0.88%, the observed ARRs (annual replacement rates) range from 0.5% to as high as 13.5%. That is, depending on the dataset and drive type, the observed ARRs are up to a factor of 15 higher than the datasheet AFRs. Most commonly, the observed ARR values are in the 3% range.
  • Drives rated at a 1 million hour MTBF show an effective MTBF of only about 300,000 hours in the field, which is still plenty.
  • HDD replacement rates don't enter steady state after the first year of operation. They seem to steadily increase over time.
  • One drive failure in an array means a much higher likelihood of another drive failure. The longer it has been since the last failure, the longer the expected time until the next one.
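
To see how the MTBF and percentage figures above relate, here is a rough back-of-the-envelope sketch in Python. It assumes the common simplification that the annual failure rate is roughly the number of hours in a year divided by the MTBF, which only holds for small rates and a constant failure rate (something the findings above argue is not true in practice). The function names are made up for illustration, and the 1.5 million hour MTBF is an assumed value chosen because it matches the lower 0.58% datasheet figure.

```python
# Rough conversion between MTBF (hours) and an annual failure/replacement rate.
# Assumption: rate ~= hours_per_year / MTBF, valid only for small rates and a
# constant failure rate. Names below are illustrative, not from the paper.

HOURS_PER_YEAR = 24 * 365  # 8760, assuming drives run around the clock


def afr_from_mtbf(mtbf_hours: float) -> float:
    """Approximate annual failure rate (as a fraction) implied by an MTBF."""
    return HOURS_PER_YEAR / mtbf_hours


def mtbf_from_rate(annual_rate: float) -> float:
    """Approximate MTBF in hours implied by an annual rate (as a fraction)."""
    return HOURS_PER_YEAR / annual_rate


# Datasheet side: a 1,000,000 hour MTBF, plus an assumed 1,500,000 hour MTBF.
print(f"{afr_from_mtbf(1_000_000):.2%}")  # 0.88%
print(f"{afr_from_mtbf(1_500_000):.2%}")  # 0.58%

# Field side: an observed replacement rate of about 3% per year.
print(f"{mtbf_from_rate(0.03):,.0f} hours")  # 292,000 hours

# Worst observed rate (13.5%) against the highest datasheet rate (0.88%).
worst_case_factor = 0.135 / 0.0088
print(f"about {worst_case_factor:.0f}x")  # about 15x
```

Under these assumptions, a 3% annual replacement rate works out to roughly 300,000 hours, which is where the effective MTBF figure in the list comes from.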

