Bianca Schroeder of CMU's Parallel Data Lab published a paper about HDD failures in the real world. For her research she looked at 100,000 drives.
Some of the things she found:
Almost no difference in replacement rates between SCSI, FC and SATA HDDs.
Failure rate is not constant with age. Rather than a significant infant mortality effect, we see a significant early onset of wear-out degradation.
Vendor MTBF reliability figures are mostly wrong. Datasheet AFRs (annual failure rates) are between 0.58% and 0.88%, but the observed ARRs (annual replacement rates) range from 0.5% to as high as 13.5%. That is, depending on dataset and drive type, the observed ARRs are up to a factor of 15 higher than the datasheet AFRs. Most commonly, the observed ARR values are in the 3% range.
An ARR of about 3% corresponds to an MTBF of only about 300,000 hours rather than the 1 million hours on the datasheet, which is still plenty.
HDD replacement rates don't enter steady state after the first year of operation. They seem to steadily increase over time.
One drive failure in an array means a much higher likelihood of another drive failure soon after. The longer it has been since the last failure, the longer the expected time until the next one.
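
The datasheet and observed figures above can be related with some simple arithmetic. The following is a rough sketch, assuming the usual approximation that the annual rate equals the hours in a year divided by the MTBF (which treats the failure rate as constant, something the findings above suggest is not really true). The function names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 power-on hours, assuming 24/7 operation


def annual_rate_from_mtbf(mtbf_hours):
    """Approximate annual failure rate implied by a datasheet MTBF."""
    return HOURS_PER_YEAR / mtbf_hours


def mtbf_from_annual_rate(annual_rate):
    """Approximate MTBF (in hours) implied by an annual failure/replacement rate."""
    return HOURS_PER_YEAR / annual_rate


# Datasheet MTBFs of 1,000,000 and 1,500,000 hours give the quoted AFR range.
print(f"{annual_rate_from_mtbf(1_000_000):.2%}")  # 0.88%
print(f"{annual_rate_from_mtbf(1_500_000):.2%}")  # 0.58%

# An observed ARR of about 3% implies a much shorter effective MTBF.
print(f"{mtbf_from_annual_rate(0.03):,.0f} hours")  # 292,000 hours
```

So the 0.58% to 0.88% datasheet range is just 1.5 million and 1 million hours of MTBF expressed per year, and the typical 3% ARR is where the roughly 300,000 hour figure comes from.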