Online storage firm Backblaze has released a new report investigating which hard drive SMART metrics can be used to predict HDD failures. The company says SMART stats are inconsistent from hard drive to hard drive, but it found five SMART metrics that indicate impending disk drive failure. You can check it out over here.
A drive is considered to have stopped working when the drive appears physically dead (e.g. won’t power up), doesn’t respond to console commands or the RAID system tells us that the drive can’t be read or written.
To determine if a drive is going to fail soon we use SMART statistics as evidence to remove a drive before it fails catastrophically or impedes the operation of the Storage Pod volume.
From experience, we have found the following 5 SMART metrics indicate impending disk drive failure:
SMART 5 – Reallocated_Sector_Count.
SMART 187 – Reported_Uncorrectable_Errors.
SMART 188 – Command_Timeout.
SMART 197 – Current_Pending_Sector_Count.
SMART 198 – Offline_Uncorrectable.
We chose these 5 stats based on our experience and input from others in the industry because they are consistent across manufacturers and they are good predictors of failure.
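As a rough illustration of how the five attributes above could be watched, here is a minimal Python sketch. It assumes the raw SMART values have already been read into a dictionary keyed by attribute ID (for example by parsing the output of a SMART tool), and it assumes that a raw value above zero is worth flagging. Neither the input format nor the threshold comes from the report; the function name `flag_attributes` and the `threshold` parameter are hypothetical.

```python
# The five SMART attributes named in the report, keyed by attribute ID.
WATCHED_ATTRIBUTES = {
    5: "Reallocated_Sector_Count",
    187: "Reported_Uncorrectable_Errors",
    188: "Command_Timeout",
    197: "Current_Pending_Sector_Count",
    198: "Offline_Uncorrectable",
}


def flag_attributes(raw_values, threshold=0):
    """Return the watched attributes whose raw value exceeds `threshold`.

    `raw_values` maps SMART attribute ID -> raw value (an int).
    The default threshold of 0 is an assumption for illustration,
    not a rule stated in the report.
    """
    flagged = {}
    for attr_id, name in WATCHED_ATTRIBUTES.items():
        value = raw_values.get(attr_id)
        # Attributes a drive does not report are skipped, not flagged.
        if value is not None and value > threshold:
            flagged[name] = value
    return flagged


if __name__ == "__main__":
    # Made-up readings for one drive; 194 (temperature) is ignored.
    sample = {5: 0, 187: 3, 188: 0, 197: 8, 194: 31}
    print(flag_attributes(sample))
```

Missing attributes are skipped because, as the report notes, SMART stats are inconsistent from drive to drive, so not every model reports all five.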
Backblaze found that other factors, such as drive temperature, had no impact on a drive's failure rate.