Big scary Raw SMART value, is my drive dying?!

Seagate SMART RAW value calculator

Convert Seagate SMART RAW values for attributes:
1 Raw_Read_Error_Rate
7 Seek_Error_Rate
195 ECC_On_the_Fly_Count
to actual errors. These values are known to display huge numbers, often leading to unnecessary worry or even panic. However, the RAW value packs both an event count and an error count: in other words, how many reads (or seeks) took place and how many errors occurred during them. Because a hard drive performs a great many read operations, the RAW SMART value increases rapidly, so it’s quite normal to see huge numbers here. The calculator outputs an event counter (number of reads or seeks) and the number of errors.

Handles both decimal and hex values (hex values preceded by 0x, example 0xFE45E3).
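As a rough illustration of that input handling, here is a minimal Python sketch. The function name parse_raw is hypothetical, and this is not the calculator's actual code; it only shows one way to accept both notations.

```python
def parse_raw(text: str) -> int:
    """Parse a RAW value given as decimal ("359872048") or hex ("0xFE45E3").

    Hypothetical helper for illustration; not the calculator's own code.
    """
    text = text.strip()
    if text.lower().startswith("0x"):
        # Hex input: int() with base 16 accepts the 0x prefix.
        return int(text, 16)
    # Anything else is treated as plain decimal.
    return int(text, 10)


print(parse_raw("359872048"))
print(hex(parse_raw("0xFE45E3")))
```

Splitting the prefix check from the conversion keeps decimal inputs with leading zeros from being rejected, which is what would happen with automatic base detection.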

The other week I noticed this post on superuser.com:

4295032833, now that is a big scary number! That’s a lot of timeout errors!

Does this mean this drive is bad or even dying? Then how come the normalized values seem to indicate that nothing’s wrong? Some will answer that last question by telling you that you shouldn’t rely on normalized values but on RAW values instead. So should we go with the big scary number?

No. Usually when we see huge numbers we are looking at a vendor-specific RAW value, and rather than interpreting it as a single number, we are actually dealing with several values that we need to ‘break up’. Sometimes hard drive manufacturers provide documentation that can help us do this. In this case we’re dealing with a Seagate drive and this document provides further info: http://t1.daumcdn.net/brunch/service/user/axm/file/zRYOdwPu3OMoKYmBOby1fEEQEbU.pdf.

What does this big scary number mean?

From the document we learn:

3.11 Attribute ID 188: Command Timeout Count
Normalized Command Timeout Count = 100 – Command Timeout Count .

This attribute tracks the number of command time outs as defined by an active command being interrupted by a HRESET and COMRESET or SRST or another command
The normalized value is only computed when the number of commands is in the range 10^3 to 10^4. The CommandCount and ErroCount are cleared when Number Of Commands reaches 10^4. The error count used to compute normalized value is not reported in attribute Raw value. It is reported in vendor info area of Attribute sector, bytes 474:475. If Command Timeout Count is > 99, normalize value of 1 is reported. The initial Worst Value is set to 0xFD as a special case.

Raw Usage
Raw [1 – 0] = Total # of command timeouts, with Max hold of FFFFh
Raw [3 – 2] = Total # of commands with > 5 second completion, including those > 7.5 seconds
Raw [5 – 4] = Total # of commands with > 7.5 second completion

Decoding the huge Raw SMART value

Okay, we need that last section “Raw usage” to interpret our big scary number. We see it is not one value.

Our big scary number is 4295032833. We need to break it into three parts, and to do that we first convert it to hex: 0x100010001.

It may just be me, but that looks less scary already!

Padded to the full 48 bits this is 0x000100010001, which we break into 3 separate 16-bit word values: 0x0001, 0x0001 and 0x0001. Using the Seagate document I then decode this as one command timeout, one command that took > 5 seconds to complete, and one that took longer than 7.5 seconds to complete. Since the > 5 second count includes commands that took longer than 7.5 seconds, this is most likely a single event: we had one error that took longer than 7.5 seconds to complete.
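The same split can be sketched in a few lines of Python. This is only an illustration of the “Raw Usage” layout quoted above; the function and key names are made up for this example.

```python
def decode_command_timeout(raw: int) -> dict:
    """Split a Seagate attribute 188 RAW value into its three 16-bit words.

    Layout taken from the "Raw Usage" section quoted above;
    the function and key names are hypothetical.
    """
    return {
        "timeouts": raw & 0xFFFF,           # Raw [1 - 0]
        "over_5s": (raw >> 16) & 0xFFFF,    # Raw [3 - 2]
        "over_7_5s": (raw >> 32) & 0xFFFF,  # Raw [5 - 4]
    }


print(hex(4295032833))                     # 0x100010001
print(decode_command_timeout(4295032833))  # {'timeouts': 1, 'over_5s': 1, 'over_7_5s': 1}
```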

Summarizing, the big scary decimal Raw SMART value 4295032833 tells us that one command timeout occurred, on a command that took longer than 7.5 seconds to complete. I would not worry about that too much; it could be due to someone bumping into the desk the computer was on.

Another example:

Attribute ID 7: Seek Error Rate
Monitor seeks requiring one or more retries. Exclude calibration seeks and seeks in system area. Normalized Seek Error Rate = 10 * log10(SeekCount / SeekErrors) which is only updated when SeekCount is in the range 10^6 to 10^9. The counts are cleared when SeekCount = 10^9. (Evaluates to a value from 1 to 100).
Raw Usage
Raw [3 – 0] = Number of seeks
Raw [5 – 4] = Number of seek errors
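To get a feel for the normalization formula in that quote, here is a small Python sketch. It is an assumption-laden illustration, not Seagate's firmware logic: the function name is hypothetical, the update window for SeekCount is not modeled, and the zero-error case (which the quoted formula leaves undefined) simply returns None.

```python
import math
from typing import Optional


def normalized_seek_error_rate(seek_count: int, seek_errors: int) -> Optional[float]:
    """Evaluate 10 * log10(SeekCount / SeekErrors), clamped to 1..100.

    Hypothetical helper. Returns None when the ratio is undefined
    (no errors or no seeks); how the drive handles that case is
    not stated in the quoted document.
    """
    if seek_errors == 0 or seek_count == 0:
        return None
    value = 10 * math.log10(seek_count / seek_errors)
    # The document says the result evaluates to a value from 1 to 100.
    return max(1.0, min(100.0, value))


# One error per million seeks works out to roughly 60.
print(normalized_seek_error_rate(10**6, 1))
```

Because the scale is logarithmic, every tenfold improvement in seeks per error adds 10 points to the normalized value.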

The raw value of the SMART attribute occupies 48 bits. Seagate’s Seek Error Rate attribute consists of two parts: a 16-bit count of seek errors in the uppermost 4 nibbles, and a 32-bit count of seeks in the lowermost 8 nibbles.

Now assume we see:

ID                     Current  Worst  Threshold  RAW        Status
(07) Seek Error Rate   85       60     30         359872048  Ok

We get: 359872048 = 0x000015733630. The upper 16 bits (0x0000) give 0 seek errors and the lower 32 bits (0x15733630) give 359872048 seeks. So no errors.
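A minimal Python sketch of this decoding, following the Raw Usage layout for attribute 7 quoted above (the function name is hypothetical, and the second input is a made-up value purely to show a non-zero error count):

```python
def decode_seek_error_rate(raw: int) -> tuple:
    """Split a Seagate attribute 7 RAW value into (seeks, seek_errors).

    Raw [3 - 0] = number of seeks (lower 32 bits),
    Raw [5 - 4] = number of seek errors (upper 16 bits).
    """
    seeks = raw & 0xFFFFFFFF
    seek_errors = (raw >> 32) & 0xFFFF
    return seeks, seek_errors


# The value from the example above: no seek errors.
print(decode_seek_error_rate(359872048))       # (359872048, 0)

# A made-up value for illustration: 2 errors in 100 seeks.
print(decode_seek_error_rate(0x000200000064))  # (100, 2)
```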

5 thoughts on “Big scary Raw S.M.A.R.T. values aren’t always bad news!”

SWEETGOOD May 13, 2024

Thanks a lot for the detailed explanation in this article which motivated me to write a shell script which automatically parses and checks the return values with monit for all my SEAGATE HDDs.

Anyone who is interested can find the script here: https://codeberg.org/SWEETGOOD/shell-scripts#parse-raw-smart-values-seagate-sh

Ciao 😎
