TBW from SSDs with S.M.A.R.T Values in ESXi

  • May 23, 2016
Online Marketing Manager at StarWind. In touch with virtualization world, may know stuff you are interested in.

Solid-state drives are now widely used in ESXi hosts for caching (vFlash Read Cache, PernixData FVP), Virtual SAN, or plain datastores. Unfortunately, each SSD cell tolerates only a limited number of write cycles, ranging from about 1,000 in consumer TLC SSDs up to 100,000 in enterprise SLC-based SSDs. The lifetime can be estimated from the TBW (terabytes written) rating that the vendor provides in the specification. It describes how many terabytes can be written to the entire device before the warranty expires.

Because VMware does not provide a convenient way to read raw S.M.A.R.T values on ESXi hosts, smartctl, which is part of smartmontools, has been ported to ESXi.

Below is an example of what an ESXi host reports without smartctl. The device analyzed is a Samsung SSD 850 EVO M.2 250GB used as a local datastore. The warranty for this SSD covers 75 TBW.

ESXCLI can display S.M.A.R.T stats with:
esxcli storage core device smart get -d [device]

# esxcli storage core device smart get -d t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       N/A    N/A        N/A
Write Error Count             N/A    N/A        N/A
Read Error Count              N/A    N/A        N/A
Power-on Hours                99     0          99
Power Cycle Count             99     0          99
Reallocated Sector Count      100    10         100
Raw Read Error Rate           N/A    N/A        N/A
Drive Temperature             N/A    N/A        N/A
Driver Rated Max Temperature  49     0          34
Write Sectors TOT Count       100    0          100
Read Sectors TOT Count        N/A    N/A        N/A
Initial Bad Block Count       N/A    N/A        N/A

The next listing shows the device statistics that ESXCLI provides, which are a bit more verbose.

# esxcli storage core device stats get -d t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
   Device: t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
   Successful Commands: 93483233
   Blocks Read: 205579211
   Blocks Written: 2123298938
   Read Operations: 3240880
   Write Operations: 90144369
   Reserve Operations: 39107
   Reservation Conflicts: 0
   Failed Commands: 22
   Failed Blocks Read: 0
   Failed Blocks Written: 0
   Failed Read Operations: 0
   Failed Write Operations: 0
   Failed Reserve Operations: 0

ESXi keeps track of all read and write operations to the disk, but the counters get reset when ESXi is rebooted.
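
Even so, the counters give a rough picture of the write volume since the last boot. The following is a minimal Python sketch, assuming the output format shown above and assuming that one block is 512 bytes (this matches the drive's sector size, but the esxcli output itself does not state the block size). The function name `parse_device_stats` is hypothetical and not part of any VMware tool.

```python
def parse_device_stats(text):
    """Collect the numeric 'Key: value' lines from
    `esxcli storage core device stats get` output into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        # Skip the device name lines and anything that is not a plain counter.
        if sep and value.isdigit():
            stats[key] = int(value)
    return stats


# Excerpt of the output shown above.
SAMPLE = """\
t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
   Device: t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
   Successful Commands: 93483233
   Blocks Read: 205579211
   Blocks Written: 2123298938
"""

BLOCK_SIZE = 512  # bytes per block: an assumption, see the note above

stats = parse_device_stats(SAMPLE)
written_tib = stats["Blocks Written"] * BLOCK_SIZE / 1024**4
print(f"{written_tib:.2f} TiB written since boot")
```

Under these assumptions the sample works out to just under 1 TiB written since the last reboot, which is why this counter alone cannot replace the lifetime total reported by the drive.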

And here is the report by smartctl:

# smartctl -d sat --all /dev/disks/t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.0.0] (daily-20160510)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org


=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 850 EVO M.2 250GB
Serial Number:    S24BNXAG805065D
LU WWN Device Id: 5 002538 d404b9f9f
Firmware Version: EMT21B6Q
User Capacity:    250,059,350,016 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed May 16 15:25:26 2016 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
[...]
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       5039
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       35
177 Wear_Leveling_Count     0x0013   094   094   000    Pre-fail  Always       -       122
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   049   034   000    Old_age   Always       -       51
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       26
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       6343034492
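
To pick individual values out of this table in a script, the rows can be split on whitespace. The sketch below is a minimal Python example that assumes the ten-column layout shown above. The function name `parse_smart_attributes` is hypothetical, and raw values are kept as strings because some drives report non-numeric raw values.

```python
def parse_smart_attributes(text):
    """Return {attribute_id: (name, raw_value)} from smartctl's attribute table."""
    attributes = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(fields) >= 10 and fields[0].isdigit():
            attributes[int(fields[0])] = (fields[1], fields[9])
    return attributes


# Excerpt of the smartctl output shown above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       5039
177 Wear_Leveling_Count     0x0013   094   094   000    Pre-fail  Always       -       122
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       6343034492
"""

attrs = parse_smart_attributes(SAMPLE)
name, raw = attrs[241]
print(name, int(raw))
```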

In the SMART attributes section, there is a Total_LBAs_Written value (ID #241). To convert it to terabytes, multiply this value by the sector size (512 bytes) and divide by 1099511627776 (1024^4).

Total_LBAs_Written * Sector Size / 1024^4 = TBW

6343034492 * 512 / 1099511627776 = 2.95 TBW
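
The same conversion as a small Python sketch. The function name `terabytes_written` is hypothetical. It assumes that attribute 241 counts 512-byte LBAs, as on the drive in this article; other drives may report this attribute in a different unit.

```python
def terabytes_written(total_lbas_written, sector_size=512):
    """Convert the raw Total_LBAs_Written value to terabytes (1024^4 bytes)."""
    return total_lbas_written * sector_size / 1024**4


tbw = terabytes_written(6343034492)
print(f"{tbw:.2f} TBW")
```

Dividing by 1024^4 yields binary terabytes (TiB). If the vendor's rating is given in decimal terabytes (10^12 bytes), which is worth checking against the datasheet, the same counter corresponds to about 3.25 TB.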

That gives us about 3 of 75 TBW. The Power_On_Hours parameter (SMART ID #9) tells us that the device has been in use for about 200 days, so at the current write rate we can project that this SSD will stay within its rated TBW for roughly another 13 years.
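
The projection is a simple linear extrapolation, sketched below in Python. The function name `projected_years_remaining` is hypothetical. The sketch assumes that the write rate seen so far continues unchanged and that the drive was written to evenly over its power-on time, neither of which is guaranteed.

```python
def projected_years_remaining(tbw_used, tbw_rated, power_on_hours):
    """Linearly extrapolate how long the remaining rated TBW will last."""
    days_in_use = power_on_hours / 24
    tb_per_day = tbw_used / days_in_use
    days_left = (tbw_rated - tbw_used) / tb_per_day
    return days_left / 365


# Rounded figures as used in the text: 3 TBW after about 200 days.
rounded = projected_years_remaining(3, 75, 200 * 24)
# Unrounded figures from the smartctl report: 2.95 TBW after 5039 hours.
exact = projected_years_remaining(2.95, 75, 5039)
print(f"rounded: {rounded:.1f} years, unrounded: {exact:.1f} years")
```

With the rounded figures the result is a little over 13 years; with the unrounded values it comes out closer to 14. Either way it is an order-of-magnitude estimate, not a prediction of when the drive will fail.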

The smartctl VIB can be downloaded here:

http://www.virten.net/files/smartctl-6.6-4321.x86_64.vib

Note: The use of this VIB is totally unsupported, proceed at your own risk. Tested with ESXi only.

This is a review of an article.

Source: www.virten.net
