Unreliable SSDs

Today my system froze and failed to reboot. I plugged in an Ubuntu live USB stick and booted from it. Then I discovered the problem: my Intel SSD 320 had failed. The output of hdparm is attached at the end of this blog post. You can see that the reported device size has shrunk from 120 GB to 8 MB and the serial number now reads BAD_CTX 00000159. The firmware of the SSD was up to date, and the latest firmware update was supposed to have fixed this 8 MB bug.

The Intel SSD 320 is my second SSD. My first SSD was a Super Talent Ultradrive GX 64 GB, which died after around fifteen months of heavy use and left a lot of my data corrupted. SSDs seem to be very unreliable: both of my SSDs died, yet I can't remember a single one of my HDDs ever failing.

$ sudo hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
	Model Number:       INTEL SSDSA2CW120G3
	Serial Number:      BAD_CTX 00000159
	Firmware Revision:  4PC10362
	Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6
Standards:
	Used: unknown (minor revision code 0x0029)
	Supported: 8 7 6 5
	Likely used: 8
Configuration:
	Logical		max	current
	cylinders	16383	16
	heads		16	16
	sectors/track	63	63
	--
	CHS current addressable sectors:   16128
	LBA    user addressable sectors:   16384
	LBA48  user addressable sectors:   16384
	Logical  Sector size:                   512 bytes
	Physical Sector size:                   512 bytes
	device size with M = 1024*1024:         8 MBytes
	device size with M = 1000*1000:         8 MBytes
	cache/buffer size  = unknown
	Nominal Media Rotation Rate: Solid State Device
Capabilities:
	LBA, IORDY(can be disabled)
	Standby timer values: spec'd by Standard, no device specific minimum
	R/W multiple sector transfer: Max = 16	Current = 16
	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
	     Cycle time: min=120ns recommended=120ns
	PIO: pio0 pio1 pio2 pio3 pio4
	     Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
	Enabled	Supported:
	   	Security Mode feature set
	   *	Power Management feature set
	   *	Write cache
	   *	Look-ahead
	   *	Host Protected Area feature set
	   *	WRITE_BUFFER command
	   *	READ_BUFFER command
	   *	NOP cmd
	   *	DOWNLOAD_MICROCODE
	   	SET_MAX security extension
	   *	48-bit Address feature set
	   *	Device Configuration Overlay feature set
	   *	Mandatory FLUSH_CACHE
	   *	FLUSH_CACHE_EXT
	   *	General Purpose Logging feature set
	   *	WRITE_{DMA|MULTIPLE}_FUA_EXT
	   *	64-bit World wide name
	   *	IDLE_IMMEDIATE with UNLOAD
	   *	WRITE_UNCORRECTABLE_EXT command
	   *	{READ,WRITE}_DMA_EXT_GPL commands
	   *	Segmented DOWNLOAD_MICROCODE
	   *	Gen1 signaling speed (1.5Gb/s)
	   *	Gen2 signaling speed (3.0Gb/s)
	   *	Phy event counters
	   *	Software settings preservation
	   *	SMART Command Transport (SCT) feature set
	   *	SCT LBA Segment Access (AC2)
	   *	SCT Error Recovery Control (AC3)
	   *	SCT Features Control (AC4)
	   *	SCT Data Tables (AC5)
	   *	Data Set Management TRIM supported (limit 8 blocks)
	   *	Deterministic read ZEROs after TRIM
Security:
	Master password revision code = 65534
		supported
	not	enabled
	not	locked
		frozen
	not	expired: security count
		supported: enhanced erase
	2min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 500151795951d4b9
	NAA		: 5
	IEEE OUI	: 001517
	Unique ID	: 95951d4b9
Checksum: correct
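For anyone hit by the same failure, the bug's signature in the output above is easy to spot programmatically: the serial number is replaced by a BAD_CTX marker and the reported capacity collapses to 8 MB. A minimal sketch of such a check (the helper function and the size threshold are my own illustration, not from Intel's tooling):

```python
import re

def has_8mb_bug(hdparm_output: str) -> bool:
    """Heuristically detect the Intel SSD 320 "8 MB bug" in `hdparm -I` output.

    Signature: the serial number field contains BAD_CTX and the reported
    device size has collapsed to a few megabytes.
    """
    bad_serial = re.search(r"Serial Number:\s*BAD_CTX", hdparm_output) is not None
    size = re.search(r"device size with M = 1000\*1000:\s*(\d+)\s*MBytes", hdparm_output)
    tiny = size is not None and int(size.group(1)) <= 8
    return bad_serial and tiny

# The two relevant lines from the output above:
sample = """\
	Serial Number:      BAD_CTX 00000159
	device size with M = 1000*1000:         8 MBytes
"""
print(has_8mb_bug(sample))  # True
```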


18 thoughts on “Unreliable SSDs”

  1. If you’re sure you updated the firmware to include the fixed version, then you should contact Intel; they’ll want to know about this and fix it ASAP.

  2. I’ve so far had the opposite experience.
    This is good to know though, I had assumed that the SSD would always be readable and that only writing would be affected when the ‘sector’ died.

  3. Yes, contact Intel. I have one of their SSDs, and I’ll want another in the future; I’d prefer their firmware be well-debugged under Linux-based desktops by that time.

  4. I’ve had bad experiences with OCZ Vertex2 SSDs (3 died within 4 months), and good experiences with Samsung 470 Series SSDs (no defects yet).

    But since the OCZ disaster I’m always doing regular backups. :)

  5. I’m running an OCZ Vertex SSD in my notebook.
    In SMART there is a field named “Remaining Lifetime”, which is derived from the average erase count per block and the estimated number of erases a block can take before failing (as you might know, flash blocks have a limited number of write cycles). You said you used your SSD quite extensively, so it would be interesting to see those values, if possible.

    • I can’t check the SMART values of the Intel SSD due to the failure (SMART is disabled). The “Remaining Lifetime” field hadn’t changed from its starting value when I last checked it. My first SSD had no spare blocks remaining when it failed.

  6. I hope you had backups. I’ve got an ADATA XM11 128 GB drive and after about 6 months it works fine, but you’re scaring me a little ;)

  7. Try using a different computer and perform low-level diagnostics. I bet the drive is under warranty. Double-check your computer hardware for the real problem.

  8. I’ve used my OCZ Vertex2 since Feb. 2011, and no problems so far (smartctl shows 100% SSD Life Left and 0 Retired Block Count in almost 7000 hours of power-on time).

    Anecdotal evidence is unlikely to give us much insight into this topic. But the only studies I know of are at least 2 years old. Have you seen anything recent? I’d be more interested in studies on units used in the field (i.e. subject to the wear and tear of shock, temperature changes, etc.) than in data centres, but data on either would be nice.
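Several comments above refer to SMART wear metrics such as “Remaining Lifetime” and per-block erase counts. As a rough sketch of how such a percentage can be derived from those two numbers (the rated cycle count of 5000, typical for MLC NAND of that era, is an assumed example value, not a vendor figure):

```python
def remaining_lifetime_pct(avg_erase_count: float, rated_erase_cycles: int = 5000) -> float:
    """Estimate remaining lifetime as the unused fraction of the rated
    program/erase cycles per block.

    `rated_erase_cycles` defaults to 5000, an assumed typical figure for
    MLC NAND, not taken from any vendor datasheet.
    """
    used = min(avg_erase_count / rated_erase_cycles, 1.0)  # clamp at fully worn
    return round(100.0 * (1.0 - used), 1)

print(remaining_lifetime_pct(250))  # 250 of 5000 rated cycles used -> 95.0
```

This also illustrates why a lightly written drive can show an unchanged “Remaining Lifetime”: with a large rated cycle count, the used fraction rounds away to nearly nothing.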

Comments are closed.