
kernel-packages team mailing list archive

[Bug 1092622] Re: disks hang as atasmart is sending illegal smart command to "green" disk

 

The right thing to do here is to stop using read_threshold and find a
suitable replacement. Blacklisting a command to a drive because it does
the right thing when faced with an illegal command is ridiculous.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to libatasmart in Ubuntu.
https://bugs.launchpad.net/bugs/1092622

Title:
  disks hang as atasmart is sending illegal smart command to "green"
  disk

Status in “libatasmart” package in Ubuntu:
  Confirmed

Bug description:
  SUMMARY:

  Model Family:     Western Digital Caviar Green (Adv. Format)
  Device Model:     WDC WD20EARX-32PASB0

  The drive appears to be sensitive to illegal SMART commands.
  The entire drive stalls while handling one, causing a massive interruption
  (HANG) of service until the error is handled.

  SOLUTION:

  Actually check to see if a command is supported before blindly
  sending it to a target.
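
  As a rough illustration of the kind of check meant here, the sketch below
  tests the SMART bits in a drive's IDENTIFY DEVICE data (word 82 bit 0 for
  "supported", word 85 bit 0 for "enabled") before any SMART command is sent.
  The function names and the assumption that the 256 words are already in
  host byte order are mine, not libatasmart's. As far as I know these bits
  describe the SMART feature set as a whole, not individual sub-commands such
  as D1, so this is only a first gate and not a full fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* IDENTIFY DEVICE returns 256 16-bit words (512 bytes). */
#define ATA_ID_WORDS 256

/* Word 82, bit 0: SMART feature set supported. */
static bool ata_id_smart_supported(const uint16_t id[ATA_ID_WORDS]) {
        return (id[82] & 0x0001U) != 0;
}

/* Word 85, bit 0: SMART feature set enabled. */
static bool ata_id_smart_enabled(const uint16_t id[ATA_ID_WORDS]) {
        return (id[85] & 0x0001U) != 0;
}

/*
 * Hypothetical gate (not a libatasmart function): only allow SMART
 * commands when the drive reports the feature set as both supported
 * and enabled. The words are assumed to be in host byte order.
 */
static bool may_send_smart(const uint16_t id[ATA_ID_WORDS]) {
        return ata_id_smart_supported(id) && ata_id_smart_enabled(id);
}
```

  A caller would fetch the IDENTIFY DEVICE block once, pass it to
  may_send_smart(), and skip the SMART path entirely when it returns false.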

  DETAILS:

  Dec 17 14:57:00 goblin kernel: [ 1274.630081] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
  Dec 17 14:57:00 goblin kernel: [ 1274.630087] ata7.00: failed command: SMART
  Dec 17 14:57:00 goblin kernel: [ 1274.630094] ata7.00: cmd b0/d1:01:00:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
  Dec 17 14:57:00 goblin kernel: [ 1274.630094]          res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
  Dec 17 14:57:00 goblin kernel: [ 1274.630096] ata7.00: status: { DRDY }
  Dec 17 14:57:00 goblin kernel: [ 1274.630101] ata7: hard resetting link
  Dec 17 14:57:01 goblin kernel: [ 1275.121558] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
  Dec 17 14:57:01 goblin kernel: [ 1275.137456] ata7.00: configured for UDMA/133
  Dec 17 14:57:01 goblin kernel: [ 1275.137496] ata7: EH complete
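
  To make the "cmd" line in the trace easier to read, here is a small sketch
  that splits the taskfile dump into named fields. The first two fields
  (command b0, feature d1) are confirmed by this report; the names of the
  remaining fields follow my recollection of the libata error-handler print
  format and should be treated as an assumption. The struct and function
  names are made up for this example.

```c
#include <stdio.h>

/* Fields of a libata "cmd"/"res" taskfile dump, in printed order. */
struct taskfile {
        unsigned command;   /* on "res" lines this slot holds the status */
        unsigned feature;   /* on "res" lines this slot holds the error */
        unsigned nsect, lbal, lbam, lbah;
        unsigned hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
        unsigned device;
};

/* Parse "b0/d1:01:...". Returns 0 on success, -1 if the text does not match. */
static int parse_taskfile(const char *s, struct taskfile *tf) {
        int n = sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
                       &tf->command, &tf->feature, &tf->nsect,
                       &tf->lbal, &tf->lbam, &tf->lbah,
                       &tf->hob_feature, &tf->hob_nsect, &tf->hob_lbal,
                       &tf->hob_lbam, &tf->hob_lbah, &tf->device);
        return n == 12 ? 0 : -1;
}

/*
 * SMART is command 0xB0 with the 0x4F/0xC2 signature in LBA mid/high;
 * feature 0xD1 selects READ THRESHOLDS.
 */
static int is_smart_read_thresholds(const struct taskfile *tf) {
        return tf->command == 0xB0 && tf->feature == 0xD1 &&
               tf->lbam == 0x4F && tf->lbah == 0xC2;
}
```

  Feeding it the string from the trace, "b0/d1:01:00:4f:c2/00:00:00:00:00/00",
  identifies the failed command as SMART READ THRESHOLDS.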

  This was traced down to this offending process.

  /usr/lib/udisks2/udisksd

  which is linked to libatasmart

  The actual offending command is SMART (B0) with feature (D1),
  READ THRESHOLDS. D1 is unsupported in recent versions of ATA; I only
  have specs that go back to ATA-6. According to the kernel sources, this
  feature is only used in the legacy IDE stack, so it's very, very old[1].

  What I found is that udisks calls these functions:
  src/udiskslinuxdriveata.c:  if (sk_disk_smart_read_data (d) != 0)
  src/udiskslinuxdriveata.c:                   "sk_disk_smart_read_data: %m");
  src/udiskslinuxdriveata.c:  if (sk_disk_smart_status (d, &good) != 0)
  src/udiskslinuxdriveata.c:                   "sk_disk_smart_status: %m");
  src/udiskslinuxdriveata.c:  if (sk_disk_smart_self_test (d, test) != 0)
  src/udiskslinuxdriveata.c:                   "sk_disk_smart_self_test: %m");

  which all unconditionally call a static function in libatasmart called
  "init_smart", which in turn unconditionally calls "disk_smart_read_thresholds".

  static int disk_smart_read_thresholds(SkDisk *d) {
          uint16_t cmd[6];
          int ret;
          size_t len = 512;
  ...
          cmd[0] = htons(SK_SMART_COMMAND_READ_THRESHOLDS);
          cmd[1] = htons(1);
          cmd[2] = htons(0x0000U);
          cmd[3] = htons(0x00C2U);
          cmd[4] = htons(0x4F00U);
  }
  ...
  typedef enum SkSmartCommand {
          SK_SMART_COMMAND_READ_DATA = 0xD0,
          SK_SMART_COMMAND_READ_THRESHOLDS = 0xD1,

  Which is our offending feature.
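
  To connect the excerpt above to the kernel trace, the sketch below packs
  the same five words with htons() and exposes the resulting bytes. It only
  demonstrates the byte order produced by htons(); zeroing the rest of the
  array and the helper name pack_read_thresholds are my assumptions, since
  the elided part of the original function is not shown here.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill the 6-word command array the way the
 * excerpt does and copy it out as raw bytes for inspection.
 */
static void pack_read_thresholds(uint8_t out[12]) {
        uint16_t cmd[6];

        memset(cmd, 0, sizeof(cmd));      /* assumption: unused words are zero */
        cmd[0] = htons(0x00D1U);          /* SK_SMART_COMMAND_READ_THRESHOLDS */
        cmd[1] = htons(1);
        cmd[2] = htons(0x0000U);
        cmd[3] = htons(0x00C2U);
        cmd[4] = htons(0x4F00U);

        memcpy(out, cmd, sizeof(cmd));
}
```

  Because htons() stores the high byte first, the bytes D1, 01, C2 and 4F
  land at offsets 1, 3, 7 and 8, which are the same values that appear in
  the "cmd b0/d1:01:00:4f:c2" line of the trace.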

  So anything that calls:

  int sk_disk_smart_read_data(SkDisk *d) {
  int sk_disk_smart_status(SkDisk *d, SkBool *good) {
  int sk_disk_smart_self_test(SkDisk *d, SkSmartSelfTest test) {

  is impacted by this.

  REPRODUCTION:

  Below is a small C program that triggers a read_threshold on demand. It
  should produce a syslog trace and a HARD RESET of the link while the drive
  recovers.

  # sudo apt-get install -y libatasmart-dev
  # gcc -o read_thresh_test skreadthreshold.c -latasmart
  # sudo ./read_thresh_test /dev/sdX, where X is suspect device

  /*
   * Author: Peter M. Petrakis <peter.petrakis@xxxxxxxxxxxxx>
   *
   * skreadthreshold.c
   *
   * Trigger HARD RESET by sending the illegal SMART command
   * READ THRESHOLDS to modern, standards-compliant disks
   *
   * # sudo apt-get install -y libatasmart-dev
   * # gcc -o read_thresh_test skreadthreshold.c -latasmart
   * # sudo ./read_thresh_test /dev/sdX, where X is suspect device
   */

  #include <string.h>
  #include <errno.h>
  #include <stdio.h>

  #include "atasmart.h"

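  int main(int argc, char *argv[]) {
          int ret;
          const char *device;
          SkDisk *d;

          /* Require the device path so argv[1] is never NULL */
          if (argc != 2) {
                  fprintf(stderr, "Usage: %s /dev/sdX\n", argv[0]);
                  return 1;
          }

          device = argv[1];

          if ((ret = sk_disk_open(device, &d)) < 0) {
                  fprintf(stderr, "Failed to open disk %s: %s\n", device, strerror(errno));
                  return 1;
          }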

          /*
           * disk_smart_read_thresholds is static and called
           * by init_smart unconditionally
           */
          ret = sk_disk_smart_read_data(d);
          fprintf(stdout, "read threshold returned: %d\n", ret);

          sk_disk_free(d);

          return 0;
  }

  [1] drivers/ide/ide-disk_proc.c:
      return __idedisk_proc_show(m, m->private, ATA_SMART_READ_THRESHOLDS);

  ProblemType: Bug
  DistroRelease: Ubuntu 12.10
  Package: libatasmart4 0.19-1git1
  ProcVersionSignature: Ubuntu 3.5.0-21.32-generic 3.5.7.1
  Uname: Linux 3.5.0-21-generic x86_64
  ApportVersion: 2.6.1-0ubuntu9
  Architecture: amd64
  Date: Thu Dec 20 11:59:08 2012
  InstallationDate: Installed on 2012-09-28 (83 days ago)
  InstallationMedia: Lubuntu 12.10 "Quantal Quetzal" - Beta amd64 (20120926)
  MarkForUpload: True
  ProcEnviron:
   LANGUAGE=en_US:en
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/usr/bin/zsh
  SourcePackage: libatasmart
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libatasmart/+bug/1092622/+subscriptions