kernel-packages team mailing list archive

[Bug 1469829] Re: ppc64el should use 'deadline' as default io scheduler

To: kernel-packages@xxxxxxxxxxxxxxxxxxx
From: Launchpad Bug Tracker <1469829@xxxxxxxxxxxxxxxxxx>
Date: Mon, 28 Sep 2015 15:47:08 -0000
Reply-to: Bug 1469829 <1469829@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

This bug was fixed in the package linux - 3.13.0-65.105

---------------
linux (3.13.0-65.105) trusty; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1498108

  [ Upstream Kernel Changes ]

  * net: Fix skb_set_peeked use-after-free bug
    - LP: #1497184

linux (3.13.0-64.104) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1493803

  [ Chris J Arges ]

  * [Config] DEFAULT_IOSCHED="deadline" for ppc64el
    - LP: #1469829

  [ Upstream Kernel Changes ]

  * tcp: fix recv with flags MSG_WAITALL | MSG_PEEK
    - LP: #1486146
  * libceph: abstract out ceph_osd_request enqueue logic
    - LP: #1488035
  * libceph: resend lingering requests with a new tid
    - LP: #1488035
  * n_tty: Refactor input_available_p() by call site
    - LP: #1397976
  * tty: Fix pty master poll() after slave closes v2
    - LP: #1397976
  * md: use kzalloc() when bitmap is disabled
    - LP: #1493305
  * ata: pmp: add quirk for Marvell 4140 SATA PMP
    - LP: #1493305
  * libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk
    VB0250EAVER
    - LP: #1493305
  * libata: add ATA_HORKAGE_NOTRIM
    - LP: #1493305
  * libata: force disable trim for SuperSSpeed S238
    - LP: #1493305
  * libata: increase the timeout when setting transfer mode
    - LP: #1493305
  * libata: Do not blacklist M510DC
    - LP: #1493305
  * mac80211: clear subdir_stations when removing debugfs
    - LP: #1493305
  * ALSA: hda - Add new GPU codec ID 0x10de007d to snd-hda
    - LP: #1493305
  * drm: Stop resetting connector state to unknown
    - LP: #1493305
  * usb: dwc3: Reset the transfer resource index on SET_INTERFACE
    - LP: #1493305
  * usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init()
    function
    - LP: #1493305
  * xhci: Calculate old endpoints correctly on device reset
    - LP: #1493305
  * xhci: report U3 when link is in resume state
    - LP: #1493305
  * xhci: prevent bus_suspend if SS port resuming in phase 1
    - LP: #1493305
  * xhci: do not report PLC when link is in internal resume state
    - LP: #1493305
  * USB: OHCI: Fix race between ED unlink and URB submission
    - LP: #1493305
  * usb-storage: ignore ZTE MF 823 card reader in mode 0x1225
    - LP: #1493305
  * blkcg: fix gendisk reference leak in blkg_conf_prep()
    - LP: #1493305
  * tile: use free_bootmem_late() for initrd
    - LP: #1493305
  * Input: usbtouchscreen - avoid unresponsive TSC-30 touch screen
    - LP: #1493305
  * md/raid1: fix test for 'was read error from last working device'.
    - LP: #1493305
  * mmc: omap_hsmmc: Fix DTO and DCRC handling
    - LP: #1493305
  * isdn/gigaset: reset tty->receive_room when attaching ser_gigaset
    - LP: #1493305
  * mmc: sdhci-pxav3: fix platform_data is not initialized
    - LP: #1493305
  * mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()
    - LP: #1493305
  * mmc: sdhci-esdhc: Make 8BIT bus work
    - LP: #1493305
  * bonding: correctly handle bonding type change on enslave failure
    - LP: #1493305
  * net: Clone skb before setting peeked flag
    - LP: #1493305
  * bridge: mdb: fix double add notification
    - LP: #1493305
  * usb: gadget: mv_udc_core: fix phy_regs I/O memory leak
    - LP: #1493305
  * inet: frags: fix defragmented packet's IP header for af_packet
    - LP: #1493305
  * bonding: fix destruction of bond with devices different from
    arphrd_ether
    - LP: #1493305
  * ARM: OMAP2+: hwmod: Fix _wait_target_ready() for hwmods without sysc
    - LP: #1493305
  * ASoC: pcm1681: Fix setting de-emphasis sampling rate selection
    - LP: #1493305
  * iscsi-target: Fix use-after-free during TPG session shutdown
    - LP: #1493305
  * iscsi-target: Fix iscsit_start_kthreads failure OOPs
    - LP: #1493305
  * iscsi-target: Fix iser explicit logout TX kthread leak
    - LP: #1493305
  * ALSA: hda - Apply fixup for another Toshiba Satellite S50D
    - LP: #1493305
  * vhost: actually track log eventfd file
    - LP: #1493305
  * xfs: remote attributes need to be considered data
    - LP: #1493305
  * ALSA: usb-audio: add dB range mapping for some devices
    - LP: #1493305
  * drm/radeon/combios: add some validation of lvds values
    - LP: #1493305
  * x86/efi: Use all 64 bit of efi_memmap in setup_e820()
    - LP: #1493305
  * ipr: Fix locking for unit attention handling
    - LP: #1493305
  * ipr: Fix incorrect trace indexing
    - LP: #1493305
  * ipr: Fix invalid array indexing for HRRQ
    - LP: #1493305
  * ALSA: hda - Fix MacBook Pro 5,2 quirk
    - LP: #1493305
  * x86/xen: Probe target addresses in set_aliased_prot() before the
    hypercall
    - LP: #1493305
  * netfilter: ctnetlink: put back references to master ct and expect
    objects
    - LP: #1493305
  * bridge: mdb: fix delmdb state in the notification
    - LP: #1493305
  * ipvs: fix crash with sync protocol v0 and FTP
    - LP: #1493305
  * act_pedit: check binding before calling tcf_hash_release()
    - LP: #1493305
  * netfilter: nf_conntrack: Support expectations in different zones
    - LP: #1493305
  * ipvs: do not use random local source address for tunnels
    - LP: #1493305
  * ALSA: hda - fix cs4210_spdif_automute()
    - LP: #1493305
  * niu: don't count tx error twice in case of headroom realloc fails
    - LP: #1493305
  * net/mlx4_core: Fix wrong index in propagating port change event to VFs
    - LP: #1493305
  * ipvs: fix crash if scheduler is changed
    - LP: #1493305
  * Linux 3.13.11-ckt26
    - LP: #1493305

 -- Brad Figg <brad.figg@xxxxxxxxxxxxx>  Mon, 21 Sep 2015 10:16:41 -0700

** Changed in: linux (Ubuntu Trusty)
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1469829

Title:
  ppc64el should use 'deadline' as default io scheduler

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Fix Released
Status in linux-lts-utopic source package in Trusty:
  Fix Committed
Status in linux source package in Utopic:
  Won't Fix
Status in linux source package in Vivid:
  Fix Committed

Bug description:
  [Impact]
  Using cfq rather than deadline as the default I/O scheduler starves reads under certain heavy I/O workloads and causes performance problems. In addition, every other architecture we build already uses deadline as the default scheduler.

  [Fix]
  Change the configuration to the following for ppc64el:
  CONFIG_DEFAULT_DEADLINE=y
  CONFIG_DEFAULT_IOSCHED="deadline"
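
  One way to confirm that an installed kernel was built with this setting is to read the option back from its config file. This is a minimal sketch, not part of the original fix: the function name default_iosched is hypothetical, and it assumes the config is installed as /boot/config-<release>, as Ubuntu kernel packages do.

```shell
# Print the default I/O scheduler recorded in a kernel config file.
# Usage: default_iosched [config-file]
# Assumption: without an argument, the config of the running kernel is
# read from /boot/config-$(uname -r).
default_iosched() {
    config_file="${1:-/boot/config-$(uname -r)}"
    # Match a line such as CONFIG_DEFAULT_IOSCHED="deadline" and print
    # only the quoted value.
    sed -n 's/^CONFIG_DEFAULT_IOSCHED="\(.*\)"$/\1/p' "$config_file"
}
```

  Calling default_iosched with no argument on a fixed ppc64el kernel should print the scheduler name from the config line above.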

  [Test Case]
  Boot, then run cat /sys/block/*/queue/scheduler and check that deadline is the active scheduler (the active entry is the one shown in square brackets).
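
  The test case can also be scripted. The scheduler file lists the available schedulers on one line and marks the active one with square brackets, for example "noop deadline [cfq]". The sketch below extracts that entry per device; the function names and the optional sysfs root argument are hypothetical and exist only so the check can be tried against a scratch directory.

```shell
# Print the active scheduler from one line of a queue/scheduler file.
# The active entry is the one in square brackets, e.g. "noop deadline [cfq]".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Report the active scheduler for every block device under a sysfs root.
# Usage: report_schedulers [sysfs-root]   (defaults to /sys)
report_schedulers() {
    sysfs_root="${1:-/sys}"
    for f in "$sysfs_root"/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        # Reduce <root>/block/<dev>/queue/scheduler to <dev>.
        dev=${f#"$sysfs_root"/block/}
        dev=${dev%%/*}
        printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
    done
}
```

  On a kernel with the fix, report_schedulers is expected to print "deadline" for each disk.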

  --

  -- Problem Description --

  The Firestone system given to the DASD group failed the HTX overnight test with a miscompare error.
  HTX mdt.hdbuster was running on the secondary drive and failed about 12 hours into the test.

  HTX miscompare analysis:
  ========================

  Device under test: /dev/sdb
  Stanza running: rule_3
  miscompare offset: 0x40
  Transfer size: Random Size
  LBA number: 0x70fc
  miscompare length: all the blocks in the transfer size

  *- STANZA 3: Creates number of threads twice the queue depth. Each thread  -*
  *- doing 20000 num_oper with RC operation with xfer size between 1 block   -*
  *- to 256K.                                                                -*

  This miscompare shows that the read operation did not return the
  expected data from the disk. The re-read buffer shows the same data as
  the first read. Since the first read and the re-read return the same
  data, a write operation may have failed on the disk (the write from the
  previous rule stanza, which initializes the disk with pattern 007). The
  same miscompare behavior appears for all the blocks in the transfer size.

  /dev/sdb          Jun  2 02:29:43 2015 err=000003b6 sev=2 hxestorage      <<===  device name (/dev/sdb)
  rule_3_13  numopers=     20000  loop=       767  blk=0x70fc len=89088
   min_blkno=0 max_blkno=0x74706daf, RANDOM access
  Seed Values= 37303, 290, 23235
  Data Pattern Seed Values = 37303, 291, 23235
  BWRC LBA fencepost Detail:
  th_num                min_lba                  max_lba      status
       0                 0            1c9be3ff    R
       1          1d1c1b6c            3a3836d7    F
       2          3a3836d8            57545243    F
       3          57545244            74706daf    F
  Miscompare at buffer offset 64 (0x40)                             <<===   miscompare offset (0x40)
  (Flags: badsig=0; cksum=0x60000)  Maximum LBA = 0x74706daf
  wbuf (baseaddr 0x3ffe1c0e6600) b0ffffffffffffffffffffffffffffffffffffff
  rbuf (baseaddr 0x3ffe1c0fc400) 850100fc700200fd700300fe700400ff70050000
  Write buffer saved in /tmp/htxsdb.wbuf1
  Read buffer saved in /tmp/htxsdb.rbuf1
  Re-read fails compare at offset64; buffer saved in /tmp/htxsdb.rerd1
  errno: 950(Unknown error 950)

  Asghar reproduced the HTX hang he has been seeing. In the kernel logs
  I see messages indicating that user threads are blocked waiting for
  reads to be serviced, so HTX is likely hitting the same thing. I've
  asked Asghar to try the deadline I/O scheduler rather than CFQ to see
  if that makes any difference. If it does not, the next thing to try is
  reducing the queue depth of the device. Right now it's 31, which I
  think is pretty high.

  Step 1:

  echo deadline > /sys/block/sda/queue/scheduler
  echo deadline > /sys/block/sdb/queue/scheduler

  If the issue still reproduces with deadline, go to step 2:

  echo cfq > /sys/block/sda/queue/scheduler
  echo cfq > /sys/block/sdb/queue/scheduler
  echo 8 > /sys/block/sda/device/queue_depth
  echo 8 > /sys/block/sdb/device/queue_depth
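
  The two steps above can be wrapped in one helper so the same settings are applied to every disk under test. This is only a sketch: tune_disks and the SYSFS_ROOT override are hypothetical names introduced for illustration, and on a real system the writes need root and are not persistent across reboots.

```shell
# Apply an I/O scheduler and, optionally, a queue depth to a list of disks.
# Usage: tune_disks <scheduler> <queue-depth or ""> <device>...
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
tune_disks() {
    scheduler=$1
    depth=$2
    shift 2
    for dev in "$@"; do
        echo "$scheduler" > "${SYSFS_ROOT:-/sys}/block/$dev/queue/scheduler" || return 1
        # Only touch the queue depth when a value was given.
        if [ -n "$depth" ]; then
            echo "$depth" > "${SYSFS_ROOT:-/sys}/block/$dev/device/queue_depth" || return 1
        fi
    done
}

# Step 1: tune_disks deadline "" sda sdb
# Step 2: tune_disks cfq 8 sda sdb
```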

  Breno - it looks like the combination of the default I/O scheduler and
  the default queue depth for the SATA disks in Firestone is not optimal:
  under a heavy I/O workload we see read starvation, which makes the
  system nearly unusable.

  Once we changed the I/O scheduler from cfq to deadline, all the issues
  went away, and the system can run the same workload while staying
  responsive. I suggest we either encourage Canonical to change the
  default I/O scheduler to deadline or, at the very least, provide
  documentation encouraging our customers to make this change
  themselves.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1469829/+subscriptions