← Back to team overview

kernel-packages team mailing list archive

[Bug 1064521] Re: Kernel I/O scheduling writes starving reads, local DoS

 

** Changed in: linux (Ubuntu Quantal)
       Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1064521

Title:
  Kernel I/O scheduling writes starving reads, local DoS

Status in “linux” package in Ubuntu:
  Triaged
Status in “linux” source package in Precise:
  Triaged
Status in “linux” source package in Quantal:
  Won't Fix

Bug description:
  On the Precise default kernel, it is possible by executing zcat on a
  large file for an unprivileged user to disrupt I/O sufficiently that
  it causes serious disruption.

  Serious disruption means (e.g.) a single MySQL update hangs for over
  120 seconds on the default scheduler (cfq), and between 1 and 11
  seconds on the deadline scheduler.

  This is reproducible on 2 sets of hardware using:

  root@extility-qa-test:~# uname -a
  Linux extility-qa-test 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
     linux-image-3.2.0-29-generic 3.2.0-29.46
  root@extility-qa-test:~# cat /proc/sys/vm/dirty_ratio 
  20
  root@extility-qa-test:~# cat /proc/sys/vm/dirty_background_ratio 
  10

  No such problems occur on Lucid OS running the Oneiric Backports
  kernel.

  root@management-dev2:~# uname -a
  Linux management-dev2 3.0.0-15-server #26~lucid1-Ubuntu SMP Wed Jan 25 15:55:45 UTC 2012 x86_64 GNU/Linux
  linux-image-3.0.0-15-server               3.0.0-15.26~lucid1                              

  
  In order to replicate, download (e.g.) this gzipped Lucid image (note this is not the OS we are running, this is just an example of the a file that causes the problem):
   http://repo.flexiant.com/images/public/kvm/ubuntu10.04.img.gz
  and as un unprivileged user, on a default, untuned Precise install, do
    zcat ubuntu10.04.img.gz > test

  
  Now in another window execute any trivial mysql update on any table. Note that this can take a hugely long time.

  "show full processlist" in mysql console will show the time taken
  executing the command.

  
  In kernel logs (with cfq) we see e.g.:

  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268048] INFO: task mysqld:1358 blocked for more than 120 seconds.
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268144] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268267] mysqld          D ffffffff81806200     0  1358      1 0x00000000
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268272]  ffff8801921fde48 0000000000000082 ffff8801921fde00 0000000300000001
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268278]  ffff8801921fdfd8 ffff8801921fdfd8 ffff8801921fdfd8 0000000000013780
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268283]  ffff880195169700 ffff880191f79700 ffff8801921fde58 ffff8801912b2800
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268288] Call Trace:
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268298]  [<ffffffff816579cf>] schedule+0x3f/0x60
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268303]  [<ffffffff812650d5>] jbd2_log_wait_commit+0xb5/0x130
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268308]  [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268313]  [<ffffffff81211248>] ext4_sync_file+0x208/0x2d0
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268317]  [<ffffffff81177ba0>] ? vfs_write+0x110/0x180
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268321]  [<ffffffff811a63a6>] do_fsync+0x56/0x80
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268325]  [<ffffffff811a66d0>] sys_fsync+0x10/0x20
  Oct  8 14:57:02 extility-qa-test kernel: [ 3840.268329]  [<ffffffff81661ec2>] system_call_fastpath+0x16/0x1b
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268176] INFO: task mysqld:1358 blocked for more than 120 seconds.
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268282] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268393] mysqld          D ffffffff81806200     0  1358      1 0x00000000
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268399]  ffff8801921fde48 0000000000000082 ffff8801921fde00 0000000300000001
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268405]  ffff8801921fdfd8 ffff8801921fdfd8 ffff8801921fdfd8 0000000000013780
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268410]  ffff880195169700 ffff880191f79700 ffff8801921fde58 ffff8801912b2800
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268415] Call Trace:
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268426]  [<ffffffff816579cf>] schedule+0x3f/0x60
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268431]  [<ffffffff812650d5>] jbd2_log_wait_commit+0xb5/0x130
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268436]  [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268441]  [<ffffffff81211248>] ext4_sync_file+0x208/0x2d0
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268445]  [<ffffffff81177ba0>] ? vfs_write+0x110/0x180
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268450]  [<ffffffff811a63a6>] do_fsync+0x56/0x80
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268454]  [<ffffffff811a66d0>] sys_fsync+0x10/0x20
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268458]  [<ffffffff81661ec2>] system_call_fastpath+0x16/0x1b
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268473] INFO: task master:1525 blocked for more than 120 seconds.
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.268563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

  
  This isn't just a problem with MySQL. Here it is killing Postgres:

  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269090] INFO: task postgres:2188 blocked for more than 120 seconds.
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269298] postgres        D ffffffff81806200     0  2188   2186 0x00000000
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269302]  ffff8801919cbe48 0000000000000082 ffff8801919cbe00 0000000300000001
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269307]  ffff8801919cbfd8 ffff8801919cbfd8 ffff8801919cbfd8 0000000000013780
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269312]  ffff880195169700 ffff8801916fae00 ffff8801919cbe58 ffff8801912b2800
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269317] Call Trace:
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269321]  [<ffffffff816579cf>] schedule+0x3f/0x60
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269325]  [<ffffffff812650d5>] jbd2_log_wait_commit+0xb5/0x130
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269328]  [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269332]  [<ffffffff81211248>] ext4_sync_file+0x208/0x2d0
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269335]  [<ffffffff81177ba0>] ? vfs_write+0x110/0x180
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269339]  [<ffffffff811a63a6>] do_fsync+0x56/0x80
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269342]  [<ffffffff811a66d0>] sys_fsync+0x10/0x20
  Oct  8 14:59:02 extility-qa-test kernel: [ 3960.269346]  [<ffffffff81661ec2>] system_call_fastpath+0x16/0x1b


  Standard information follows:

  root@extility-qa-test:~# lsb_release -rd
  Description:	Ubuntu 12.04.1 LTS
  Release:	12.04

  root@extility-qa-test:~# apt-cache policy linux-image-3.2.0-29-generic
  linux-image-3.2.0-29-generic:
    Installed: 3.2.0-29.46
    Candidate: 3.2.0-29.46
    Version table:
   *** 3.2.0-29.46 0
          500 http://gb.archive.ubuntu.com/ubuntu/ precise-updates/main amd64 Packages
          500 http://security.ubuntu.com/ubuntu/ precise-security/main amd64 Packages
          100 /var/lib/dpkg/status

  Expected behaviour: same as Lucid, perhaps MySQL slows down a bit, but
  cannot be DoS'd

  What actually happened: local DoS of MySQL.
  --- 
  AlsaDevices:
   total 0
   crw-rw---T 1 root audio 116,  1 Oct  9 10:02 seq
   crw-rw---T 1 root audio 116, 33 Oct  9 10:02 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.0.1-0ubuntu13
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 12.04
  HibernationDevice: RESUME=UUID=5cd80c3d-3646-42ed-bf16-528a0151228d
  InstallationMedia: Ubuntu-Server 12.04.1 LTS "Precise Pangolin" - Release amd64 (20120817.3)
  IwConfig: Error: [Errno 2] No such file or directory
  MachineType: Dell Inc. PowerEdge R210 II
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   LANGUAGE=en_GB:en
   TERM=xterm-color
   LANG=en_GB.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-29-generic root=UUID=9daecda1-1f0f-41a0-91b8-5981fdea5c9e ro quiet
  ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
  RelatedPackageVersions:
   linux-restricted-modules-3.2.0-29-generic N/A
   linux-backports-modules-3.2.0-29-generic  N/A
   linux-firmware                            1.79
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  precise
  Uname: Linux 3.2.0-29-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  dmi.bios.date: 03/13/2012
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.0.5
  dmi.board.name: 03X6X0
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A00
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr2.0.5:bd03/13/2012:svnDellInc.:pnPowerEdgeR210II:pvr:rvnDellInc.:rn03X6X0:rvrA00:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R210 II
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1064521/+subscriptions