kernel-packages team mailing list archive

[Bug 1510196] [NEW] OOM killer causes complete system freezes

 

Public bug reported:

A race condition in the Linux kernel randomly causes complete system
freezes when the system or a cgroup runs out of memory and the OOM
killer is invoked. I ran into this issue twice during the last 2 weeks
(Ubuntu 14.04.3 LTS, kernel version 3.13.0-65.106). In my case it was an
LXC cgroup that ran out of memory, not the system itself. The system
log files did not show any errors, but error messages kept repeating on
the system console (see attached screenshot).
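
One way to check whether a memory cgroup is currently in an OOM state is
to read its memory.oom_control file. The following is only a minimal
Python sketch, assuming the cgroup v1 memory controller and its
"under_oom" field; the helper names and the example directory in the
comment are hypothetical and depend on how the cgroup hierarchy is
mounted on the host.

```python
from pathlib import Path


def parse_oom_control(text):
    """Parse the 'key value' lines of a cgroup v1 memory.oom_control file."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" pairs, skip anything else.
        if len(parts) == 2 and parts[1].isdigit():
            fields[parts[0]] = int(parts[1])
    return fields


def cgroup_under_oom(cgroup_dir):
    """Return True if the cgroup at cgroup_dir reports under_oom 1.

    cgroup_dir is an assumption, e.g. a directory such as
    /sys/fs/cgroup/memory/lxc/<container> on a host using cgroup v1.
    """
    text = Path(cgroup_dir, "memory.oom_control").read_text()
    return parse_oom_control(text).get("under_oom", 0) == 1
```

Polling cgroup_under_oom() for each container directory would show which
container is out of memory, provided the host is still responsive enough
to run it.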

While searching for these error messages I found the following page,
which provides more information about the cause and a possible fix:
https://community.nitrous.io/posts/stability-and-a-linux-oom-killer-bug

I did not see any OOM-killer-related messages in the system logs at the
time of the freeze, but one of the LXC containers running on that
server ran out of memory at exactly that time, so I assume the two are
related.
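
Searching a saved kernel log for OOM killer activity can be sketched
roughly as follows. This is a hedged example, not a complete tool: the
function name is hypothetical, and the patterns assume the usual kernel
message wording ("invoked oom-killer", "Out of memory: Kill process",
"Memory cgroup out of memory"), which may differ between kernel
versions.

```python
import re

# Message fragments the kernel OOM killer typically logs (assumed wording).
OOM_PATTERNS = re.compile(
    r"invoked oom-killer"
    r"|Out of memory: Kill process"
    r"|Memory cgroup out of memory"
)


def find_oom_lines(lines):
    """Return the log lines that look like OOM killer messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERNS.search(line)]
```

For example, passing an open handle on /var/log/kern.log to
find_oom_lines() returns the matching lines. An empty result only means
nothing was written to that file, which is what one would expect if the
machine froze before the messages reached the disk.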

I was not able to reproduce the error in a testing environment, as it
seems to be a race condition between several processes that does not
occur every time. However, I applied the 3 suggested commits for
mm/oom_kill.c
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/?id=4d4048be8a93769350efa31d2482a038b7de73d0&qt=range&q=0c740d0afc3bff0a097ad03a1c8df92757516f5c...4d4048be8a93769350efa31d2482a038b7de73d0)
from kernel 3.14 to the current trusty kernel and built the patched
kernel, and it seems to work. Could this bugfix for the OOM killer be
backported to the official trusty kernel?

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-66-generic 3.13.0-66.108
ProcVersionSignature: Ubuntu 3.13.0-66.108-generic 3.13.11-ckt27
Uname: Linux 3.13.0-66-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116,  1 Oct 23 09:12 seq
 crw-rw---- 1 root audio 116, 33 Oct 23 09:12 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.16
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Mon Oct 26 12:19:55 2015
HibernationDevice: RESUME=UUID=728105e7-2f6b-46fd-bf56-1bf67239adea
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: Supermicro H8DGT
PciMultimedia:
 
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-66-generic root=UUID=08db4f0e-59c5-4ffd-a34c-866e3fe126c5 ro nomdmonddf nomdmonisw
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-66-generic N/A
 linux-backports-modules-3.13.0-66-generic  N/A
 linux-firmware                             1.127.15
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 05/07/2013
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3.0a
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: H8DGT-HF
dmi.board.vendor: Supermicro
dmi.board.version: 1.21A
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 17
dmi.chassis.vendor: SGI.COM
dmi.chassis.version: 1234567890
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3.0a:bd05/07/2013:svnSupermicro:pnH8DGT:pvr1234567890:rvnSupermicro:rnH8DGT-HF:rvr1.21A:cvnSGI.COM:ct17:cvr1234567890:
dmi.product.name: H8DGT
dmi.product.version: 1234567890
dmi.sys.vendor: Supermicro

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug third-party-packages trusty

** Attachment added: "console error messages"
   https://bugs.launchpad.net/bugs/1510196/+attachment/4505822/+files/h-0056.jpg

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1510196

Title:
  OOM killer causes complete system freezes

Status in linux package in Ubuntu:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1510196/+subscriptions

