kernel-packages team mailing list archive

[Bug 1098961] Re: PAE regression: OOM with just a few sleeps

 

Now, with Ubuntu 14.04.1 LTS (Trusty Tahr), it seems that the
only 32-bit x86 kernel offered has PAE enabled, so I cannot easily
demonstrate the difference between PAE and non-PAE kernels.

With the PAE kernel, my machine produces an OOM crash with the
test command

  bash -c 'n=0; while [ $n -lt 33000 ]; do sleep 600 & ((n=n+1)); ((m=n%500)); if [ $m -lt 1 ]; then echo -n "$n - "; date; free -l; sleep 1; fi; done'
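
For readability, the same loop can be written as a small function. This is
only a sketch of the one-liner above: the name sleep_test and its three
parameters are illustrative and not part of the original test, and the
defaults are the values used in the command above.

```shell
# Sketch of the "sleep test" as a function (sleep_test is a hypothetical name).
#   $1 = number of background sleeps to start (33000 in the one-liner)
#   $2 = print a status report every this many sleeps (500 in the one-liner)
#   $3 = seconds each background sleep lasts (600 in the one-liner)
sleep_test() {
    local total=${1:-33000}
    local every=${2:-500}
    local secs=${3:-600}
    local n=0
    while [ "$n" -lt "$total" ]; do
        sleep "$secs" &              # one more idle process
        n=$((n + 1))
        if [ $((n % every)) -eq 0 ]; then
            printf '%s - ' "$n"      # progress counter, then timestamp
            date
            free -l                  # low/high memory usage at this point
            sleep 1
        fi
    done
}
```

Calling sleep_test with no arguments is meant to behave like the one-liner;
smaller values (for example "sleep_test 4 2 0") only exercise the loop
without stressing the machine.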

My conjecture is that all Linux machines will produce an OOM,
regardless of the amount of RAM installed.

Your challenge is to prove me wrong: find any one machine that
survives the above command without an OOM condition. (Obviously
the bug, and thus the challenge, is for Linux machines with x86
CPUs and PAE kernels; but it applies to any distro, not just Ubuntu.)
Should you succeed, you will have helped to better understand
this bug and so contribute to finding a solution; I also offer
the prize of a carton of beer to the first finder; and having
knocked me off my soapbox, you will be allowed to feel smug and
superior forever.

Should you fail the challenge and have your machine reproduce
an OOM crash, I urge you to complain to your Linux distributor
or to the kernel people. (It seems that bugs are left unfixed for
years if only a few people complain, to the point where Ubuntu
abandoned the working non-PAE kernel.)

---

Maybe a workaround is to upgrade to 64-bit Linux with the amd64
kernel. However, that is just a crude workaround without any
guarantee of correctness until this bug is understood:
the kernel code is common between 32- and 64-bit.

---

Long and boring details below.

Cheers, Paul

Paul Szabo   psz@xxxxxxxxxxxxxxxxx   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia


-----

psz@DellE520:~$ dpkg -l | grep linux-image
ii  linux-image-3.13.0-35-generic                               3.13.0-35.62                                        i386         Linux kernel image for version 3.13.0 on 32 bit x86 SMP
ii  linux-image-extra-3.13.0-35-generic                         3.13.0-35.62                                        i386         Linux kernel extra modules for version 3.13.0 on 32 bit x86 SMP
ii  linux-image-generic                                         3.13.0.35.42                                        i386         Generic Linux kernel image
psz@DellE520:~$ 
psz@DellE520:~$ uname -a
Linux DellE520 3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:01 UTC 2014 i686 i686 i686 GNU/Linux
psz@DellE520:~$ 
psz@DellE520:~$ grep -H PAE /boot/conf*
/boot/config-3.13.0-35-generic:CONFIG_X86_PAE=y
psz@DellE520:~$ 
psz@DellE520:~$ free -l
             total       used       free     shared    buffers     cached
Mem:       3092416    1027832    2064584     142520     125124     519192
Low:        870004     444336     425668
High:      2222412     583496    1638916
-/+ buffers/cache:     383516    2708900
Swap:     20000920          0   20000920
psz@DellE520:~$ 
psz@DellE520:~$ ulimit -u
23950
psz@DellE520:~$ 
psz@DellE520:~$ bash -c 'n=0; while [ $n -lt 33000 ]; do sleep 600 & ((n=n+1)); ((m=n%500)); if [ $m -lt 1 ]; then echo -n "$n - "; date; free -l; sleep 1; fi; done'

The above produces an OOM, as shown in /var/log/syslog:
Sep 15 17:12:54 DellE520 kernel: [  403.028298] bash invoked oom-killer: gfp_mask=0x2084d0, order=0, oom_score_adj=0
Sep 15 17:12:54 DellE520 kernel: [  403.028309] bash cpuset=/ mems_allowed=0
Sep 15 17:12:54 DellE520 kernel: [  403.028317] CPU: 1 PID: 19509 Comm: bash Not tainted 3.13.0-35-generic #62-Ubuntu
Sep 15 17:12:54 DellE520 kernel: [  403.028320] Hardware name: Dell Inc.                 Dell DM061                   /0WG864, BIOS 2.4.0  05/24/2007
Sep 15 17:12:54 DellE520 kernel: [  403.028324]  00000000 00000000 eb845d68 c1652f30 eb86b400 eb845dc0 c164f4d9 c18437dc
Sep 15 17:12:54 DellE520 kernel: [  403.028336]  eb86b70c 002084d0 00000000 00000000 f7b8d140 00000000 00000046 00000000
Sep 15 17:12:54 DellE520 kernel: [  403.028346]  00000000 00000206 c1944560 eb845dc0 e6e90d00 002084d0 00000000 eb845dcc
Sep 15 17:12:54 DellE520 kernel: [  403.028357] Call Trace:
Sep 15 17:12:54 DellE520 kernel: [  403.028372]  [<c1652f30>] dump_stack+0x41/0x52
Sep 15 17:12:54 DellE520 kernel: [  403.028378]  [<c164f4d9>] dump_header.isra.9+0x76/0x1c4
Sep 15 17:12:54 DellE520 kernel: [  403.028386]  [<c11251d7>] oom_kill_process+0x167/0x2b0
Sep 15 17:12:54 DellE520 kernel: [  403.028393]  [<c12706fc>] ? security_capable_noaudit+0x1c/0x30
Sep 15 17:12:54 DellE520 kernel: [  403.028399]  [<c105fa5a>] ? has_capability_noaudit+0x1a/0x30
Sep 15 17:12:54 DellE520 kernel: [  403.028405]  [<c11256eb>] out_of_memory+0x22b/0x260
Sep 15 17:12:54 DellE520 kernel: [  403.028410]  [<c112a011>] __alloc_pages_nodemask+0x861/0x980
Sep 15 17:12:54 DellE520 kernel: [  403.028417]  [<c112a14c>] __get_free_pages+0x1c/0x40
Sep 15 17:12:54 DellE520 kernel: [  403.028422]  [<c104e3f8>] pgd_alloc+0x38/0x250
Sep 15 17:12:54 DellE520 kernel: [  403.028428]  [<c1053c1e>] mm_init+0xbe/0x100
Sep 15 17:12:54 DellE520 kernel: [  403.028433]  [<c1054103>] mm_alloc+0x53/0xa0
Sep 15 17:12:54 DellE520 kernel: [  403.028440]  [<c117fb2e>] do_execve_common+0x18e/0x5c0
Sep 15 17:12:54 DellE520 kernel: [  403.028446]  [<c118018d>] SyS_execve+0x2d/0x50
Sep 15 17:12:54 DellE520 kernel: [  403.028452]  [<c166128d>] sysenter_do_call+0x12/0x12
Sep 15 17:12:54 DellE520 kernel: [  403.028456] Mem-Info:
Sep 15 17:12:54 DellE520 kernel: [  403.028459] DMA per-cpu:
Sep 15 17:12:54 DellE520 kernel: [  403.028463] CPU    0: hi:    0, btch:   1 usd:   0
Sep 15 17:12:54 DellE520 kernel: [  403.028466] CPU    1: hi:    0, btch:   1 usd:   0
Sep 15 17:12:54 DellE520 kernel: [  403.028469] Normal per-cpu:
Sep 15 17:12:54 DellE520 kernel: [  403.028472] CPU    0: hi:  186, btch:  31 usd:   0
Sep 15 17:12:54 DellE520 kernel: [  403.028474] CPU    1: hi:  186, btch:  31 usd:  35
Sep 15 17:12:54 DellE520 kernel: [  403.028477] HighMem per-cpu:
Sep 15 17:12:54 DellE520 kernel: [  403.028480] CPU    0: hi:  186, btch:  31 usd:   0
Sep 15 17:12:54 DellE520 kernel: [  403.028482] CPU    1: hi:  186, btch:  31 usd:  71
Sep 15 17:12:54 DellE520 kernel: [  403.028489] active_anon:257982 inactive_anon:107478 isolated_anon:64
Sep 15 17:12:54 DellE520 kernel: [  403.028489]  active_file:6032 inactive_file:38549 isolated_file:0
Sep 15 17:12:54 DellE520 kernel: [  403.028489]  unevictable:8 dirty:0 writeback:0 unstable:0
Sep 15 17:12:54 DellE520 kernel: [  403.028489]  free:127617 slab_reclaimable:3136 slab_unreclaimable:47160
Sep 15 17:12:54 DellE520 kernel: [  403.028489]  mapped:22503 shmem:34751 pagetables:81384 bounce:0
Sep 15 17:12:54 DellE520 kernel: [  403.028489]  free_cma:0
Sep 15 17:12:54 DellE520 kernel: [  403.028502] DMA free:4136kB min:788kB low:984kB high:1180kB active_anon:148kB inactive_anon:1812kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:1856kB slab_reclaimable:72kB slab_unreclaimable:2980kB kernel_stack:2296kB pagetables:8kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3156 all_unreclaimable? yes
Sep 15 17:12:54 DellE520 kernel: [  403.028505] lowmem_reserve[]: 0 833 3003 3003
Sep 15 17:12:54 DellE520 kernel: [  403.028518] Normal free:42404kB min:42384kB low:52980kB high:63576kB active_anon:89328kB inactive_anon:113816kB active_file:212kB inactive_file:128kB unevictable:0kB isolated(anon):256kB isolated(file):0kB present:897016kB managed:854104kB mlocked:0kB dirty:0kB writeback:0kB mapped:2536kB shmem:116764kB slab_reclaimable:12472kB slab_unreclaimable:185660kB kernel_stack:131496kB pagetables:10720kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:360589 all_unreclaimable? yes
Sep 15 17:12:54 DellE520 kernel: [  403.028521] lowmem_reserve[]: 0 0 17362 17362
Sep 15 17:12:54 DellE520 kernel: [  403.028533] HighMem free:463928kB min:512kB low:28112kB high:55712kB active_anon:942452kB inactive_anon:314284kB active_file:23916kB inactive_file:154068kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:2222412kB managed:2222412kB mlocked:32kB dirty:0kB writeback:0kB mapped:87472kB shmem:20384kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:314808kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 15 17:12:54 DellE520 kernel: [  403.028537] lowmem_reserve[]: 0 0 0 0
Sep 15 17:12:54 DellE520 kernel: [  403.028543] DMA: 8*4kB (UE) 9*8kB (UE) 6*16kB (UE) 5*32kB (UE) 7*64kB (UE) 2*128kB (U) 2*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (R) 0*4096kB = 4136kB
Sep 15 17:12:54 DellE520 kernel: [  403.028569] Normal: 1511*4kB (EM) 861*8kB (UEM) 500*16kB (UEM) 265*32kB (UEM) 93*64kB (UEM) 23*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 42404kB
Sep 15 17:12:54 DellE520 kernel: [  403.028592] HighMem: 0*4kB 1*8kB (M) 1*16kB (U) 1*32kB (U) 350*64kB (UM) 95*128kB (UM) 27*256kB (UM) 5*512kB (UM) 2*1024kB (UM) 2*2048kB (UM) 101*4096kB (MR) = 463928kB
Sep 15 17:12:54 DellE520 kernel: [  403.028620] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 15 17:12:54 DellE520 kernel: [  403.028623] 79354 total pagecache pages
Sep 15 17:12:54 DellE520 kernel: [  403.028626] 0 pages in swap cache
Sep 15 17:12:54 DellE520 kernel: [  403.028629] Swap cache stats: add 7, delete 7, find 0/0
Sep 15 17:12:54 DellE520 kernel: [  403.028631] Free swap  = 20000892kB
Sep 15 17:12:54 DellE520 kernel: [  403.028633] Total swap = 20000920kB
Sep 15 17:12:54 DellE520 kernel: [  403.028636] 783851 pages RAM
Sep 15 17:12:54 DellE520 kernel: [  403.028639] 555603 pages HighMem/MovableOnly
Sep 15 17:12:54 DellE520 kernel: [  403.028641] 0 pages reserved
Sep 15 17:12:54 DellE520 kernel: [  403.028643] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
...
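
The per-zone lines in the log above are hard to read by eye. The following
is a rough sketch for pulling out each zone's "free" and "min" values from
such syslog lines and printing the difference; zone_watermarks is a
hypothetical helper name, and it assumes the line layout shown above
(zone name immediately followed by free:NkB and min:NkB).

```shell
# Sketch: read OOM report lines on stdin and print, for each zone line,
# the free and min values and their difference in kB.
# zone_watermarks is a hypothetical helper, not a standard tool.
zone_watermarks() {
    awk '
    {
        for (i = 2; i < NF; i++) {
            # A zone line has "<Zone> free:<n>kB min:<n>kB" as adjacent fields.
            if ($i ~ /^free:[0-9]+kB$/ && $(i + 1) ~ /^min:[0-9]+kB$/) {
                zone = $(i - 1)
                free = $i;       sub(/^free:/, "", free); sub(/kB$/, "", free)
                min  = $(i + 1); sub(/^min:/, "", min);   sub(/kB$/, "", min)
                printf "%s free=%dkB min=%dkB headroom=%dkB\n", zone, free, min, free - min
            }
        }
    }'
}
```

Usage would be something like "grep 'kernel:' /var/log/syslog |
zone_watermarks". On the lines quoted above, the Normal zone has free
42404kB against min 42384kB, while HighMem has free 463928kB against min
512kB.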

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1098961

Title:
  PAE regression: OOM with just a few sleeps

Status in “linux” package in Ubuntu:
  Expired

Bug description:
  There is a spurious OOM issue with PAE kernel: it will suffer an OOM
  crash just by running a few processes.
  Please see also
    http://bugs.debian.org/695182
  and discussion on linux-mm@xxxxxxxxx e.g.
    http://marc.info/?l=linux-mm&m=135801969519193&w=2
  I wonder whether
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1098342
  is related.

  The issue is a regression with PAE, reproduced and verified on my
  home PC with 3GB RAM.

  My PC was running kernel linux-image-3.2.0-35-generic so it showed:
    psz@DellE520:~$ uname -a
    Linux DellE520 3.2.0-35-generic #55-Ubuntu SMP Wed Dec 5 17:45:18 UTC 2012 i686 i686 i386 GNU/Linux
    psz@DellE520:~$ free -l
                 total       used       free     shared    buffers     cached
    Mem:       3087972     692256    2395716          0      18276     427116
    Low:        861464      71372     790092
    High:      2226508     620884    1605624
    -/+ buffers/cache:     246864    2841108
    Swap:     20000920     258364   19742556
  Then it handled the "sleep test"
    bash -c 'n=0; while [ $n -lt 33000 ]; do sleep 600 & ((n=n+1)); ((m=n%500)); if [ $m -lt 1 ]; then echo -n "$n - "; date; free -l; sleep 1; fi; done'
  just fine: it was stopped only by "max user processes" (the default
  setting, "ulimit -u" of 23964) or, after raising that limit, when the
  machine ran out of PID space; there was no OOM.
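
The two non-OOM stopping conditions mentioned above can be checked before
running the test. This is a minimal sketch, assuming a Linux system where
/proc/sys/kernel/pid_max is readable; check_limits is a hypothetical
helper, and its optional second and third arguments exist only so the logic
can be tried with fixed numbers instead of the live system values.

```shell
# Sketch: report which limit would stop a run of $1 background processes.
# check_limits is a hypothetical helper name.
#   $1 = number of processes the test tries to start
#   $2 = per-user process limit (default: current "ulimit -u")
#   $3 = PID space size (default: /proc/sys/kernel/pid_max)
check_limits() {
    local target=$1
    local nproc=${2:-$(ulimit -u)}
    local pid_max=${3:-$(cat /proc/sys/kernel/pid_max)}
    if [ "$nproc" != unlimited ] && [ "$nproc" -lt "$target" ]; then
        echo "ulimit -u ($nproc) is below $target"
    fi
    if [ "$pid_max" -lt "$target" ]; then
        echo "pid_max ($pid_max) is below $target"
    fi
}
```

For example, "check_limits 33000 23964 32768" uses the ulimit value quoted
above together with an assumed pid_max of 32768 (not taken from this
report) and prints both messages.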

  After installing and booting the PAE kernel, it showed:
    psz@DellE520:~$ uname -a
    Linux DellE520 3.2.0-35-generic-pae #55-Ubuntu SMP Wed Dec 5 18:04:39 UTC 2012 i686 i686 i386 GNU/Linux
    psz@DellE520:~$ free -l
                 total       used       free     shared    buffers     cached
    Mem:       3087620     681188    2406432          0     167332     352296
    Low:        865208     214080     651128
    High:      2222412     467108    1755304
    -/+ buffers/cache:     161560    2926060
    Swap:     20000920          0   20000920
  and on re-trying the "sleep test", it ran into OOM after 18000 or so sleeps
  and crashed/froze, so I had to press the POWER button to recover.

  Cheers, Paul

  Paul Szabo   psz@xxxxxxxxxxxxxxxxx   http://www.maths.usyd.edu.au/u/psz/
  School of Mathematics and Statistics   University of Sydney    Australia
  --- 
  AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
  AplayDevices:
   **** List of PLAYBACK Hardware Devices ****
   card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
     Subdevices: 1/1
     Subdevice #0: subdevice #0
  ApportVersion: 2.0.1-0ubuntu15.1
  Architecture: i386
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC0:  psz        2190 F.... pulseaudio
  CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
  Card0.Amixer.info:
   Card hw:0 'Intel'/'HDA Intel at 0xdfddc000 irq 45'
     Mixer name	: 'SigmaTel STAC9227'
     Components	: 'HDA:83847618,102801dd,00100201'
     Controls      : 38
     Simple ctrls  : 21
  CurrentDmesg: [   28.160013] eth0: no IPv6 routers present
  DistroRelease: Ubuntu 12.04
  HibernationDevice: RESUME=UUID=9d2bf7ac-9b0a-4082-ac45-f4d3c8e32c23
  IwConfig:
   lo        no wireless extensions.
   
   eth0      no wireless extensions.
  MachineType: Dell Inc. Dell DM061
  MarkForUpload: True
  Package: linux (not installed)
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   LANG=C
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: root=/dev/mapper/isw_cheedcedhh_DadMirroredTB4 ro quiet splash
  ProcVersionSignature: Ubuntu 3.2.0-35.55-generic-pae 3.2.34
  RelatedPackageVersions:
   linux-restricted-modules-3.2.0-35-generic-pae N/A
   linux-backports-modules-3.2.0-35-generic-pae  N/A
   linux-firmware                                1.79.1
  RfKill:
   
  Tags:  precise
  Uname: Linux 3.2.0-35-generic-pae i686
  UpgradeStatus: Upgraded to precise on 2012-04-27 (260 days ago)
  UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
  WifiSyslog: Jan 13 06:42:46 DellE520 NetworkManager[1384]: <info> Unmanaged Device found; state CONNECTED forced. (see http://bugs.launchpad.net/bugs/191889)
  WpaSupplicantLog:
   
  dmi.bios.date: 03/23/2007
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.2.1
  dmi.board.name: 0WG864
  dmi.board.vendor: Dell Inc.
  dmi.chassis.type: 6
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr2.2.1:bd03/23/2007:svnDellInc.:pnDellDM061:pvr:rvnDellInc.:rn0WG864:rvr:cvnDellInc.:ct6:cvr:
  dmi.product.name: Dell DM061
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1098961/+subscriptions