kernel-packages team mailing list archive
Message #79952
[Bug 1098961] Re: PAE regression: OOM with just a few sleeps
On Ubuntu 14.04.1 LTS (Trusty Tahr), it seems that the only
x86 kernel offered is built with PAE, so I cannot easily
demonstrate the difference between PAE and non-PAE kernels.
With the PAE kernel, my machine produces an OOM crash with the
test command
bash -c 'n=0; while [ $n -lt 33000 ]; do sleep 600 & ((n=n+1)); ((m=n%500)); if [ $m -lt 1 ]; then echo -n "$n - "; date; free -l; sleep 1; fi; done'
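For readability, the same loop can be written out as a shell function. This is only a sketch: the function name sleep_test and its four arguments are illustrative and not part of the original command; the defaults reproduce the one-liner above. The "|| true" after free is an addition so that the loop keeps going on systems where free lacks the -l option.

```shell
# Readable variant of the one-line test (names are illustrative).
# usage: sleep_test [TOTAL] [REPORT_EVERY] [SLEEP_SECS] [PAUSE_SECS]
sleep_test() {
    total=${1:-33000}   # how many background sleeps to start
    every=${2:-500}     # print a status report every this many sleeps
    secs=${3:-600}      # how long each background sleep lasts
    pause=${4:-1}       # pause after each status report
    n=0
    while [ "$n" -lt "$total" ]; do
        sleep "$secs" &             # one extra process per iteration
        n=$((n + 1))
        if [ $((n % every)) -eq 0 ]; then
            printf '%s - ' "$n"; date
            free -l || true         # low/high memory split, as in the report
            sleep "$pause"
        fi
    done
}
```

Calling sleep_test with no arguments runs the full 33000-sleep test; something like "sleep_test 2000 500 60" is a gentler first try.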
My conjecture is that all Linux machines will produce an OOM,
regardless of the amount of RAM installed.
Your challenge is to prove me wrong: find any one machine that
survives the above command without an OOM condition. (Obviously
the bug, and thus the challenge, applies to Linux machines with
x86 CPUs and PAE kernels, on any distro, not just Ubuntu.)
Should you succeed, you will have helped to understand this bug
better and so contributed to finding a solution; I also offer
a carton of beer as a prize to the first finder; and having
knocked me off my soapbox, you will be allowed to feel smug and
superior forever.
Should you fail the challenge and have your machine reproduce
an OOM crash, I urge you to complain to your Linux distributor
or to the kernel people. (It seems that bugs are left unfixed
for years if only a few people complain, to the point where
Ubuntu abandoned the "working version" non-PAE kernel.)
---
A possible workaround is to move to 64-bit Linux with the amd64
kernel. However, that is just a crude workaround with no
guarantee of correctness until this bug is understood:
the kernel code is common between 32- and 64-bit.
---
Long and boring details below.
Cheers, Paul
Paul Szabo psz@xxxxxxxxxxxxxxxxx http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics University of Sydney Australia
-----
psz@DellE520:~$ dpkg -l | grep linux-image
ii linux-image-3.13.0-35-generic 3.13.0-35.62 i386 Linux kernel image for version 3.13.0 on 32 bit x86 SMP
ii linux-image-extra-3.13.0-35-generic 3.13.0-35.62 i386 Linux kernel extra modules for version 3.13.0 on 32 bit x86 SMP
ii linux-image-generic 3.13.0.35.42 i386 Generic Linux kernel image
psz@DellE520:~$
psz@DellE520:~$ uname -a
Linux DellE520 3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:01 UTC 2014 i686 i686 i686 GNU/Linux
psz@DellE520:~$
psz@DellE520:~$ grep -H PAE /boot/conf*
/boot/config-3.13.0-35-generic:CONFIG_X86_PAE=y
psz@DellE520:~$
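The same check can be scripted. A minimal sketch: the function name kernel_has_pae is illustrative, and the default path assumes the /boot/config-<release> naming seen in the grep output above.

```shell
# Succeeds (exit status 0) if the given kernel config file enables PAE.
# With no argument, looks at the config of the running kernel.
kernel_has_pae() {
    cfg=${1:-/boot/config-$(uname -r)}
    grep -q '^CONFIG_X86_PAE=y' "$cfg"
}
```

For example: kernel_has_pae && echo "PAE kernel"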
psz@DellE520:~$ free -l
             total       used       free     shared    buffers     cached
Mem:       3092416    1027832    2064584     142520     125124     519192
Low:        870004     444336     425668
High:      2222412     583496    1638916
-/+ buffers/cache:     383516    2708900
Swap:     20000920          0   20000920
psz@DellE520:~$
psz@DellE520:~$ ulimit -u
23950
psz@DellE520:~$
psz@DellE520:~$ bash -c 'n=0; while [ $n -lt 33000 ]; do sleep 600 & ((n=n+1)); ((m=n%500)); if [ $m -lt 1 ]; then echo -n "$n - "; date; free -l; sleep 1; fi; done'
The above produces an OOM as shown in /var/log/syslog:
Sep 15 17:12:54 DellE520 kernel: [ 403.028298] bash invoked oom-killer: gfp_mask=0x2084d0, order=0, oom_score_adj=0
Sep 15 17:12:54 DellE520 kernel: [ 403.028309] bash cpuset=/ mems_allowed=0
Sep 15 17:12:54 DellE520 kernel: [ 403.028317] CPU: 1 PID: 19509 Comm: bash Not tainted 3.13.0-35-generic #62-Ubuntu
Sep 15 17:12:54 DellE520 kernel: [ 403.028320] Hardware name: Dell Inc. Dell DM061 /0WG864, BIOS 2.4.0 05/24/2007
Sep 15 17:12:54 DellE520 kernel: [ 403.028324] 00000000 00000000 eb845d68 c1652f30 eb86b400 eb845dc0 c164f4d9 c18437dc
Sep 15 17:12:54 DellE520 kernel: [ 403.028336] eb86b70c 002084d0 00000000 00000000 f7b8d140 00000000 00000046 00000000
Sep 15 17:12:54 DellE520 kernel: [ 403.028346] 00000000 00000206 c1944560 eb845dc0 e6e90d00 002084d0 00000000 eb845dcc
Sep 15 17:12:54 DellE520 kernel: [ 403.028357] Call Trace:
Sep 15 17:12:54 DellE520 kernel: [ 403.028372] [<c1652f30>] dump_stack+0x41/0x52
Sep 15 17:12:54 DellE520 kernel: [ 403.028378] [<c164f4d9>] dump_header.isra.9+0x76/0x1c4
Sep 15 17:12:54 DellE520 kernel: [ 403.028386] [<c11251d7>] oom_kill_process+0x167/0x2b0
Sep 15 17:12:54 DellE520 kernel: [ 403.028393] [<c12706fc>] ? security_capable_noaudit+0x1c/0x30
Sep 15 17:12:54 DellE520 kernel: [ 403.028399] [<c105fa5a>] ? has_capability_noaudit+0x1a/0x30
Sep 15 17:12:54 DellE520 kernel: [ 403.028405] [<c11256eb>] out_of_memory+0x22b/0x260
Sep 15 17:12:54 DellE520 kernel: [ 403.028410] [<c112a011>] __alloc_pages_nodemask+0x861/0x980
Sep 15 17:12:54 DellE520 kernel: [ 403.028417] [<c112a14c>] __get_free_pages+0x1c/0x40
Sep 15 17:12:54 DellE520 kernel: [ 403.028422] [<c104e3f8>] pgd_alloc+0x38/0x250
Sep 15 17:12:54 DellE520 kernel: [ 403.028428] [<c1053c1e>] mm_init+0xbe/0x100
Sep 15 17:12:54 DellE520 kernel: [ 403.028433] [<c1054103>] mm_alloc+0x53/0xa0
Sep 15 17:12:54 DellE520 kernel: [ 403.028440] [<c117fb2e>] do_execve_common+0x18e/0x5c0
Sep 15 17:12:54 DellE520 kernel: [ 403.028446] [<c118018d>] SyS_execve+0x2d/0x50
Sep 15 17:12:54 DellE520 kernel: [ 403.028452] [<c166128d>] sysenter_do_call+0x12/0x12
Sep 15 17:12:54 DellE520 kernel: [ 403.028456] Mem-Info:
Sep 15 17:12:54 DellE520 kernel: [ 403.028459] DMA per-cpu:
Sep 15 17:12:54 DellE520 kernel: [ 403.028463] CPU 0: hi: 0, btch: 1 usd: 0
Sep 15 17:12:54 DellE520 kernel: [ 403.028466] CPU 1: hi: 0, btch: 1 usd: 0
Sep 15 17:12:54 DellE520 kernel: [ 403.028469] Normal per-cpu:
Sep 15 17:12:54 DellE520 kernel: [ 403.028472] CPU 0: hi: 186, btch: 31 usd: 0
Sep 15 17:12:54 DellE520 kernel: [ 403.028474] CPU 1: hi: 186, btch: 31 usd: 35
Sep 15 17:12:54 DellE520 kernel: [ 403.028477] HighMem per-cpu:
Sep 15 17:12:54 DellE520 kernel: [ 403.028480] CPU 0: hi: 186, btch: 31 usd: 0
Sep 15 17:12:54 DellE520 kernel: [ 403.028482] CPU 1: hi: 186, btch: 31 usd: 71
Sep 15 17:12:54 DellE520 kernel: [ 403.028489] active_anon:257982 inactive_anon:107478 isolated_anon:64
Sep 15 17:12:54 DellE520 kernel: [ 403.028489] active_file:6032 inactive_file:38549 isolated_file:0
Sep 15 17:12:54 DellE520 kernel: [ 403.028489] unevictable:8 dirty:0 writeback:0 unstable:0
Sep 15 17:12:54 DellE520 kernel: [ 403.028489] free:127617 slab_reclaimable:3136 slab_unreclaimable:47160
Sep 15 17:12:54 DellE520 kernel: [ 403.028489] mapped:22503 shmem:34751 pagetables:81384 bounce:0
Sep 15 17:12:54 DellE520 kernel: [ 403.028489] free_cma:0
Sep 15 17:12:54 DellE520 kernel: [ 403.028502] DMA free:4136kB min:788kB low:984kB high:1180kB active_anon:148kB inactive_anon:1812kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:1856kB slab_reclaimable:72kB slab_unreclaimable:2980kB kernel_stack:2296kB pagetables:8kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3156 all_unreclaimable? yes
Sep 15 17:12:54 DellE520 kernel: [ 403.028505] lowmem_reserve[]: 0 833 3003 3003
Sep 15 17:12:54 DellE520 kernel: [ 403.028518] Normal free:42404kB min:42384kB low:52980kB high:63576kB active_anon:89328kB inactive_anon:113816kB active_file:212kB inactive_file:128kB unevictable:0kB isolated(anon):256kB isolated(file):0kB present:897016kB managed:854104kB mlocked:0kB dirty:0kB writeback:0kB mapped:2536kB shmem:116764kB slab_reclaimable:12472kB slab_unreclaimable:185660kB kernel_stack:131496kB pagetables:10720kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:360589 all_unreclaimable? yes
Sep 15 17:12:54 DellE520 kernel: [ 403.028521] lowmem_reserve[]: 0 0 17362 17362
Sep 15 17:12:54 DellE520 kernel: [ 403.028533] HighMem free:463928kB min:512kB low:28112kB high:55712kB active_anon:942452kB inactive_anon:314284kB active_file:23916kB inactive_file:154068kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:2222412kB managed:2222412kB mlocked:32kB dirty:0kB writeback:0kB mapped:87472kB shmem:20384kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:314808kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 15 17:12:54 DellE520 kernel: [ 403.028537] lowmem_reserve[]: 0 0 0 0
Sep 15 17:12:54 DellE520 kernel: [ 403.028543] DMA: 8*4kB (UE) 9*8kB (UE) 6*16kB (UE) 5*32kB (UE) 7*64kB (UE) 2*128kB (U) 2*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (R) 0*4096kB = 4136kB
Sep 15 17:12:54 DellE520 kernel: [ 403.028569] Normal: 1511*4kB (EM) 861*8kB (UEM) 500*16kB (UEM) 265*32kB (UEM) 93*64kB (UEM) 23*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 42404kB
Sep 15 17:12:54 DellE520 kernel: [ 403.028592] HighMem: 0*4kB 1*8kB (M) 1*16kB (U) 1*32kB (U) 350*64kB (UM) 95*128kB (UM) 27*256kB (UM) 5*512kB (UM) 2*1024kB (UM) 2*2048kB (UM) 101*4096kB (MR) = 463928kB
Sep 15 17:12:54 DellE520 kernel: [ 403.028620] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 15 17:12:54 DellE520 kernel: [ 403.028623] 79354 total pagecache pages
Sep 15 17:12:54 DellE520 kernel: [ 403.028626] 0 pages in swap cache
Sep 15 17:12:54 DellE520 kernel: [ 403.028629] Swap cache stats: add 7, delete 7, find 0/0
Sep 15 17:12:54 DellE520 kernel: [ 403.028631] Free swap = 20000892kB
Sep 15 17:12:54 DellE520 kernel: [ 403.028633] Total swap = 20000920kB
Sep 15 17:12:54 DellE520 kernel: [ 403.028636] 783851 pages RAM
Sep 15 17:12:54 DellE520 kernel: [ 403.028639] 555603 pages HighMem/MovableOnly
Sep 15 17:12:54 DellE520 kernel: [ 403.028641] 0 pages reserved
Sep 15 17:12:54 DellE520 kernel: [ 403.028643] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
...
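The per-zone figures in a report like the one above can be cross-checked against the global page counts. A minimal sketch, assuming 4 kB pages (as on x86); the helper name zone_total_kb is purely illustrative.

```shell
# Read an OOM report on stdin and print the sum, in kB, of one
# per-zone "field:NNNkB" value over all zone lines (DMA, Normal, HighMem).
# The global summary lines are skipped because their values carry no "kB".
zone_total_kb() {
    grep -o " $1:[0-9][0-9]*kB" |
        tr -dc '0-9\n' |
        awk '{ s += $1 } END { print s + 0 }'
}
```

Fed the three zone lines above, "zone_total_kb free" adds 4136 + 42404 + 463928 = 510468 kB, which equals the global free:127617 pages times 4 kB. Likewise "zone_total_kb pagetables" gives 8 + 10720 + 314808 = 325536 kB, matching pagetables:81384 pages.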
https://bugs.launchpad.net/bugs/1098961
Title:
PAE regression: OOM with just a few sleeps
Status in “linux” package in Ubuntu:
Expired
Bug description:
There is a spurious OOM issue with PAE kernel: it will suffer an OOM
crash just by running a few processes.
Please see also
http://bugs.debian.org/695182
and discussion on linux-mm@xxxxxxxxx e.g.
http://marc.info/?l=linux-mm&m=135801969519193&w=2
I wonder whether
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1098342
is related.
The issue is a regression with PAE, reproduced and verified on my
home PC with 3GB RAM.
My PC was running kernel linux-image-3.2.0-35-generic so it showed:
psz@DellE520:~$ uname -a
Linux DellE520 3.2.0-35-generic #55-Ubuntu SMP Wed Dec 5 17:45:18 UTC 2012 i686 i686 i386 GNU/Linux
psz@DellE520:~$ free -l
             total       used       free     shared    buffers     cached
Mem:       3087972     692256    2395716          0      18276     427116
Low:        861464      71372     790092
High:      2226508     620884    1605624
-/+ buffers/cache:     246864    2841108
Swap:     20000920     258364   19742556
Then it handled the "sleep test"
bash -c 'n=0; while [ $n -lt 33000 ]; do sleep 600 & ((n=n+1)); ((m=n%500)); if [ $m -lt 1 ]; then echo -n "$n - "; date; free -l; sleep 1; fi; done'
just fine: it was stopped only by the "max user processes" limit
(default setting of "ulimit -u 23964") or, with that limit raised,
when the machine ran out of PID space; there was no OOM.
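The two limits mentioned above can be inspected as follows. A small sketch for bash on Linux: the function name show_process_limits is illustrative, and /proc/sys/kernel/pid_max is the usual location of the PID ceiling (commonly 32768 by default on kernels of that time, which is below the 33000 sleeps the test asks for).

```shell
# Print the per-user process limit and the system-wide PID ceiling.
show_process_limits() {
    printf 'max user processes: %s\n' "$(ulimit -u)"
    printf 'pid_max: %s\n' \
        "$(cat /proc/sys/kernel/pid_max 2>/dev/null || echo unknown)"
}
```

Raising the first limit for the current shell would be "ulimit -u <number>", subject to the hard limit.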
After installing and booting the PAE kernel, it showed:
psz@DellE520:~$ uname -a
Linux DellE520 3.2.0-35-generic-pae #55-Ubuntu SMP Wed Dec 5 18:04:39 UTC 2012 i686 i686 i386 GNU/Linux
psz@DellE520:~$ free -l
             total       used       free     shared    buffers     cached
Mem:       3087620     681188    2406432          0     167332     352296
Low:        865208     214080     651128
High:      2222412     467108    1755304
-/+ buffers/cache:     161560    2926060
Swap:     20000920          0   20000920
and re-trying the "sleep test", it ran into OOM after 18000 or so sleeps
and crashed/froze so I had to press the POWER button to recover.
Cheers, Paul
Paul Szabo psz@xxxxxxxxxxxxxxxxx http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics University of Sydney Australia
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
AplayDevices:
**** List of PLAYBACK Hardware Devices ****
card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
Subdevices: 1/1
Subdevice #0: subdevice #0
ApportVersion: 2.0.1-0ubuntu15.1
Architecture: i386
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/controlC0: psz 2190 F.... pulseaudio
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
Card0.Amixer.info:
Card hw:0 'Intel'/'HDA Intel at 0xdfddc000 irq 45'
Mixer name : 'SigmaTel STAC9227'
Components : 'HDA:83847618,102801dd,00100201'
Controls : 38
Simple ctrls : 21
CurrentDmesg: [ 28.160013] eth0: no IPv6 routers present
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=9d2bf7ac-9b0a-4082-ac45-f4d3c8e32c23
IwConfig:
lo no wireless extensions.
eth0 no wireless extensions.
MachineType: Dell Inc. Dell DM061
MarkForUpload: True
Package: linux (not installed)
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
LANG=C
SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: root=/dev/mapper/isw_cheedcedhh_DadMirroredTB4 ro quiet splash
ProcVersionSignature: Ubuntu 3.2.0-35.55-generic-pae 3.2.34
RelatedPackageVersions:
linux-restricted-modules-3.2.0-35-generic-pae N/A
linux-backports-modules-3.2.0-35-generic-pae N/A
linux-firmware 1.79.1
RfKill:
Tags: precise
Uname: Linux 3.2.0-35-generic-pae i686
UpgradeStatus: Upgraded to precise on 2012-04-27 (260 days ago)
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
WifiSyslog: Jan 13 06:42:46 DellE520 NetworkManager[1384]: <info> Unmanaged Device found; state CONNECTED forced. (see http://bugs.launchpad.net/bugs/191889)
WpaSupplicantLog:
dmi.bios.date: 03/23/2007
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.2.1
dmi.board.name: 0WG864
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 6
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.2.1:bd03/23/2007:svnDellInc.:pnDellDM061:pvr:rvnDellInc.:rn0WG864:rvr:cvnDellInc.:ct6:cvr:
dmi.product.name: Dell DM061
dmi.sys.vendor: Dell Inc.