kernel-packages team mailing list archive
Message #63137
[Bug 1323165] Re: [HP ProLiant DL380p Gen8] kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!
We have some more information which may help.
Once the problem occurs, the system enters an unstable state. Existing
console and SSH sessions keep working as long as we do not try to access
the java process, but new SSH sessions do not start. Logging in from the
console displays the server information, but the command prompt never
appears afterwards.
The task that we execute on the server has a configurable thread count,
which defaults to 16. With 16 threads, the bug appears consistently
after 24-48 hours of processing. We ran the same task with 8 threads
and it has been running without any problem for days.
We are seeing this bug on multiple servers with exactly the same
hardware and software. We downgraded one of them to Ubuntu 12.04.0, and
that server has been working fine even with 16 threads. We have now
upgraded the other servers to the new kernel released yesterday
(3.13.0-27) and will report back on whether the issue is fixed there.
If not, we will try the latest mainline kernel.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1323165
Title:
[HP ProLiant DL380p Gen8] kernel BUG at
/build/buildd/linux-3.13.0/mm/memory.c:3756!
Status in “linux” package in Ubuntu:
Incomplete
Bug description:
The machine becomes unresponsive: SSH logins fail, the load average is high, and attempts to access the running java process do not work. The syslog shows:
May 26 06:19:38 server06 kernel: [75831.929529] ------------[ cut here ]------------
May 26 06:19:38 server06 kernel: [75831.930191] kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!
May 26 06:19:38 server06 kernel: [75831.931129] invalid opcode: 0000 [#1] SMP
May 26 06:19:38 server06 kernel: [75831.931729] Modules linked in: xt_multiport ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables gpio_ich nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw sb_edac edac_core lpc_ich hpwdt hpilo ioatdma lp dca ipmi_si parport acpi_power_meter mac_hid tg3 ptp psmouse hpsa pps_core
May 26 06:19:38 server06 kernel: [75831.941585] CPU: 4 PID: 2930 Comm: java Not tainted 3.13.0-24-generic #47-Ubuntu
May 26 06:19:38 server06 kernel: [75831.942633] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
May 26 06:19:38 server06 kernel: [75831.943583] task: ffff881fe8372fe0 ti: ffff881fe632a000 task.ti: ffff881fe632a000
May 26 06:19:38 server06 kernel: [75831.944654] RIP: 0010:[<ffffffff81179051>] [<ffffffff81179051>] handle_mm_fault+0xe61/0xf10
May 26 06:19:38 server06 kernel: [75831.946137] RSP: 0000:ffff881fe632bd98 EFLAGS: 00010246
May 26 06:19:38 server06 kernel: [75831.946885] RAX: 0000000000000100 RBX: 00007fc37320a370 RCX: ffff881fe632bb18
May 26 06:19:38 server06 kernel: [75831.947902] RDX: ffff881fe8372fe0 RSI: 0000000000000000 RDI: 8000000100c009e6
May 26 06:19:38 server06 kernel: [75831.948932] RBP: ffff881fe632be20 R08: 0000000000000000 R09: 00000000000000a9
May 26 06:19:38 server06 kernel: [75831.949952] R10: 0000000000000001 R11: 0000000000000000 R12: ffff881fd83a7cc8
May 26 06:19:38 server06 kernel: [75831.950961] R13: ffff880fe6787d40 R14: ffff880fe5d95780 R15: 0000000000000080
May 26 06:19:38 server06 kernel: [75831.951985] FS: 00007fc938145700(0000) GS:ffff880fffa80000(0000) knlGS:0000000000000000
May 26 06:19:38 server06 kernel: [75831.976736] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 26 06:19:38 server06 kernel: [75832.005183] CR2: 00007fc373620930 CR3: 0000000fe63fe000 CR4: 00000000000407e0
May 26 06:19:38 server06 kernel: [75832.033473] Stack:
May 26 06:19:38 server06 kernel: [75832.060551] 0000000000000001 ffff881fe632bdb0 ffffffff8109a780 ffff881fe632bdd0
May 26 06:19:38 server06 kernel: [75832.117385] ffffffff810d7ad6 0000000000000001 ffffffff81f1ea20 ffff881fe632be78
May 26 06:19:38 server06 kernel: [75832.173599] ffffffff810d983d ffff881fe632be48 ffff8800000000a9 00000001ffffffff
May 26 06:19:38 server06 kernel: [75832.231813] Call Trace:
May 26 06:19:38 server06 kernel: [75832.258781] [<ffffffff8109a780>] ? wake_up_state+0x10/0x20
May 26 06:19:38 server06 kernel: [75832.286702] [<ffffffff810d7ad6>] ? wake_futex+0x66/0x90
May 26 06:19:38 server06 kernel: [75832.311849] [<ffffffff810d983d>] ? futex_wake_op+0x4ed/0x620
May 26 06:19:38 server06 kernel: [75832.337329] [<ffffffff81721a24>] __do_page_fault+0x184/0x560
May 26 06:19:38 server06 kernel: [75832.363061] [<ffffffff811112fc>] ? acct_account_cputime+0x1c/0x20
May 26 06:19:38 server06 kernel: [75832.387739] [<ffffffff8109d76b>] ? account_user_time+0x8b/0xa0
May 26 06:19:38 server06 kernel: [75832.411608] [<ffffffff8109dd84>] ? vtime_account_user+0x54/0x60
May 26 06:19:38 server06 kernel: [75832.436126] [<ffffffff81721e1a>] do_page_fault+0x1a/0x70
May 26 06:19:38 server06 kernel: [75832.458239] [<ffffffff8171e288>] page_fault+0x28/0x30
May 26 06:19:38 server06 kernel: [75832.481780] Code: ff 48 89 d9 4c 89 e2 4c 89 ee 4c 89 f7 44 89 4d c8 e8 34 c1 ff ff 85 c0 0f 85 94 f5 ff ff 49 8b 3c 24 44 8b 4d c8 e9 68 f3 ff ff <0f> 0b be 8e 00 00 00 48 c7 c7 18 25 a6 81 44 89 4d c8 e8 18 e7
May 26 06:19:38 server06 kernel: [75832.551672] RIP [<ffffffff81179051>] handle_mm_fault+0xe61/0xf10
May 26 06:19:38 server06 kernel: [75832.574254] RSP <ffff881fe632bd98>
May 26 06:19:38 server06 kernel: [75832.630392] ---[ end trace e41b58adf8e0d72b ]---
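When the same trace has to be looked for on several servers, the key fields can be pulled out of syslog mechanically. The following is a minimal sketch, not part of the original report: it assumes syslog lines in exactly the format quoted above, and the function name `summarize_oops` and the regular expressions are illustrative choices rather than any existing tool.

```python
import re

# Syslog prefix as it appears in the trace above, e.g.
# "May 26 06:19:38 server06 kernel: [75831.930191] <message>"
PREFIX_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+(?P<host>\S+)\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s*(?P<msg>.*)$"
)
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
COMM_RE = re.compile(r"PID:\s+(?P<pid>\d+)\s+Comm:\s+(?P<comm>\S+)")
# Matches only the closing "RIP [<addr>] symbol" line, not "RIP: 0010:[<...".
RIP_RE = re.compile(r"^RIP\s+\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<symbol>\S+)")


def summarize_oops(lines):
    """Return a dict describing the first kernel BUG in `lines`, or None."""
    summary = None
    for raw in lines:
        prefix = PREFIX_RE.match(raw.strip())
        if prefix is None:
            continue  # not a kernel syslog line (hypothetical format mismatch)
        msg = prefix.group("msg")

        bug = BUG_RE.search(msg)
        if bug and summary is None:
            summary = {
                "host": prefix.group("host"),
                "uptime": float(prefix.group("uptime")),
                "file": bug.group("file"),
                "line": int(bug.group("line")),
            }
            continue
        if summary is None:
            continue  # ignore everything before the BUG line

        comm = COMM_RE.search(msg)
        if comm:
            summary["pid"] = int(comm.group("pid"))
            summary["comm"] = comm.group("comm")

        rip = RIP_RE.match(msg)
        if rip:
            summary["symbol"] = rip.group("symbol")
            break  # the closing RIP line ends the useful part of the report
    return summary
```

The function accepts any iterable of lines, so an open file handle for the system log can be passed directly. It returns None when no "kernel BUG at" line is present.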
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-24-generic 3.13.0-24.47
ProcVersionSignature: Ubuntu 3.13.0-24.47-generic 3.13.9
Uname: Linux 3.13.0-24-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 May 26 09:30 seq
crw-rw---- 1 root audio 116, 33 May 26 09:30 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Mon May 26 10:27:04 2014
HibernationDevice: RESUME=UUID=a777a6ba-8cca-4435-8869-15bd3294ee35
InstallationDate: Installed on 2014-04-30 (25 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
MachineType: HP ProLiant DL380p Gen8
PciMultimedia:
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=UUID=cf3b3e5c-c6c7-4a56-b471-e1741ecbd865 ro
RelatedPackageVersions:
linux-restricted-modules-3.13.0-24-generic N/A
linux-backports-modules-3.13.0-24-generic N/A
linux-firmware 1.127.2
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/10/2014
dmi.bios.vendor: HP
dmi.bios.version: P70
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP70:bd02/10/2014:svnHP:pnProLiantDL380pGen8:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL380p Gen8
dmi.sys.vendor: HP