kernel-packages team mailing list archive
Message #53866
[Bug 1301496] Re: kernel crash: Unable to handle kernel paging request for data
fwiw, I've investigated the dpkg segfaults, and seen the following:
$ gdb dpkg
GNU gdb (Ubuntu 7.7-0ubuntu3) 7.7
[...]
Reading symbols from dpkg...Reading symbols from /usr/lib/debug//usr/bin/dpkg...done.
done.
(gdb) run -l
Starting program: /usr/bin/dpkg -l
Program received signal SIGSEGV, Segmentation fault.
filesdbinit () at ../../src/filesdb.c:571
571 ../../src/filesdb.c: No such file or directory.
(gdb) print bins
$1 = {0x0 <repeats 9441 times>, 0x10000, 0x0 <repeats 8191 times>, 0x10000,
0x0 <repeats 8191 times>, 0x10000, 0x0 <repeats 8191 times>, 0x10000,
0x0 <repeats 8191 times>, 0x10000, 0x0 <repeats 8191 times>, 0x10000,
0x0 <repeats 8191 times>, 0x10000, 0x0 <repeats 8191 times>, 0x10000,
0x0 <repeats 8191 times>, 0x10000, 0x0 <repeats 8191 times>, 0x10000,
0x0 <repeats 8191 times>, 0x10000, 0x0 <repeats 8191 times>, 0x10000,
0x0 <repeats 8191 times>, 0x10000, 0x0 <repeats 8191 times>, 0x10000,
0x0 <repeats 8191 times>, 0x10000, 0x0 <repeats 6942 times>}
(gdb)
On a healthy system, this looks like:
(gdb) break filesdbinit
Breakpoint 2 at 0x10003338: file ../../src/filesdb.c, line 565.
(gdb) print bins
$12 = {0x0 <repeats 131072 times>}
(gdb)
Note that bins is an array of pointers.
(gdb) print sizeof(bins[0])
$6 = 8
(gdb)
So once every 8192 elements, there's a corrupted entry in the array with a
single bit set (0x10000); 8192*8 is 64k of memory, i.e. the corruption
repeats at a 64 KiB stride.
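The stride arithmetic can be checked mechanically. The following is only a sketch: it rebuilds the run-length output of `print bins` shown above as a string (rather than reading it from gdb) and takes the element size of 8 from the `sizeof(bins[0])` output. The helper name `nonzero_entries` and the constants are made up for illustration and are not part of dpkg or gdb.

```python
import re

# Run-length output of `print bins` from the crashing session above,
# rebuilt here from its runs: 9441 zeros, then 0x10000 every 8192 elements.
GDB_DUMP = (
    "{0x0 <repeats 9441 times>, 0x10000, "
    + "0x0 <repeats 8191 times>, 0x10000, " * 14
    + "0x0 <repeats 6942 times>}"
)

POINTER_SIZE = 8  # from `print sizeof(bins[0])`

TOKEN = re.compile(r"(0x[0-9a-fA-F]+)(?: <repeats (\d+) times>)?")


def nonzero_entries(dump):
    """Return ([(index, value), ...] for non-zero elements, total element count)."""
    entries = []
    index = 0
    for token in dump.strip("{}").split(","):
        token = token.strip()
        if not token:
            continue
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        value = int(match.group(1), 16)
        count = int(match.group(2)) if match.group(2) else 1
        if value != 0:
            entries.extend((index + i, value) for i in range(count))
        index += count
    return entries, index


entries, total = nonzero_entries(GDB_DUMP)
indices = [i for i, _ in entries]
# Distance between consecutive corrupted elements, in elements and in bytes.
strides = {b - a for a, b in zip(indices, indices[1:])}
stride_bytes = {s * POINTER_SIZE for s in strides}

print("elements in bins:", total)
print("corrupted entries:", len(entries))
print("stride (elements):", strides)
print("stride (bytes):", stride_bytes)
```

Under these assumptions the corrupted entries are evenly spaced, which is what makes the 64k figure stand out.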
This could be a bug in any of the kernel, qemu, or the underlying host.
Note that after a reboot of wolfe, the VMs have been reported stable
again for the past 72 hours (!). So it's possible this points to a bug
with the host OS/kernel.
There is a second P7 system, postal, which has been exhibiting the same
kinds of problems as wolfe. Adam can speak to this in more detail, and
facilitate any necessary diagnostics on postal.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1301496
Title:
kernel crash: Unable to handle kernel paging request for data
Status in “linux” package in Ubuntu:
Confirmed
Bug description:
We've seen this happen twice now on ppc64el guests that are probably
under load. I don't have a lot of the details on what was going on
when they failed, but I have the stack traces.
[101168.836780] Unable to handle kernel paging request for data at address 0x00010001
[101168.836886] Faulting instruction address: 0xc000000000954b60
[101168.836934] Oops: Kernel access of bad area, sig: 11 [#1]
[101168.836971] SMP NR_CPUS=2048 NUMA pSeries
[101168.837020] Modules linked in: veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables dm_crypt
[101168.837234] CPU: 1 PID: 19760 Comm: kworker/u4:0 Not tainted 3.13.0-8-generic #28-Ubuntu
[101168.837294] Workqueue: netns .cleanup_net
[101168.837332] task: c0000003f99d43e0 ti: c0000001cce44000 task.ti: c0000001cce44000
[101168.837386] NIP: c000000000954b60 LR: c000000000954b68 CTR: c000000000954b00
[101168.837439] REGS: c0000001cce47760 TRAP: 0300 Not tainted (3.13.0-8-generic)
[101168.837493] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24002024 XER: 00000000
[101168.837620] CFAR: 000000001063ea4c DAR: 0000000000010001 DSISR: 40000000 SOFTE: 1
GPR00: c000000000954b68 c0000001cce479e0 c0000000010b0dd0 0000000000010001
GPR04: f0000000099918f0 c0000002be072380 c000000000954b68 c0000003fe023508
GPR08: 0000000000010000 c000000209fc0000 000000000000000e 0000000000000001
GPR12: 0000000044002028 c00000000fe80300 c0000000000c3f00 c0000002be1e8bc0
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000001 c000000000f630fc
GPR24: 0000000000000001 fffffffffffffef7 0000000000000000 c000000000f58638
GPR28: 0000000000000001 c0000003fbdc0000 0000000000002000 0000000000000000
[101168.838355] NIP [c000000000954b60] .tcp_net_metrics_exit+0x60/0x110
[101168.838402] LR [c000000000954b68] .tcp_net_metrics_exit+0x68/0x110
[101168.838448] Call Trace:
[101168.838469] [c0000001cce479e0] [c000000000954b68] .tcp_net_metrics_exit+0x68/0x110 (unreliable)
[101168.838542] [c0000001cce47a70] [c0000000008cc49c] .ops_exit_list.isra.2+0x6c/0xd0
[101168.838605] [c0000001cce47b00] [c0000000008ccef0] .cleanup_net+0x150/0x250
[101168.838662] [c0000001cce47bc0] [c0000000000b9e28] .process_one_work+0x1a8/0x4d0
[101168.838726] [c0000001cce47c60] [c0000000000baaf0] .worker_thread+0x180/0x4a0
[101168.838783] [c0000001cce47d30] [c0000000000c4010] .kthread+0x110/0x130
[101168.838841] [c0000001cce47e30] [c00000000000a160] .ret_from_kernel_thread+0x5c/0x7c
[101168.838903] Instruction dump:
[101168.838940] 7d295030 2f890000 e93d0288 419e0058 3bc00000 3b800001 60000000 60420000
[101168.839031] 7bc81f24 7c69402a 2fa30000 419e0024 <ebe30000> 4b8b809d 60000000 2fbf0000
[101168.839127] ---[ end trace fb028b2b5c006a6a ]---
---
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14-0ubuntu1
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 14.04
Lspci:
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinux-3.13.0-19-generic root=UUID=19eaa2f9-0f24-49b9-ba48-24879242481c ro console=hvc0 earlyprintk
ProcVersionSignature: Ubuntu 3.13.0-19.40-generic 3.13.6
RelatedPackageVersions:
linux-restricted-modules-3.13.0-19-generic N/A
linux-backports-modules-3.13.0-19-generic N/A
linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.13.0-19-generic ppc64le
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy netdev plugdev sudo video
WifiSyslog:
_MarkForUpload: True
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1301496/+subscriptions