kernel-packages team mailing list archive
Message #159999
[Bug 1536904] Re: Kdump fails on Ubuntu 16.04 (PowerVM/PowerKVM/BareMetal)
This bug was fixed in the package makedumpfile - 1:1.5.9-4
---------------
makedumpfile (1:1.5.9-4) sid; urgency=medium
* Allow for symlinks to be created for vmlinux files : On Power8
architecture, systems are booting from a vmlinux file. The symlink
/var/lib/kdump/vmlinuz has to point to this file. (LP: #1536904)
* Add functionality to create symlinks for older kernels : If kdump
is installed on systems with more than one kernel package, the smaller
initrd.img file will only be created for the latest kernel. Adding the
'symlinks' functionality will allow for the creation of symlinks to
older kernels. If the smaller initrd.img file is missing in /var/lib/kdump
it will be created beforehand. This will be preempted if kdump is already
loaded. (LP: #1537714)
* Fix kdump-config manpage : add documentation for the propagate option.
(LP: #1538148)
* Improve manpage for kdump-config
-- Louis Bouchard <louis.bouchard@xxxxxxxxxx> Tue, 26 Jan 2016 15:30:48 +0100
** Changed in: makedumpfile (Ubuntu)
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1536904
Title:
Kdump fails on Ubuntu 16.04 (PowerVM/PowerKVM/BareMetal)
Status in makedumpfile package in Ubuntu:
Fix Released
Status in makedumpfile source package in Wily:
Confirmed
Bug description:
== Comment: #0 - ==
---Problem Description---
Kdump fails on Ubuntu 16.04 with Austin adapter(tg3)
Contact Information = hathyaga@xxxxxxxxxx, iranna.ankad@xxxxxxxxxx, mputtash@xxxxxxxxxx
---uname output---
linux ltciofvtr-s822l1 4.3.0-5-generic #16-Ubuntu SMP Wed Dec 16 23:32:23 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
---Additional Hardware Info---
Machine details:
9.47.67.156 (root/ltcnetdd)
Machine Type = 8247-22L
---System Hang---
The system hangs after a crash is triggered. A reboot is needed to bring it back up.
---Debugger---
A debugger is not configured
---Steps to Reproduce---
Steps to follow:
1. apt-get install linux-crashdump
2. apt-get install kdump-tools
3. Edit /etc/default/kdump-tools and change the following:
USE_KDUMP=0 to 1
4. Change the size of the crash kernel in /boot/grub/grub.cfg to crashkernel=4096M-:4096M
5. Load the kdump config file: kdump-config load
6. echo 1 > /proc/sys/kernel/sysrq
7. echo c > /proc/sysrq-trigger
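Step 3 can also be scripted. The following is a minimal sketch, assuming GNU sed and a stock USE_KDUMP=0 line in /etc/default/kdump-tools. The function name enable_kdump is made up for illustration and is not part of kdump-tools.

```shell
#!/bin/sh
# enable_kdump FILE: flip USE_KDUMP=0 to USE_KDUMP=1 in FILE (hypothetical helper).
# Assumes GNU sed (for -i) and that the setting sits on a line of its own.
enable_kdump() {
    conf="$1"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    sed -i 's/^USE_KDUMP=0$/USE_KDUMP=1/' "$conf"
    # Succeed only if the setting is now present.
    grep -q '^USE_KDUMP=1$' "$conf"
}

# Usage (as root), followed by steps 5 to 7 from the list above:
#   enable_kdump /etc/default/kdump-tools
#   kdump-config load
#   echo 1 > /proc/sys/kernel/sysrq
#   echo c > /proc/sysrq-trigger    # crashes the machine immediately
```

The helper only touches the one setting, so the rest of the file is left as it was.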
Things to cross-check:
After loading the kdump config file, check its status:
root@ltciofvtr-s822l1:~# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
SSH: root@35.35.35.36
SSH_KEY: /root/.ssh/id_rsa
HOSTTAG: ip
current state: ready to kdump
kexec command:
/sbin/kexec -p --args-linux --command-line="root=UUID=e445a093-4593-4e91-bebb-6968483bf2ea ro quiet splash irqpoll maxcpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/boot/initrd.img-4.3.0-5-generic /boot/vmlinux-4.3.0-5-generic
root@ltciofvtr-s822l1:~# kdump-config status
* Broken symlink : /var/lib/kdump/vmlinuz: broken symbolic link to /boot/vmlinuz-4.3.0-5-generic
current state : ready to kdump
root@ltciofvtr-s822l1:~# cat /proc/cmdline
root=UUID=e445a093-4593-4e91-bebb-6968483bf2ea ro quiet splash crashkernel=4096M-:4096M
root@ltciofvtr-s822l1:~# dmesg| grep -i crash
[ 0.000000] Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 131072MB)
[ 0.000000] Kernel command line: root=UUID=e445a093-4593-4e91-bebb-6968483bf2ea ro quiet splash crashkernel=4096M-:4096M
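The symlink and crashkernel checks above can be wrapped in small helpers. This is a sketch only: symlink_state and crashkernel_arg are hypothetical names, not kdump-config subcommands, and the paths in the usage lines are the ones shown in this report.

```shell
#!/bin/sh
# symlink_state PATH: print "ok", "broken" or "missing" for a symlink (hypothetical helper).
symlink_state() {
    if [ ! -L "$1" ]; then
        echo "missing"        # no symlink at this path
    elif [ -e "$1" ]; then
        echo "ok"             # -e follows the link, so the target exists
    else
        echo "broken"         # link exists but its target does not
    fi
}

# crashkernel_arg CMDLINE: print the value of crashkernel= from a kernel command line.
crashkernel_arg() {
    for word in $1; do
        case "$word" in
            crashkernel=*) echo "${word#crashkernel=}"; return 0 ;;
        esac
    done
    return 1                  # parameter not present
}

# Usage on the affected machine:
#   symlink_state /var/lib/kdump/vmlinuz
#   crashkernel_arg "$(cat /proc/cmdline)"
```

Given the kdump-config status output and /proc/cmdline shown above, the first call would be expected to report a broken link and the second to print 4096M-:4096M.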
Observations:
1. The kdump-config status command reports a broken symbolic link, suggesting that kdump-config is unable to handle the symbolic link.
2. Trace observed on console:
root@ltciofvtr-s822l1:~# echo c | tee /proc/sysrq-trigger
c
[ 238.872102] sysrq: SysRq : Trigger a crash
[ 238.872179] Unable to handle kernel paging request for data at address 0x00000000
[ 238.872256] Faulting instruction address: 0xc000000000646534
[ 238.872322] Oops: Kernel access of bad area, sig: 11 [#1]
[ 238.872373] SMP NR_CPUS=2048 NUMA PowerNV
[ 238.872427] Modules linked in: dm_round_robin dm_service_time ipmi_powernv ipmi_msghandler leds_powernv uio_pdrv_genirq powernv_rng uio dm_multipath sunrpc bonding autofs4 btrfs xor raid6_pq mlx4_en ses enclosure bnx2x mlx4_core lpfc qla2xxx mdio libcrc32c be2net e1000e vxlan ipr ip6_udp_tunnel udp_tunnel scsi_transport_fc
[ 238.872895] CPU: 121 PID: 3861 Comm: tee Not tainted 4.3.0-5-generic #16-Ubuntu
[ 238.872973] task: c000000fe01ce860 ti: c000000fe022c000 task.ti: c000000fe022c000
[ 238.873049] NIP: c000000000646534 LR: c0000000006475f8 CTR: c000000000646500
[ 238.873125] REGS: c000000fe022f990 TRAP: 0300 Not tainted (4.3.0-5-generic)
[ 238.873200] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28004222 XER: 20000000
[ 238.873392] CFAR: c000000000008468 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c0000000006475f8 c000000fe022fc10 c00000000155e400 0000000000000063
GPR04: c0000007fc648450 c0000007fc659cf0 c000001fff830000 0000000000000792
GPR08: 0000000000000007 0000000000000001 0000000000000000 c000001fff861780
GPR12: c000000000646500 c000000007b87d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000010009d88 0000000000000001
GPR24: 0000000010009d88 00003fffe7b210b0 c0000000014a5cb0 0000000000000004
GPR28: c0000000014a6070 0000000000000063 c000000001460de4 0000000000000000
[ 238.875062] NIP [c000000000646534] sysrq_handle_crash+0x34/0x50
[ 238.875178] LR [c0000000006475f8] __handle_sysrq+0xe8/0x280
[ 238.875270] Call Trace:
[ 238.875322] [c000000fe022fc10] [c000000000dc92a0] _fw_tigon_tg3_bin_name+0x2c5d0/0x33708 (unreliable)
[ 238.875516] [c000000fe022fc30] [c0000000006475f8] __handle_sysrq+0xe8/0x280
[ 238.875658] [c000000fe022fcd0] [c000000000647da8] write_sysrq_trigger+0x78/0xa0
[ 238.875820] [c000000fe022fd00] [c00000000036bf50] proc_reg_write+0xb0/0x110
[ 238.875963] [c000000fe022fd50] [c0000000002d45bc] __vfs_write+0x6c/0xe0
[ 238.876104] [c000000fe022fd90] [c0000000002d52f0] vfs_write+0xc0/0x230
[ 238.876246] [c000000fe022fde0] [c0000000002d632c] SyS_write+0x6c/0x110
[ 238.876389] [c000000fe022fe30] [c000000000009204] system_call+0x38/0xb4
[ 238.876525] Instruction dump:
[ 238.876601] 38427f00 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d22001a 3949aae4
[ 238.876843] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 238.877091] ---[ end trace 2028716a4fb3f0e5 ]---
[ 238.880521]
[ 238.880590] Sending IPI to other CPUs
[ 238.881716] IPI complete
The system hang is observed here.
3. No crash dump is generated after a reboot.
4. The kdump hang is also observed on KVM and PowerVM, as well as on OpenPOWER.
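Observation 1 lines up with the kexec command shown earlier, which loads /boot/vmlinux-4.3.0-5-generic while the symlink points to /boot/vmlinuz-4.3.0-5-generic. A rough sketch of choosing the symlink target by whichever image exists is given below. pick_kernel_image is a hypothetical helper, not the code shipped in kdump-tools or makedumpfile, and whether kdump-config keeps a manually changed link has not been verified here.

```shell
#!/bin/sh
# pick_kernel_image BOOTDIR VERSION: print the kernel image path to link to.
# Hypothetical helper: prefers vmlinuz-VERSION, falls back to vmlinux-VERSION
# (the name used on the ppc64le machine in this report).
pick_kernel_image() {
    for name in vmlinuz vmlinux; do
        if [ -e "$1/$name-$2" ]; then
            echo "$1/$name-$2"
            return 0
        fi
    done
    return 1    # neither image found
}

# Usage (as root), repointing the symlink named in the status output:
#   target=$(pick_kernel_image /boot "$(uname -r)") && ln -sf "$target" /var/lib/kdump/vmlinuz
```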
Oops output:
no
System Dump Location:
No dump generated
*Additional Instructions for hathyaga@xxxxxxxxxx, iranna.ankad@xxxxxxxxxx, mputtash@xxxxxxxxxx:
-Post a private note with access information to the machine that the bug is occurring on.
-Attach sysctl -a output to the bug.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1536904/+subscriptions