kernel-packages team mailing list archive
Message #157858
[Bug 1536904] Re: Kdump fails on Ubuntu 16.04 (PowerVM/PowerKVM/BareMetal)
Hello,
Sorry for the delay in replying.
This is definitely a regression introduced by the implementation of
the smaller initrd. I am currently working on a fix. Your second
problem might be caused by not using the smaller initrd, so I would
suggest waiting until you can test the fix for this.
I can have a test package available quickly if you are able to test
from a PPA (judging by a previous bug, I think you are), so let me
know and I'll tell you where to find the PPA.
** Changed in: makedumpfile (Ubuntu)
Status: Invalid => In Progress
** Changed in: makedumpfile (Ubuntu)
Importance: Undecided => High
** Changed in: makedumpfile (Ubuntu)
Assignee: Taco Screen team (taco-screen-team) => Louis Bouchard (louis-bouchard)
** Also affects: makedumpfile (Ubuntu Wily)
Importance: Undecided
Status: New
** Changed in: makedumpfile (Ubuntu Wily)
Status: New => Confirmed
** Changed in: makedumpfile (Ubuntu Wily)
Importance: Undecided => High
** Changed in: makedumpfile (Ubuntu Wily)
Assignee: (unassigned) => Louis Bouchard (louis-bouchard)
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1536904
Title:
Kdump fails on Ubuntu 16.04 (PowerVM/PowerKVM/BareMetal)
Status in makedumpfile package in Ubuntu:
In Progress
Status in makedumpfile source package in Wily:
Confirmed
Bug description:
== Comment: #0 - ==
---Problem Description---
Kdump fails on Ubuntu 16.04 with Austin adapter (tg3)
Contact Information = hathyaga@xxxxxxxxxx, iranna.ankad@xxxxxxxxxx, mputtash@xxxxxxxxxx
---uname output---
linux ltciofvtr-s822l1 4.3.0-5-generic #16-Ubuntu SMP Wed Dec 16 23:32:23 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
---Additional Hardware Info---
Machine details:
9.47.67.156 (root/ltcnetdd)
Machine Type = 8247-22L
---System Hang---
The system hangs after a crash is triggered. A reboot is needed to bring it back to a functional state.
---Debugger---
A debugger is not configured
---Steps to Reproduce---
Steps to follow:
1. apt-get install linux-crashdump
2. apt-get install kdump-tools
3. Edit /etc/default/kdump-tools and change USE_KDUMP=0 to USE_KDUMP=1
4. Change the size of the crash kernel in /boot/grub/grub.cfg to crashkernel=4096M-:4096M
5. Load the kdump config file: kdump-config load
6. echo 1 > /proc/sys/kernel/sysrq
7. echo c > /proc/sysrq-trigger
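Step 3 above can be scripted. The following is only a minimal sketch: enable_kdump is a hypothetical helper name, not part of kdump-tools, and it assumes the file contains an uncommented USE_KDUMP=0 line as described in the step.

```shell
#!/bin/sh
# Sketch of step 3: switch USE_KDUMP from 0 to 1 in a kdump-tools defaults file.
# enable_kdump is a hypothetical helper; pass the file to edit as $1.
enable_kdump() {
    conf="$1"
    # Rewrite only an uncommented USE_KDUMP=0 line; every other line is copied as-is.
    sed 's/^USE_KDUMP=0$/USE_KDUMP=1/' "$conf" > "$conf.new" || return 1
    mv "$conf.new" "$conf" || return 1
    # Succeed only if the setting is now present in the file.
    grep -q '^USE_KDUMP=1$' "$conf"
}
```

On a test machine this would be called as root with enable_kdump /etc/default/kdump-tools, followed by kdump-config load (step 5). Steps 6 and 7 deliberately crash the running kernel, so they are left out of the sketch.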
Things to cross-check:
After loading the kdump configuration, check its status:
root@ltciofvtr-s822l1:~# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
SSH: root@35.35.35.36
SSH_KEY: /root/.ssh/id_rsa
HOSTTAG: ip
current state: ready to kdump
kexec command:
/sbin/kexec -p --args-linux --command-line="root=UUID=e445a093-4593-4e91-bebb-6968483bf2ea ro quiet splash irqpoll maxcpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/boot/initrd.img-4.3.0-5-generic /boot/vmlinux-4.3.0-5-generic
root@ltciofvtr-s822l1:~# kdump-config status
* Broken symlink : /var/lib/kdump/vmlinuz: broken symbolic link to /boot/vmlinuz-4.3.0-5-generic
current state : ready to kdump
root@ltciofvtr-s822l1:~# cat /proc/cmdline
root=UUID=e445a093-4593-4e91-bebb-6968483bf2ea ro quiet splash crashkernel=4096M-:4096M
root@ltciofvtr-s822l1:~# dmesg| grep -i crash
[ 0.000000] Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 131072MB)
[ 0.000000] Kernel command line: root=UUID=e445a093-4593-4e91-bebb-6968483bf2ea ro quiet splash crashkernel=4096M-:4096M
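The crashkernel=4096M-:4096M value uses the kernel's range syntax (start-[end]:size, with start inclusive and end exclusive): for any amount of system RAM from 4096M upwards, reserve 4096M. The sketch below shows how such a value is evaluated, which is why 4096MB is reserved on this 131072MB machine. The function names to_mb and crashkernel_mb are hypothetical, only the M and G suffixes are handled, and an optional @offset is not supported.

```shell
#!/bin/sh
# to_mb SIZE: convert a size such as 4096M or 4G to megabytes.
# Assumption: only the M and G suffixes are handled in this sketch.
to_mb() {
    case "$1" in
        *M) echo "${1%M}" ;;
        *G) echo $(( ${1%G} * 1024 )) ;;
        *)  return 1 ;;
    esac
}

# crashkernel_mb SPEC RAM_MB: print the reservation in MB that SPEC selects
# for a machine with RAM_MB of memory, or 0 if no range matches.
crashkernel_mb() {
    spec="$1"
    ram="$2"
    # Split the comma-separated list of range:size entries.
    old_ifs="$IFS"
    IFS=','
    set -- $spec
    IFS="$old_ifs"
    for entry in "$@"; do
        range="${entry%%:*}"    # e.g. "4096M-"
        size="${entry#*:}"      # e.g. "4096M"
        start="${range%%-*}"    # lower bound, inclusive
        end="${range#*-}"       # upper bound, exclusive; empty means no limit
        start_mb=$(to_mb "$start") || return 1
        if [ "$ram" -lt "$start_mb" ]; then
            continue
        fi
        if [ -n "$end" ]; then
            end_mb=$(to_mb "$end") || return 1
            if [ "$ram" -ge "$end_mb" ]; then
                continue
            fi
        fi
        # The first matching range wins.
        to_mb "$size"
        return 0
    done
    echo 0
}
```

For the values in this report, crashkernel_mb '4096M-:4096M' 131072 prints 4096, which agrees with the "Reserving 4096MB of memory" line in dmesg.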
Observations:
1. The kdump-config status command reports a broken symbolic link, which suggests that kdump-config is unable to handle the symbolic link.
2. Trace observed on console:
root@ltciofvtr-s822l1:~# echo c | tee /proc/sysrq-trigger
c
[ 238.872102] sysrq: SysRq : Trigger a crash
[ 238.872179] Unable to handle kernel paging request for data at address 0x00000000
[ 238.872256] Faulting instruction address: 0xc000000000646534
[ 238.872322] Oops: Kernel access of bad area, sig: 11 [#1]
[ 238.872373] SMP NR_CPUS=2048 NUMA PowerNV
[ 238.872427] Modules linked in: dm_round_robin dm_service_time ipmi_powernv ipmi_msghandler leds_powernv uio_pdrv_genirq powernv_rng uio dm_multipath sunrpc bonding autofs4 btrfs xor raid6_pq mlx4_en ses enclosure bnx2x mlx4_core lpfc qla2xxx mdio libcrc32c be2net e1000e vxlan ipr ip6_udp_tunnel udp_tunnel scsi_transport_fc
[ 238.872895] CPU: 121 PID: 3861 Comm: tee Not tainted 4.3.0-5-generic #16-Ubuntu
[ 238.872973] task: c000000fe01ce860 ti: c000000fe022c000 task.ti: c000000fe022c000
[ 238.873049] NIP: c000000000646534 LR: c0000000006475f8 CTR: c000000000646500
[ 238.873125] REGS: c000000fe022f990 TRAP: 0300 Not tainted (4.3.0-5-generic)
[ 238.873200] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28004222 XER: 20000000
[ 238.873392] CFAR: c000000000008468 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c0000000006475f8 c000000fe022fc10 c00000000155e400 0000000000000063
GPR04: c0000007fc648450 c0000007fc659cf0 c000001fff830000 0000000000000792
GPR08: 0000000000000007 0000000000000001 0000000000000000 c000001fff861780
GPR12: c000000000646500 c000000007b87d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000010009d88 0000000000000001
GPR24: 0000000010009d88 00003fffe7b210b0 c0000000014a5cb0 0000000000000004
GPR28: c0000000014a6070 0000000000000063 c000000001460de4 0000000000000000
[ 238.875062] NIP [c000000000646534] sysrq_handle_crash+0x34/0x50
[ 238.875178] LR [c0000000006475f8] __handle_sysrq+0xe8/0x280
[ 238.875270] Call Trace:
[ 238.875322] [c000000fe022fc10] [c000000000dc92a0] _fw_tigon_tg3_bin_name+0x2c5d0/0x33708 (unreliable)
[ 238.875516] [c000000fe022fc30] [c0000000006475f8] __handle_sysrq+0xe8/0x280
[ 238.875658] [c000000fe022fcd0] [c000000000647da8] write_sysrq_trigger+0x78/0xa0
[ 238.875820] [c000000fe022fd00] [c00000000036bf50] proc_reg_write+0xb0/0x110
[ 238.875963] [c000000fe022fd50] [c0000000002d45bc] __vfs_write+0x6c/0xe0
[ 238.876104] [c000000fe022fd90] [c0000000002d52f0] vfs_write+0xc0/0x230
[ 238.876246] [c000000fe022fde0] [c0000000002d632c] SyS_write+0x6c/0x110
[ 238.876389] [c000000fe022fe30] [c000000000009204] system_call+0x38/0xb4
[ 238.876525] Instruction dump:
[ 238.876601] 38427f00 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d22001a 3949aae4
[ 238.876843] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 238.877091] ---[ end trace 2028716a4fb3f0e5 ]---
[ 238.880521]
[ 238.880590] Sending IPI to other CPUs
[ 238.881716] IPI complete
The system hangs at this point.
3. No crash dump generated after a reboot.
4. The kdump hang is also observed on KVM and PowerVM, as well as on OpenPOWER.
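The broken link from observation 1 can be checked without kdump-config. This is a minimal sketch in which broken_symlink is a hypothetical helper name; it only tests whether a path is a symbolic link whose target is missing and makes no claim about why the link is broken.

```shell
#!/bin/sh
# broken_symlink PATH: succeed if PATH is a symbolic link whose target does not exist.
broken_symlink() {
    # -L is true for any symbolic link; -e follows the link and fails if the target is missing.
    [ -L "$1" ] && [ ! -e "$1" ]
}
```

On the affected machine, broken_symlink /var/lib/kdump/vmlinuz && readlink /var/lib/kdump/vmlinuz would print the dangling target. Note that the kexec command shown by kdump-config show loads /boot/vmlinux-4.3.0-5-generic, whereas the link points to /boot/vmlinuz-4.3.0-5-generic.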
Stack trace output:
Identical to the call trace and instruction dump shown under observation 2 above.
Oops output:
no
System Dump Location:
No dump generated
*Additional Instructions for hathyaga@xxxxxxxxxx, iranna.ankad@xxxxxxxxxx, mputtash@xxxxxxxxxx:
-Post a private note with access information to the machine that the bug is occurring on.
-Attach sysctl -a output to the bug.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1536904/+subscriptions