kernel-packages team mailing list archive

[Bug 1350443] Re: Not able to load the kdump kernel and generate the dump in Ubuntu14.10 on Non virtualised system

To: kernel-packages@xxxxxxxxxxxxxxxxxxx
From: Chris J Arges <1350443@xxxxxxxxxxxxxxxxxx>
Date: Wed, 01 Oct 2014 16:47:36 -0000
Reply-to: Bug 1350443 <1350443@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
*** This bug is a duplicate of bug 1364427 ***
    https://bugs.launchpad.net/bugs/1364427

** This bug has been marked a duplicate of bug 1364427
   kexeced kernel hung

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1350443

Title:
  Not able to load the kdump kernel and generate the dump in Ubuntu14.10
  on Non virtualised system

Status in “makedumpfile” package in Ubuntu:
  In Progress

Bug description:
  ---Problem Description---
  Not able to load the kdump kernel and save the vmcore in /var/crash/
    
  ---uname output---
  Linux lep8d 3.16.0-5-generic #10-Ubuntu SMP Mon Jul 21 16:17:25 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = P8 
    
  ---Steps to Reproduce---
  Install Ubuntu 14.10 on a P8 machine in a non-virtualised environment,
  then install the kexec-tools and kdump-tools packages.

  Then try to start the kdump service and load the kdump kernel:

  root@lep8d:~# /etc/init.d/kdump-tools start
  Starting kdump-tools: Cannot open `/boot/vmlinuz-3.16.0-5-generic': No such file or directory
   * failed to load kdump kernel
  root@lep8d:~# echo $?
  0

  root@lep8d:~# kdump-config load
  Cannot open `/boot/vmlinuz-3.16.0-5-generic': No such file or directory
   * failed to load kdump kernel

  root@lep8d:~# ls -l /boot/vmlinux-3.16.0-5-generic
  -rw------- 1 root root 20712936 Jul 21 22:02 /boot/vmlinux-3.16.0-5-generic
  root@lep8d:~#

  root@lep8d:~# dpkg -l | grep kexec
  ii  kexec-tools                                            1:2.0.6-0ubuntu2                           ppc64el      tools to support fast kexec reboots
  ii  pxe-kexec                                              0.2.4-3                                    ppc64el      Fetch PXE configuration file and netboot using kexec
  root@lep8d:~# dpkg -l | grep kdump
  ii  kdump-tools                                            1.5.6-2                                    all          scripts and tools for automating kdump (Linux crash dumps)

  root@lep8d:~# cat /sys/kernel/kexec_crash_loaded
  0

  root@lep8d:~# kdump-config show
  USE_KDUMP:        1
  KDUMP_SYSCTL:     kernel.panic_on_oops=1
  KDUMP_COREDIR:    /var/crash
  crashkernel addr:
  current state:    Not ready to kdump

  kexec command:
    no kexec command recorded
  root@lep8d:~# kdump-config status
  current state   : Not ready to kdump
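
  Because the init script above exits with status 0 even though the load
  failed, a check based on /sys/kernel/kexec_crash_loaded may be more
  reliable than the exit code. The following is a minimal sketch only;
  the function name kdump_loaded and its optional path argument are
  hypothetical and not part of kdump-tools.

```shell
#!/bin/sh
# Sketch: report whether a crash kernel is loaded by reading the sysfs flag
# instead of trusting the init script's exit status.
# kdump_loaded and its optional path argument are hypothetical names.
kdump_loaded() {
    flag_file="${1:-/sys/kernel/kexec_crash_loaded}"
    # The flag file holds 1 when a crash kernel is loaded and 0 otherwise.
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Example use (commented out so that sourcing this file has no side effects):
#   kdump_loaded || echo "kdump kernel is not loaded" >&2
```

  On the system in this report the function would return non-zero, since
  the flag file contains 0.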

  Tried to manually trigger a crash as below:

  root@lep8d:~# sysctl -w kernel.sysrq=1
  kernel.sysrq = 1
  root@lep8d:~# cat /proc/sys/kernel/sysrq
  1
  root@lep8d:~# echo c > /proc/sysrq-trigger

  [ 4252.703681] SysRq : Trigger a crash
  [ 4252.703773] Unable to handle kernel paging request for data at address 0x00000000
  [ 4252.703779] Faulting instruction address: 0xc0000000005b88f4
  [ 4252.703807] Oops: Kernel access of bad area, sig: 11 [#1]
  [ 4252.703852] SMP NR_CPUS=2048 NUMA PowerNV
  [ 4252.703899] Modules linked in: dm_multipath scsi_dh shpchp powernv_rng uio_pdrv_genirq uio rtc_generic binfmt_misc parport_pc ppdev lp parport ses enclosure lpfc scsi_transport_fc ipr scsi_tgt
  [ 4252.704162] CPU: 76 PID: 4635 Comm: bash Not tainted 3.16.0-5-generic #10-Ubuntu
  [ 4252.704230] task: c000001fdf7cbeb0 ti: c000001fdf8e4000 task.ti: c000001fdf8e4000
  [ 4252.704298] NIP: c0000000005b88f4 LR: c0000000005b997c CTR: c0000000005b88c0
  [ 4252.704365] REGS: c000001fdf8e79d0 TRAP: 0300   Not tainted  (3.16.0-5-generic)
  [ 4252.704432] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28422824  XER: 20000000
  [ 4252.704602] CFAR: c000000000009358 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
  GPR00: c0000000005b997c c000001fdf8e7c50 c000000001346498 0000000000000063
  GPR04: c000000014305db0 c000000014316618 0000000000018010 c0000000014ff2d8
  GPR08: c000000000dd6498 0000000000000001 0000000000000000 0000000000018010
  GPR12: c0000000005b88c0 c000000007e50a00 0000010016f94818 000000001016e008
  GPR16: 000000001013ad70 0000010016f9c958 000000001016fed0 000000001016e008
  GPR20: 00000000100c31e0 0000000000000000 0000000010171fc8 000000001016f840
  GPR24: 000000001014d9b0 000000001014d0b0 0000000000000004 0000000000000000
  GPR28: c00000000127eee8 0000000000000063 c00000000125d6a0 c00000000127f2a8
  [ 4252.705502] NIP [c0000000005b88f4] sysrq_handle_crash+0x34/0x50
  [ 4252.705558] LR [c0000000005b997c] __handle_sysrq+0xec/0x270
  [ 4252.705604] Call Trace:
  [ 4252.705631] [c000001fdf8e7c50] [c00000000018e3f0] __acct_update_integrals+0x80/0x170 (unreliable)
  [ 4252.705722] [c000001fdf8e7c70] [c0000000005b997c] __handle_sysrq+0xec/0x270
  [ 4252.705790] [c000001fdf8e7d10] [c0000000005ba138] write_sysrq_trigger+0x78/0xa0
  [ 4252.705871] [c000001fdf8e7d40] [c0000000003141f0] proc_reg_write+0xb0/0x110
  [ 4252.705940] [c000001fdf8e7d90] [c00000000028c07c] vfs_write+0xdc/0x260
  [ 4252.706007] [c000001fdf8e7de0] [c00000000028ce1c] SyS_write+0x6c/0x110
  [ 4252.706076] [c000001fdf8e7e30] [c00000000000a0fc] syscall_exit+0x0/0x7c
  [ 4252.706143] Instruction dump:
  [ 4252.706177] 3842dbd8 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d22001b 3949a954
  [ 4252.706290] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
  [ 4252.706406] ---[ end trace 4473f520a462be3e ]---
  [ 4252.706452]
  [ 4254.712133] Kernel panic - not syncing: Fatal exception
  [ 4254.712297] Rebooting in 1 seconds..

  root@lep8d:~# dmesg | grep -i crash
  [    0.000000] Reserving 128MB of memory at 128MB for crashkernel (System RAM: 196608MB)
  [    0.000000] Kernel command line: root=UUID=6c354e14-fa77-4e99-899b-dd21a3627c62 ro splash quiet crashkernel=384M-:128M

  After the crash was triggered, the machine rebooted and the following
  was shown on the IPMI console:

   Petitboot Option Editor
   ------------------------------------------------------------------------------

   Device:         (*) sdb2 [6c354e14-fa77-4e99-899b-dd21a3627c62]
                   ( ) Specify paths/URLs manually

   Kernel:         /boot/vmlinux-3.16.0-5-generic
   Initrd:         /boot/initrd.img-3.16.0-5-generic
   Device tree:
   Boot arguments: 4e99-899b-dd21a3627c62 ro splash quiet crashkernel=384M-:128M

           [  OK  ]  [ Help ]  [Cancel]

  The crashkernel parameter is set in the boot arguments, as can be seen
  in the menu.

  But once the machine boots into the Ubuntu 14.10 kernel, no vmcore is
  generated in the /var/crash folder.

  It looks like kdump-config takes /boot/vmlinuz-`uname -r` as the kdump kernel by default,
  whereas this system only has /boot/vmlinux-3.16.0-5-generic.
  One way to deal with this is to set the KDUMP_KERNEL and KDUMP_INITRD variables
  in the /etc/default/kdump-tools configuration file.
  Alternatively, the kdump configuration file (/etc/default/kdump-tools) could
  gain an option for the type of kernel image (vmlinux/vmlinuz), with the
  kdump-config script loading the appropriate kernel image based on it.
  Once this is sorted out, and before testing kdump/kexec, make sure the patch at
  http://lists.infradead.org/pipermail/kexec/2014-July/012247.html is part of kexec-tools;
  otherwise kdump is bound to fail, as mentioned in LP bug 1349994.
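
  The image selection described above could be sketched roughly as
  follows. This is an illustration under stated assumptions, not the
  actual kdump-config logic: pick_kdump_kernel is a hypothetical helper,
  and the preference order (vmlinuz first, then vmlinux) is an assumption.

```shell
#!/bin/sh
# Sketch: choose the kernel image to load for kdump.
# pick_kdump_kernel is a hypothetical helper, not part of kdump-tools.
# Arguments: kernel release (e.g. from `uname -r`), optional boot directory.
pick_kdump_kernel() {
    release="$1"
    boot_dir="${2:-/boot}"
    # Try the compressed image name first, then fall back to the
    # uncompressed vmlinux name found on the system in this report.
    for name in vmlinuz vmlinux; do
        candidate="$boot_dir/$name-$release"
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Neither image exists for this release.
    return 1
}

# Example use (commented out so that sourcing this file has no side effects):
#   KDUMP_KERNEL=$(pick_kdump_kernel "$(uname -r)") || echo "no kernel image" >&2
```

  With the files shown in this report, the result would be
  /boot/vmlinux-3.16.0-5-generic, which could then be assigned to
  KDUMP_KERNEL in /etc/default/kdump-tools, with KDUMP_INITRD pointing
  at /boot/initrd.img-3.16.0-5-generic.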

  Regarding the kdump-config issue mentioned by Hari, the patch from this Launchpad bug should be applied:
  https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1324544

  Ubuntu needs to be asked to incorporate the patches provided in this bugzilla into Ubuntu 14.10.
  We are currently testing Ubuntu 14.10, and the bug occurs in this release.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1350443/+subscriptions