kernel-packages team mailing list archive

[Bug 1572712] Re: console hung during Ubuntu 16.04 netboot install 4.4.0-12-generic [soft lockup]

 

I have not done anything proactive to fix this bug. The Xenial kernel is
now at 4.4.0-25.44, so retesting with it would likely be worthwhile.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1572712

Title:
  console hung during Ubuntu 16.04 netboot install 4.4.0-12-generic
  [soft lockup]

Status in linux package in Ubuntu:
  New

Bug description:
  Canonical,

  FYI and reference only at this time.

  Details:
  ---Problem Description---
  The IPMI console hangs for 15 to 30 minutes and cannot be navigated when trying to install
  the latest copy of Ubuntu 16.04 on Habanero from http://ports.ubuntu.com/ubuntu-ports/dists/xenial/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/vmlinux , which is 4.4.0-12-generic.
   [   36.803240] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [swapper/2:0]
   
  ---uname output---
  4.4.0-12-generic 
   
  ---Additional Hardware Info---
  TN71-BP012  

   
  Machine Type = TN71-BP012  
   
  ---System Hang---
   We have to wait 15 to 30 minutes to regain console access.

   
  ---Steps to Reproduce---
  Use the IPMI console to reach petitboot on Habanero, configure an IP address on a network device, and make sure outbound internet access is BSO-authenticated.
  In the petitboot shell, run the following commands:
  cd /tmp;
  wget http://ports.ubuntu.com/ubuntu-ports/dists/xenial/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/vmlinux ;
  wget http://ports.ubuntu.com/ubuntu-ports/dists/xenial/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/initrd.gz ;
  kexec -l /tmp/vmlinux  -i /tmp/initrd.gz  ;
  kexec -e ;

   
  Stack trace output:
   [   36.803240] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [swapper/2:0]
  [   36.803425] Modules linked in: hid_generic(E) ast(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ttm
  ) usb_storage(E)
  [   36.803436] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G            E   4.4.0-12-generic #28-Ubuntu
  [   36.803438] task: c000003c8fe8e7b0 ti: c000003c97530000 task.ti: c000003c97530000
  [   36.803440] NIP: c000000000010a24 LR: c000000000010a24 CTR: c00000000060c2f0
  [   36.803442] REGS: c000003c97533510 TRAP: 0901   Tainted: G            E    (4.4.0-12-generic)
  [   36.803443] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28002824  XER: 00000000
  [   36.803448] CFAR: c00000000060c330 SOFTE: 1 
  [   36.803448] GPR00: c0000000000db8b4 c000003c97533790 c0000000015a3b00 0000000000000900 
  [   36.803448] GPR04: c000003c8fef6e00 0000000000000001 0000000000000000 0000000003ffffff 
  [   36.803448] GPR08: 0000000000000000 c000003ffb064605 0000000000000001 0000000000000004 
  [   36.803448] GPR12: c00000000060c2f0 c00000000fb41300 
  [   36.803460] NIP [c000000000010a24] arch_local_irq_restore+0x74/0x90
  [   36.803462] LR [c000000000010a24] arch_local_irq_restore+0x74/0x90
  [   36.803463] Call Trace:
  [   36.803465] [c000003c97533790] [c00000000014b184] mod_timer+0x154/0x300 (unreliable)
  [   36.803468] [c000003c975337b0] [c0000000000db8b4] queue_work_on+0x74/0xf0
  [   36.803471] [c000003c975337f0] [c00000000060c334] cursor_timer_handler+0x44/0x80
  [   36.803474] [c000003c97533820] [c000000000149ebc] call_timer_fn+0x5c/0x1c0
  [   36.803476] [c000003c975338b0] [c00000000014a37c] run_timer_softirq+0x31c/0x3f0
  [   36.803480] [c000003c97533980] [c0000000000beaf8] __do_softirq+0x188/0x3e0
  [   36.803483] [c000003c97533a70] [c0000000000befc8] irq_exit+0xc8/0x100
  [   36.803487] [c000003c97533a90] [c00000000001f954] timer_interrupt+0xa4/0xe0
  [   36.803490] [c000003c97533ac0] [c000000000002714] decrementer_common+0x114/0x180
  [   36.803494] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
  [   36.803494]     LR = arch_local_irq_restore+0x74/0x90
  [   36.803497] [c000003c97533db0] [c000003ffc49d678] 0xc000003ffc49d678 (unreliable)
  [   36.803502] [c000003c97533dd0] [c000000000900708] cpuidle_enter_state+0x1a8/0x410
  [   36.803504] [c000003c97533e30] [c000000000119898] call_cpuidle+0x78/0xd0
  [   36.803506] [c000003c97533e70] [c000000000119c6c] cpu_startup_entry+0x37c/0x490
  [   36.803509] [c000003c97533f30] [c00000000004565c] start_secondary+0x33c/0x360
  [   36.803511] [c000003c97533f90] [c000000000008b6c] start_secondary_prolog+0x10/0x14
  [   36.803512] Instruction dump:
  [   36.803514] 994d02ca 2fa30000 409e0024 e92d0020 61298000 7d210164 38210020 e8010010 
  [   36.803517] 7c0803a6 4e800020 60420000 4bff17ad <60000000> 4bffffe4 60420000 e92d0020 
  [   57.811232] INFO: rcu_sched self-detected stall on CPU
  [   57.811236]  2-...: (1 GPs behind) idle=913/2/0 softirq=1849/1979 fqs=5229 
  [   57.811238]   (t=5250 jiffies g=-114 c=-115 q=1)
  [   57.811242] Task dump for CPU 2:
  [   57.811243] swapper/2       R  running task        0     0      1 0x00000804
  [   57.811246] Call Trace:
  [   57.811248] [c000003c97532fe0] [c0000000000fb8c0] sched_show_task+0xe0/0x180 (unreliable)
  [   57.811251] [c000003c97533050] [c00000000013e714] rcu_dump_cpu_stacks+0xe4/0x150
  [   57.811253] [c000003c975330a0] [c000000000143e24] rcu_check_callbacks+0x6b4/0x9b0
  [   57.811255] [c000003c975331d0] [c00000000014bc88] update_process_times+0x58/0xa0
  [   57.811258] [c000003c97533200] [c000000000162d38] tick_sched_handle.isra.6+0x48/0xe0
  [   57.811260] [c000003c97533240] [c000000000162e34] tick_sched_timer+0x64/0xd0
  [   57.811262] [c000003c97533280] [c00000000014c7b4] __hrtimer_run_queues+0x134/0x470
  [   57.811263] [c000003c97533310] [c00000000014d7ec] hrtimer_interrupt+0xec/0x2c0
  [   57.811266] [c000003c975333d0] [c00000000001f59c] __timer_interrupt+0x8c/0x290
  [   57.811268] [c000003c97533420] [c00000000001f950] timer_interrupt+0xa0/0xe0
  [   57.811269] [c000003c97533450] [c000000000002714] decrementer_common+0x114/0x180
  [   57.811272] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
  [   57.811272]     LR = arch_local_irq_restore+0x74/0x90
  [   57.811274] [c000003c97533740] [c000003c8af61ad8] 0xc000003c8af61ad8 (unreliable)
  [   57.811277] [c000003c97533760] [c000000000ad8cdc] _raw_spin_unlock_irqrestore+0x4c/0xb0

  == Comment: #3 - NAVEED A. UPPINANGADY SALIH - 2016-03-15 07:47:16 ==
  Please note: we did not see this hang with the 4.4.0-11-generic netboot, so this bug can be considered a regression.
  Also note that the soft lockup does not completely block installation: we can proceed with the installation, but we do see occasional console hangs and soft lockup messages on the console.
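To put a number on "occasional", the soft lockup reports in a saved console log could be counted along these lines. This is only a sketch and not part of the original report: the helper names `count_soft_lockups` and `lockup_cpus` are hypothetical, and it assumes the messages have the same "BUG: soft lockup - CPU#N" form as the trace in the bug description.

```shell
#!/bin/sh
# Hypothetical helpers for summarising soft lockup reports in a saved
# console log. The match strings assume the message format quoted in
# this bug ("NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!").

# Print the number of lines that report a soft lockup.
count_soft_lockups() {
    grep -c 'BUG: soft lockup' "$1"
}

# Print each CPU number that reported a soft lockup, one per line.
lockup_cpus() {
    sed -n 's/.*soft lockup - CPU#\([0-9][0-9]*\).*/\1/p' "$1" | sort -n -u
}
```

For example, after `dmesg > /tmp/console.log`, running `count_soft_lockups /tmp/console.log` and `lockup_cpus /tmp/console.log` shows how often the watchdog fired and on which CPUs.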

  == Comment: #19 - Jeremy Kerr - 2016-03-22 06:37:28 ==
  Seeing the same issue with the current installer build (20101020ubuntu440) - soft lockup at the language selection screen, with no input possible.

  It's intermittent, but is happening about two times out of three.

  == Comment: #59 - Murilo Fossa Vicentini - 2016-04-12 17:05:26 ==
  If I blacklist the ast driver (along with drm, ttm and so on) from being loaded by passing modprobe.blacklist=ast in the boot arguments, I don't hit this issue. I also wasn't able to reproduce this issue in a tuleta box, so it seems to be related to the graphics driver.

  Indira, can you double-check that this "workaround" solves the issue on your
  system as well?

  I am uploading the installation logs of a failure in the latest
  netboot image.
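The workaround described in this comment could be applied from the petitboot shell roughly as follows. This is a minimal sketch, not taken from the bug report: the `add_blacklist` helper is hypothetical, reusing the current /proc/cmdline as the base boot arguments is an assumption, and the usage lines rely on the `--initrd` and `--append` long options of kexec-tools rather than the `-i` form shown in the reproduction steps, so check them against the kexec build in your petitboot image.

```shell
#!/bin/sh
# Hypothetical helper: append modprobe.blacklist=<driver> to a kernel
# command line unless that exact parameter is already present.
add_blacklist() {
    cmdline=$1
    driver=$2
    case " $cmdline " in
        *" modprobe.blacklist=$driver "*)
            # Already blacklisted: return the command line unchanged.
            printf '%s\n' "$cmdline"
            ;;
        *)
            # Add the parameter, with a separating space only if needed.
            printf '%s\n' "${cmdline:+$cmdline }modprobe.blacklist=$driver"
            ;;
    esac
}

# Usage sketch (commented out because it boots into the new kernel):
# kexec -l /tmp/vmlinux --initrd=/tmp/initrd.gz \
#     --append="$(add_blacklist "$(cat /proc/cmdline)" ast)"
# kexec -e
```

Keeping the command-line handling in a small function makes it easy to check the resulting boot arguments before running kexec.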

  == Comment: #73 - INDIRA P. JOGA <indira.priya@xxxxxxxxxx> - 2016-04-14 06:00:43 ==
  Verified the test scenarios below with the workaround. Did not hit the "BUG: soft lockup" error or the hang. The fix is working fine.

  >  Installed ubuntu16.04 over Travis3EN, while ray system is in Habanero LC FW ( OP810.30 OP810 1603G)
  >  Updated to SL FW (TN71-BP012-810.1611.20160406a)  & reinstalled ray with ubuntu16.04  via Travis3EN.

  Regards,
  Indira

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1572712/+subscriptions