kernel-packages team mailing list archive
Message #131665
[Bug 1487085] [NEW] Ubuntu 14.04.3 LTS Crash in notifier_call_chain after boot
You have been subscribed to a public bug:
---Problem Description---
Installed Ubuntu 14.04.3 LTS on a Palmetto system, and it crashes after booting to the login prompt.
This happens on every boot of Ubuntu 14.04.3 LTS. I have reinstalled Ubuntu, and have also replaced the hard disk and reinstalled again. It still crashes.
---uname output---
Linux paul40 3.19.0-26-generic #28~14.04.1-Ubuntu SMP Wed Aug 12 14:10:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = Palmetto
---System Hang---
The Ubuntu OS crashes and the host can no longer be accessed. The system must be rebooted.
---Steps to Reproduce---
Boot system
Oops output:
[ 33.132376] Unable to handle kernel paging request for data at address 0x200000000000000
[ 33.132565] Faulting instruction address: 0xc0000000000dbc60
[ 33.133422] Oops: Kernel access of bad area, sig: 11 [#1]
[ 33.134410] SMP NR_CPUS=2048 NUMA PowerNV
[ 33.134478] Modules linked in: ast ttm drm_kms_helper joydev mac_hid drm hid_generic usbhid hid syscopyarea sysfillrect sysimgblt i2c_algo_bit ofpart cmdlinepart at24 uio_pdrv_genirq powernv_flash mtd ipmi_powernv powernv_rng opal_prd ipmi_msghandler uio uas usb_storage ahci libahci
[ 33.139112] CPU: 24 PID: 0 Comm: swapper/24 Not tainted 3.19.0-26-generic #28~14.04.1-Ubuntu
[ 33.139943] task: c0000000013cccb0 ti: c000000fff700000 task.ti: c000000001448000
[ 33.141642] NIP: c0000000000dbc60 LR: c0000000000dbd94 CTR: 0000000000000000
[ 33.142605] REGS: c000000fff703980 TRAP: 0300 Not tainted (3.19.0-26-generic)
[ 33.143417] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28002888 XER: 00000000
[ 33.144244] CFAR: c000000000008468 DAR: 0200000000000000 DSISR: 40000000 SOFTE: 0
GPR00: c0000000000dbd94 c000000fff703c00 c00000000144cc00 c0000000015f03c0
GPR04: 0000000000000007 c0000000015f03b8 ffffffffffffffff 0000000000000000
GPR08: 0000000000000000 0200000000000000 c00000000006c394 9000000000001003
GPR12: 0000000000002200 c00000000fb8d800 0000000000000058 0000000000000000
GPR16: c000000001448000 c000000001448000 c000000001448080 c000000000e9a880
GPR20: c000000001448080 0000000000000001 0000000000000002 0000000000000012
GPR24: c000000f1e432200 0000000000000000 0000000000000000 c0000000015f03b8
GPR28: 0000000000000007 0000000000000000 c0000000015f03c0 ffffffffffffffff
[ 33.157013] NIP [c0000000000dbc60] notifier_call_chain+0x70/0x100
[ 33.157818] LR [c0000000000dbd94] atomic_notifier_call_chain+0x44/0x60
[ 33.162090] Call Trace:
[ 33.162845] [c000000fff703c00] [0000000000000008] 0x8 (unreliable)
[ 33.163644] [c000000fff703c50] [c0000000000dbd94] atomic_notifier_call_chain+0x44/0x60
[ 33.164647] [c000000fff703c90] [c00000000006f2a8] opal_message_notify+0xa8/0x100
[ 33.165476] [c000000fff703d00] [c0000000000dbc88] notifier_call_chain+0x98/0x100
[ 33.167007] [c000000fff703d50] [c0000000000dbd94] atomic_notifier_call_chain+0x44/0x60
[ 33.167816] [c000000fff703d90] [c00000000006f654] opal_do_notifier.part.5+0x74/0xa0
[ 33.172166] [c000000fff703dd0] [c00000000006f6d8] opal_interrupt+0x58/0x70
[ 33.172997] [c000000fff703e10] [c0000000001273d0] handle_irq_event_percpu+0x90/0x2b0
[ 33.174507] [c000000fff703ed0] [c000000000127658] handle_irq_event+0x68/0xd0
[ 33.175312] [c000000fff703f00] [c00000000012baf4] handle_fasteoi_irq+0xe4/0x240
[ 33.176124] [c000000fff703f30] [c0000000001265c8] generic_handle_irq+0x58/0x90
[ 33.176936] [c000000fff703f60] [c000000000010f10] __do_irq+0x80/0x190
[ 33.182406] [c000000fff703f90] [c00000000002476c] call_do_irq+0x14/0x24
[ 33.183258] [c00000000144ba30] [c0000000000110c0] do_IRQ+0xa0/0x120
[ 33.184072] [c00000000144ba90] [c0000000000025d8] hardware_interrupt_common+0x158/0x180
[ 33.184907] --- interrupt: 501 at arch_local_irq_restore+0x5c/0x90
[ 33.184907] LR = arch_local_irq_restore+0x40/0x90
[ 33.186473] [c00000000144bd80] [c000000f2ae19808] 0xc000000f2ae19808 (unreliable)
[ 33.188024] [c00000000144bda0] [c00000000085d5d8] cpuidle_enter_state+0xa8/0x260
[ 33.192695] [c00000000144be00] [c000000000108be8] cpu_startup_entry+0x488/0x4e0
[ 33.193543] [c00000000144bee0] [c00000000000bdb4] rest_init+0xa4/0xc0
[ 33.194327] [c00000000144bf00] [c000000000da3e80] start_kernel+0x53c/0x558
[ 33.195084] [c00000000144bf90] [c000000000008c6c] start_here_common+0x20/0xa8
[ 33.196569] Instruction dump:
[ 33.196619] 7cfd3b78 60000000 60000000 e93e0000 2fa90000 419e00a4 2fbf0000 419e009c
[ 33.197605] 2e3d0000 60000000 60000000 60420000 <e9490000> ebc90008 7d234b78 7f84e378
[ 33.202763] ---[ end trace 71076895a9f126ba ]---
[ 33.202836]
[ 35.203605] Kernel panic - not syncing: Fatal exception in interrupt
[ 35.203727] drm_kms_helper: panic occurred, switching back to text console
[ 35.204692] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
Ah! This is caused by an overflow of the notifier chain head array while handling an OPAL message. Upstream commit 792f96e fixes this issue, but that commit has been only partially applied to the Ubuntu 14.04.3 kernel sources, which is why you are seeing this crash.
commit 792f96e9a769b799a2944e9369e4ea1e467135b2
Author: Neelesh Gupta <neelegup@xxxxxxxxxxxxxxxxxx>
Date: Wed Feb 11 11:57:06 2015 +0530
powerpc/powernv: Fix the overflow of OPAL message notifiers head array
Fixes the condition check of incoming message type which can
otherwise shoot beyond the message notifiers head array.
Signed-off-by: Neelesh Gupta <neelegup@xxxxxxxxxxxxxxxxxx>
Reviewed-by: Vasant Hegde <hegdevasant@xxxxxxxxxxxxxxxxxx>
Reviewed-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
Below is the hunk from the above commit that is missing from Ubuntu 14.04.3:
------------------------------------------------
@@ -354,7 +350,7 @@ static void opal_handle_message(void)
 	type = be32_to_cpu(msg.msg_type);
 	/* Sanity check */
-	if (type > OPAL_MSG_TYPE_MAX) {
+	if (type >= OPAL_MSG_TYPE_MAX) {
 		pr_warning("%s: Unknown message type: %u\n", __func__, type);
 		return;
 	}
------------------------------------------------
I just checked: the above hunk applies cleanly to the Ubuntu 14.04.3
kernel sources. We should mirror this bug to Ubuntu and ask them to
apply the hunk.
** Affects: linux (Ubuntu)
Importance: Undecided
Assignee: Taco Screen team (taco-screen-team)
Status: New
** Tags: architecture-ppc64le bot-comment bugnameltc-129216 severity-high targetmilestone-inin14043
--
Ubuntu 14.04.3 LTS Crash in notifier_call_chain after boot
https://bugs.launchpad.net/bugs/1487085
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.