kernel-packages team mailing list archive

[Bug 1318551] Re: Kernel Panic - not syncing: An NMI occurred, please see the Integrated Management Log for details.

 

POSSIBLE ONLINE WORKAROUND:

Please observe the C-states of your CPUs using "powertop":

# apt-get install powertop
# powertop
<tab><tab>
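
If powertop is not available, the kernel's cpuidle sysfs files give a rough view of the same information. This is only a sketch: it assumes sysfs is mounted at /sys and that the kernel exposes cpuidle statistics for cpu0. The function name list_cstates is made up for this example.

```shell
#!/bin/bash
# Print name, exit latency and entry count for each idle state of one CPU.
# Assumption: the cpuidle directory below exists; pass another directory
# as the first argument to inspect a different CPU.
list_cstates() {
    local cpu_dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    local state
    for state in "$cpu_dir"/state*; do
        # Skip silently if the glob matched nothing or the files are unreadable.
        [ -r "$state/name" ] || continue
        printf '%s latency=%sus usage=%s\n' \
            "$(cat "$state/name")" \
            "$(cat "$state/latency")" \
            "$(cat "$state/usage")"
    done
}

list_cstates "$@"
```

Running it twice a few seconds apart and comparing the usage counters shows which states the CPU is actually entering.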

If you see the CPUs oscillating between C-states (C0/C1E/C3/C6),
please run the following script:

(save this script as "keepcstates.sh")

----- cut here -------

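#!/bin/bash

# Workaround to keep CPUs in C-states no deeper than C1E.
#
# Note: this script must keep running all the time,
# since the /dev/cpu_dma_latency file must be kept
# open for the request to have an effect.
#
# Lowest value to keep CPUs in the C0/C1E states: \013
# (octal 013 = 11 microseconds). You can also use the
# value \000\000\000\000 to completely deactivate
# C-states for the CPUs.
#

# Open the device on file descriptor 3 and keep it open.
exec 3>/dev/cpu_dma_latency
# Write the latency limit as a raw 32-bit integer (little-endian on x86).
echo -ne '\013\000\000\000' >&3
# Sleep forever so that descriptor 3 stays open.
while true; do sleep 2; done
# Not reached: descriptor 3 is closed when the script is killed.
exec 3>&-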

----- cut here -------

and run, as root, the following command:

root@workstation:~# nohup ./keepcstates.sh &
[1] 28482

The script runs until it is killed and consumes very little CPU.
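
Since the request only lasts while the file is held open, it can be worth confirming that the background job still has /dev/cpu_dma_latency open. A minimal sketch, assuming Linux with /proc mounted and root privileges. The helper name holds_open is hypothetical, and 28482 is just the job PID from the example above.

```shell
#!/bin/bash
# Return success if process PID has TARGET open on any file descriptor.
# Assumption: /proc is mounted and we are allowed to read /proc/PID/fd.
holds_open() {
    local pid="$1" target="$2" fd
    for fd in /proc/"$pid"/fd/*; do
        [ "$(readlink "$fd" 2>/dev/null)" = "$target" ] && return 0
    done
    return 1
}

# Pass the PID printed by the shell when the job was started.
if holds_open "${1:-28482}" /dev/cpu_dma_latency; then
    echo "latency request is active"
else
    echo "latency request is NOT active"
fi
```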

Then check powertop again.

In our labs, this script kept our CPUs between the C0 and C1E states.

If you continue to experience panics with this script running, please
change the line:

echo -ne '\013\000\000\000' >&3

to

echo -ne '\000\000\000\000' >&3

and re-run the script.

With \013 (octal, 11 microseconds) your CPUs will probably oscillate between the C0 and C1E states.
With \000 your CPUs will not oscillate and will stay in the C0 state (no power saving at all).
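
To make the two values less opaque, the four bytes can be decoded into the number the kernel receives. This is a sketch assuming a little-endian host (such as x86) and GNU od. The function name decode_latency is made up for this example.

```shell
#!/bin/bash
# Decode four octal-escaped bytes into the unsigned 32-bit value they
# represent in host byte order (the latency limit in microseconds).
# Assumption: little-endian host, so '\013\000\000\000' reads as 11.
decode_latency() {
    # The argument is used as the printf format so the octal escapes expand.
    printf "$1" | od -An -tu4 | tr -d ' \n'
}

decode_latency '\013\000\000\000'; echo " us"   # prints "11 us"
decode_latency '\000\000\000\000'; echo " us"   # prints "0 us"
```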

This may give your system some stability while you cannot reboot the
servers to use the acpi_idle module (instead of intel_idle) and/or
until the intel_idle code is fixed properly.
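
For those who can schedule a reboot, the sketch below reports which cpuidle driver is currently active and notes the kernel parameter that switches it. It assumes the standard sysfs path. The function name current_cpuidle_driver is hypothetical, and the GRUB file location is the usual Ubuntu one rather than something verified on the affected machines.

```shell
#!/bin/bash
# Print the active cpuidle driver, or "unknown" if sysfs does not expose it.
# Assumption: the default path below; pass another path as $1 to override.
current_cpuidle_driver() {
    local file="${1:-/sys/devices/system/cpu/cpuidle/current_driver}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

echo "cpuidle driver: $(current_cpuidle_driver "$@")"

# To fall back to acpi_idle on the next boot, add this kernel parameter
# (for example to GRUB_CMDLINE_LINUX in /etc/default/grub, then run
# update-grub and reboot):
#   intel_idle.max_cstate=0
```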

Looking forward to hearing back from the community, to confirm whether,
with the script running, your CPU C-states stayed between C0 and C1E, and
whether your systems became more stable.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1318551

Title:
  Kernel Panic - not syncing: An NMI occurred, please see the Integrated
  Management Log for details.

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Ubuntu Server 14.04 amd64
  Linux global04-jobs2 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

  HW: HP DL380p Gen8 / Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz / 256GB

  
  global04-jobs2 login: [203930.116834] Kernel panic - not syncing: An NMI occurred, please see the Integrated Management Log for details.
  [203930.116834] 
  [203930.174171] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-24-generic #46-Ubuntu
  [203930.211766] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203930.249588]  0000b9792e2c349e ffff881fbfa06dd0 ffffffff81715a64 ffffffffa02c02d8
  [203930.286130]  ffff881fbfa06e48 ffffffff8170ec65 0000000000000008 ffff881fbfa06e58
  [203930.322243]  ffff881fbfa06df8 0000000000000000 ffffc90029274072 0000000000000001
  [203930.358126] Call Trace:
  [203930.370201]  <NMI>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203930.407285]  [<ffffffff8170ec65>] panic+0xc8/0x1d7
  [203930.430907]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203930.461560]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203930.487228]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203930.516179]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203930.540958]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203930.567543]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203930.593928]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203930.619941]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203930.646100]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203930.678933]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203930.707409]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203930.734325]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203930.763715]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203930.789115]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203930.815958]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203930.844461]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203930.874616]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203930.907482]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203930.942571] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203930.983822] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.026586] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.068394] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.110558] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.151923] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.189421] ------------[ cut here ]------------
  [203931.212832] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/kernel/rcu/tree.c:508 rcu_eqs_exit_common.isra.48+0x110/0x120()
  [203931.269279] Modules linked in: veth bridge bonding dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison libcrc32c gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul sb_edac glue_helper 8021q ablk_helper cryptd hpwdt hpilo edac_core ioatdma lpc_ich psmouse garp serio_raw stp mrp llc acpi_power_meter ipmi_si mac_hid lp parport igb i2c_algo_bit tg3 dca ptp hpsa pps_core
  [203931.488907] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-24-generic #46-Ubuntu
  [203931.526932] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203931.559891]  0000000000000009 ffff881fbfa03ed0 ffffffff81715a64 0000000000000000
  [203931.597473]  ffff881fbfa03f08 ffffffff810676bd 0000000000000001 0000000000000046
  [203931.634669]  0000000000000000 0000000000000000 ffffffff81c93398 ffff881fbfa03f18
  [203931.696007] Call Trace:
  [203931.708590]  <IRQ>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203931.737818]  [<ffffffff810676bd>] warn_slowpath_common+0x7d/0xa0
  [203931.767951]  [<ffffffff8106779a>] warn_slowpath_null+0x1a/0x20
  [203931.797160]  [<ffffffff810c8720>] rcu_eqs_exit_common.isra.48+0x110/0x120
  [203931.831002]  [<ffffffff810caf05>] rcu_irq_enter+0x75/0xa0
  [203931.857924]  [<ffffffff8106cea7>] irq_enter+0x17/0xa0
  [203931.883727]  [<ffffffff8109894e>] scheduler_ipi+0x4e/0x1d0
  [203931.911444]  [<ffffffff810404ca>] smp_reschedule_interrupt+0x2a/0x30
  [203931.943438]  [<ffffffff8172781d>] reschedule_interrupt+0x6d/0x80
  [203931.973619]  <EOI>  <NMI>  [<ffffffff8170ed33>] ? panic+0x196/0x1d7
  [203932.005056]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203932.036609]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203932.063827]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203932.093283]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203932.118263]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203932.145761]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203932.172752]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203932.199896]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203932.227251]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203932.260781]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203932.290037]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203932.316383]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203932.345426]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203932.370610]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203932.398445]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203932.428250]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203932.459464]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203932.492000]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203932.522502] ---[ end trace 1b2caf07f75276b5 ]---
  [203932.546584] ------------[ cut here ]------------
  [203932.570277] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/kernel/rcu/tree.c:388 rcu_eqs_enter_common.isra.47+0x210/0x220()
  [203932.629000] Modules linked in: veth bridge bonding dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison libcrc32c gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul sb_edac glue_helper 8021q ablk_helper cryptd hpwdt hpilo edac_core ioatdma lpc_ich psmouse garp serio_raw stp mrp llc acpi_power_meter ipmi_si mac_hid lp parport igb i2c_algo_bit tg3 dca ptp hpsa pps_core
  [203932.840564] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W    3.13.0-24-generic #46-Ubuntu
  [203932.882979] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203932.915861]  0000000000000009 ffff881fbfa03ec0 ffffffff81715a64 0000000000000000
  [203932.953260]  ffff881fbfa03ef8 ffffffff810676bd ffff881fbfa0e600 0000000000000001
  [203932.990936]  ffff881fbfa0e600 0000000000000000 0000000000000000 ffff881fbfa03f08
  [203933.028255] Call Trace:
  [203933.041524]  <IRQ>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203933.069392]  [<ffffffff810676bd>] warn_slowpath_common+0x7d/0xa0
  [203933.099631]  [<ffffffff8106779a>] warn_slowpath_null+0x1a/0x20
  [203933.129192]  [<ffffffff810c80d0>] rcu_eqs_enter_common.isra.47+0x210/0x220
  [203933.163019]  [<ffffffff810ca35d>] rcu_irq_exit+0x6d/0xa0
  [203933.189393]  [<ffffffff8106cf9b>] irq_exit+0x6b/0x110
  [203933.214830]  [<ffffffff810989ae>] scheduler_ipi+0xae/0x1d0
  [203933.242468]  [<ffffffff810404ca>] smp_reschedule_interrupt+0x2a/0x30
  [203933.274067]  [<ffffffff8172781d>] reschedule_interrupt+0x6d/0x80
  [203933.304620]  <EOI>  <NMI>  [<ffffffff8170ed33>] ? panic+0x196/0x1d7
  [203933.336114]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203933.368295]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203933.394723]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203933.423870]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203933.448658]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203933.476063]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203933.502937]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203933.530842]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203933.558576]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203933.592501]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203933.622438]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203933.649036]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203933.678640]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203933.704425]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203933.732692]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203933.762571]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203933.794029]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203933.826003]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203933.856852] ---[ end trace 1b2caf07f75276b6 ]---
  [203933.879841] ------------[ cut here ]------------
  [203933.903067] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5d/0x60()
  [203933.962151] Modules linked in: veth bridge bonding dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison libcrc32c gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul sb_edac glue_helper 8021q ablk_helper cryptd hpwdt hpilo edac_core ioatdma lpc_ich psmouse garp serio_raw stp mrp llc acpi_power_meter ipmi_si mac_hid lp parport igb i2c_algo_bit tg3 dca ptp hpsa pps_core
  [203934.173264] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W    3.13.0-24-generic #46-Ubuntu
  [203934.216377] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203934.249431]  0000000000000009 ffff881fbfa03d90 ffffffff81715a64 0000000000000000
  [203934.287108]  ffff881fbfa03dc8 ffffffff810676bd 0000000000000001 ffff881fbfa14440
  [203934.324901]  000000010308f9c3 0000000000000000 ffff881fbfa34440 ffff881fbfa03dd8
  [203934.362475] Call Trace:
  [203934.375125]  <IRQ>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203934.404219]  [<ffffffff810676bd>] warn_slowpath_common+0x7d/0xa0
  [203934.434625]  [<ffffffff8106779a>] warn_slowpath_null+0x1a/0x20
  [203934.464213]  [<ffffffff8104023d>] native_smp_send_reschedule+0x5d/0x60
  [203934.497051]  [<ffffffff810a7ffa>] trigger_load_balance+0x16a/0x1e0
  [203934.527951]  [<ffffffff810992b4>] scheduler_tick+0xa4/0xf0
  [203934.555529]  [<ffffffff81076230>] update_process_times+0x60/0x70
  [203934.585948]  [<ffffffff810d5be5>] tick_sched_handle.isra.17+0x25/0x60
  [203934.619005]  [<ffffffff810d5c61>] tick_sched_timer+0x41/0x60
  [203934.647109]  [<ffffffff8108e537>] __run_hrtimer+0x77/0x1d0
  [203934.674547]  [<ffffffff810d5c20>] ? tick_sched_handle.isra.17+0x60/0x60
  [203934.707593]  [<ffffffff8108ed3f>] hrtimer_interrupt+0xef/0x230
  [203934.737132]  [<ffffffff81043087>] local_apic_timer_interrupt+0x37/0x60
  [203934.769972]  [<ffffffff817287ff>] smp_apic_timer_interrupt+0x3f/0x60
  [203934.801850]  [<ffffffff8172719d>] apic_timer_interrupt+0x6d/0x80
  [203934.831941]  <EOI>  <NMI>  [<ffffffff8170ed33>] ? panic+0x196/0x1d7
  [203934.864271]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203934.896292]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203934.923132]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203934.952363]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203934.977834]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203935.005817]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203935.033184]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203935.060362]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203935.087458]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203935.123200]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203935.152205]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203935.177872]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203935.207919]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203935.233858]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203935.261678]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203935.293191]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203935.324069]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203935.357138]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203935.388293] ---[ end trace 1b2caf07f75276b7 ]---

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-image-3.13.0-24-generic 3.13.0-24.46
  ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
  Uname: Linux 3.13.0-24-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 May 12 10:13 seq
   crw-rw---- 1 root audio 116, 33 May 12 10:13 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory: 'iw'
  Date: Mon May 12 10:50:00 2014
  HibernationDevice: RESUME=/dev/mapper/system-swap
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: HP ProLiant DL380p Gen8
  PciMultimedia:
   
  ProcFB:
   
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/system-root ro console=tty0 console=tty1 console=ttyS0,115200n8 swapaccount=1 net.ifnames=1 biosdevname=0
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-24-generic N/A
   linux-backports-modules-3.13.0-24-generic  N/A
   linux-firmware                             1.127
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 02/10/2014
  dmi.bios.vendor: HP
  dmi.bios.version: P70
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP70:bd02/10/2014:svnHP:pnProLiantDL380pGen8:pvr:cvnHP:ct23:cvr:
  dmi.product.name: ProLiant DL380p Gen8
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1318551/+subscriptions

