kernel-packages team mailing list archive

[Bug 1318551] Re: Kernel Panic - not syncing: An NMI occurred, please see the Integrated Management Log for details.

 

" " "
From the Intel manual:

No additional power reduction actions are taken in the package C1 state. However, if
the C1E substate is enabled, the processor automatically transitions to the lowest
supported core clock frequency, followed by a reduction in voltage. Autonomous power
reduction actions which are based on idle timers, can trigger depending on the activity
in the system.
The package enters the C1 low power state when:
• At least one core is in the C1 state.
• The other cores are in a C1 or lower power state.
The package enters the C1E state when:
• All cores have directly requested C1E via MWAIT(C1) with a C1E sub-state hint.
• All cores are in a power state lower than C1/C1E but the package low power state is
limited to C1/C1E via the PMG_CST_CONFIG_CONTROL MSR.
• All cores have requested C1 using HLT or MWAIT(C1) and C1E auto-promotion is
enabled in POWER_CTL.
No notification to the system occurs upon entry to C1/C1E.
" " "

Whenever Linux issues an MWAIT instruction, the cores are put into the C1E
state (as you can see in the second powertop screenshot).
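The per-state counters that powertop displays can also be read directly
from the cpuidle sysfs interface. The following is only a minimal sketch,
assuming the standard layout under /sys/devices/system/cpu/cpuN/cpuidle/;
the function name read_cpuidle_states and its sysfs_root parameter are
made up for illustration and are not part of any existing tool.

```python
from pathlib import Path


def read_cpuidle_states(cpu=0, sysfs_root="/sys"):
    """Return a list of (name, usage, time_us) tuples for one CPU's idle states.

    Hypothetical helper: sysfs_root exists only so the function can be
    pointed at a copy of the tree instead of the live /sys.
    """
    base = Path(sysfs_root) / "devices" / "system" / "cpu" / f"cpu{cpu}" / "cpuidle"
    states = []
    # Each idle state is exposed as a directory named state0, state1, ...
    # Sort numerically so that state10 comes after state9.
    for state_dir in sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:])):
        name = (state_dir / "name").read_text().strip()
        usage = int((state_dir / "usage").read_text())    # number of entries
        time_us = int((state_dir / "time").read_text())   # residency in microseconds
        states.append((name, usage, time_us))
    return states
```

Calling read_cpuidle_states(0) twice, a few seconds apart, and comparing
the usage and time values should show which state the core actually
spends its idle time in.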

I recommend rolling this workaround out gradually in environments with a
large number of nodes (if a reboot can't be made). This will allow you to
monitor the change in power consumption and heat that it causes.

From the suggested kernel command line (GRUB):

"intel_idle.max_cstate=0 nox2apic intremap=off"

intel_idle.max_cstate=0 disables the intel_idle driver and lets the
acpi_idle driver take over, which respects the BIOS ACPI tables.
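
After rebooting with the new command line, it is worth confirming that
the parameter was picked up and which cpuidle driver is active. This is
a rough sketch only, assuming /proc/cmdline and
/sys/devices/system/cpu/cpuidle/current_driver are available; the helper
names cmdline_params and idle_driver_report are invented for this
example, and the parser does not handle quoted values containing spaces.

```python
from pathlib import Path


def cmdline_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def idle_driver_report(proc_cmdline="/proc/cmdline",
                       driver_file="/sys/devices/system/cpu/cpuidle/current_driver"):
    """Return (intel_idle.max_cstate from the cmdline or None, active driver or None)."""
    params = cmdline_params(Path(proc_cmdline).read_text())
    driver_path = Path(driver_file)
    # The driver file may be absent, for example when cpuidle is not built in.
    driver = driver_path.read_text().strip() if driver_path.exists() else None
    return params.get("intel_idle.max_cstate"), driver
```

With the workaround in place, idle_driver_report() would be expected to
return "0" for the parameter and acpi_idle as the driver name.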

Next step: understand what causes this behavior in the intel_idle code
(not respecting the ACPI tables, and possibly causing NMIs by trying to
manage C-states that are not compatible with the processor type).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1318551

Title:
  Kernel Panic - not syncing: An NMI occurred, please see the Integrated
  Management Log for details.

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Ubuntu Server 14.04 amd64
  Linux global04-jobs2 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

  HW: HP DL380p Gen8 / Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz / 256GB

  
  global04-jobs2 login: [203930.116834] Kernel panic - not syncing: An NMI occurred, please see the Integrated Management Log for details.
  [203930.116834] 
  [203930.174171] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-24-generic #46-Ubuntu
  [203930.211766] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203930.249588]  0000b9792e2c349e ffff881fbfa06dd0 ffffffff81715a64 ffffffffa02c02d8
  [203930.286130]  ffff881fbfa06e48 ffffffff8170ec65 0000000000000008 ffff881fbfa06e58
  [203930.322243]  ffff881fbfa06df8 0000000000000000 ffffc90029274072 0000000000000001
  [203930.358126] Call Trace:
  [203930.370201]  <NMI>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203930.407285]  [<ffffffff8170ec65>] panic+0xc8/0x1d7
  [203930.430907]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203930.461560]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203930.487228]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203930.516179]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203930.540958]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203930.567543]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203930.593928]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203930.619941]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203930.646100]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203930.678933]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203930.707409]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203930.734325]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203930.763715]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203930.789115]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203930.815958]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203930.844461]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203930.874616]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203930.907482]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203930.942571] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203930.983822] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.026586] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.068394] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.110558] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.151923] ERST: [Firmware Warn]: Firmware does not respond in time.
  [203931.189421] ------------[ cut here ]------------
  [203931.212832] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/kernel/rcu/tree.c:508 rcu_eqs_exit_common.isra.48+0x110/0x120()
  [203931.269279] Modules linked in: veth bridge bonding dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison libcrc32c gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul sb_edac glue_helper 8021q ablk_helper cryptd hpwdt hpilo edac_core ioatdma lpc_ich psmouse garp serio_raw stp mrp llc acpi_power_meter ipmi_si mac_hid lp parport igb i2c_algo_bit tg3 dca ptp hpsa pps_core
  [203931.488907] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-24-generic #46-Ubuntu
  [203931.526932] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203931.559891]  0000000000000009 ffff881fbfa03ed0 ffffffff81715a64 0000000000000000
  [203931.597473]  ffff881fbfa03f08 ffffffff810676bd 0000000000000001 0000000000000046
  [203931.634669]  0000000000000000 0000000000000000 ffffffff81c93398 ffff881fbfa03f18
  [203931.696007] Call Trace:
  [203931.708590]  <IRQ>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203931.737818]  [<ffffffff810676bd>] warn_slowpath_common+0x7d/0xa0
  [203931.767951]  [<ffffffff8106779a>] warn_slowpath_null+0x1a/0x20
  [203931.797160]  [<ffffffff810c8720>] rcu_eqs_exit_common.isra.48+0x110/0x120
  [203931.831002]  [<ffffffff810caf05>] rcu_irq_enter+0x75/0xa0
  [203931.857924]  [<ffffffff8106cea7>] irq_enter+0x17/0xa0
  [203931.883727]  [<ffffffff8109894e>] scheduler_ipi+0x4e/0x1d0
  [203931.911444]  [<ffffffff810404ca>] smp_reschedule_interrupt+0x2a/0x30
  [203931.943438]  [<ffffffff8172781d>] reschedule_interrupt+0x6d/0x80
  [203931.973619]  <EOI>  <NMI>  [<ffffffff8170ed33>] ? panic+0x196/0x1d7
  [203932.005056]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203932.036609]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203932.063827]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203932.093283]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203932.118263]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203932.145761]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203932.172752]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203932.199896]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203932.227251]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203932.260781]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203932.290037]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203932.316383]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203932.345426]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203932.370610]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203932.398445]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203932.428250]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203932.459464]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203932.492000]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203932.522502] ---[ end trace 1b2caf07f75276b5 ]---
  [203932.546584] ------------[ cut here ]------------
  [203932.570277] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/kernel/rcu/tree.c:388 rcu_eqs_enter_common.isra.47+0x210/0x220()
  [203932.629000] Modules linked in: veth bridge bonding dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison libcrc32c gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul sb_edac glue_helper 8021q ablk_helper cryptd hpwdt hpilo edac_core ioatdma lpc_ich psmouse garp serio_raw stp mrp llc acpi_power_meter ipmi_si mac_hid lp parport igb i2c_algo_bit tg3 dca ptp hpsa pps_core
  [203932.840564] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W    3.13.0-24-generic #46-Ubuntu
  [203932.882979] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203932.915861]  0000000000000009 ffff881fbfa03ec0 ffffffff81715a64 0000000000000000
  [203932.953260]  ffff881fbfa03ef8 ffffffff810676bd ffff881fbfa0e600 0000000000000001
  [203932.990936]  ffff881fbfa0e600 0000000000000000 0000000000000000 ffff881fbfa03f08
  [203933.028255] Call Trace:
  [203933.041524]  <IRQ>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203933.069392]  [<ffffffff810676bd>] warn_slowpath_common+0x7d/0xa0
  [203933.099631]  [<ffffffff8106779a>] warn_slowpath_null+0x1a/0x20
  [203933.129192]  [<ffffffff810c80d0>] rcu_eqs_enter_common.isra.47+0x210/0x220
  [203933.163019]  [<ffffffff810ca35d>] rcu_irq_exit+0x6d/0xa0
  [203933.189393]  [<ffffffff8106cf9b>] irq_exit+0x6b/0x110
  [203933.214830]  [<ffffffff810989ae>] scheduler_ipi+0xae/0x1d0
  [203933.242468]  [<ffffffff810404ca>] smp_reschedule_interrupt+0x2a/0x30
  [203933.274067]  [<ffffffff8172781d>] reschedule_interrupt+0x6d/0x80
  [203933.304620]  <EOI>  <NMI>  [<ffffffff8170ed33>] ? panic+0x196/0x1d7
  [203933.336114]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203933.368295]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203933.394723]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203933.423870]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203933.448658]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203933.476063]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203933.502937]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203933.530842]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203933.558576]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203933.592501]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203933.622438]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203933.649036]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203933.678640]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203933.704425]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203933.732692]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203933.762571]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203933.794029]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203933.826003]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203933.856852] ---[ end trace 1b2caf07f75276b6 ]---
  [203933.879841] ------------[ cut here ]------------
  [203933.903067] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5d/0x60()
  [203933.962151] Modules linked in: veth bridge bonding dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison libcrc32c gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul sb_edac glue_helper 8021q ablk_helper cryptd hpwdt hpilo edac_core ioatdma lpc_ich psmouse garp serio_raw stp mrp llc acpi_power_meter ipmi_si mac_hid lp parport igb i2c_algo_bit tg3 dca ptp hpsa pps_core
  [203934.173264] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W    3.13.0-24-generic #46-Ubuntu
  [203934.216377] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
  [203934.249431]  0000000000000009 ffff881fbfa03d90 ffffffff81715a64 0000000000000000
  [203934.287108]  ffff881fbfa03dc8 ffffffff810676bd 0000000000000001 ffff881fbfa14440
  [203934.324901]  000000010308f9c3 0000000000000000 ffff881fbfa34440 ffff881fbfa03dd8
  [203934.362475] Call Trace:
  [203934.375125]  <IRQ>  [<ffffffff81715a64>] dump_stack+0x45/0x56
  [203934.404219]  [<ffffffff810676bd>] warn_slowpath_common+0x7d/0xa0
  [203934.434625]  [<ffffffff8106779a>] warn_slowpath_null+0x1a/0x20
  [203934.464213]  [<ffffffff8104023d>] native_smp_send_reschedule+0x5d/0x60
  [203934.497051]  [<ffffffff810a7ffa>] trigger_load_balance+0x16a/0x1e0
  [203934.527951]  [<ffffffff810992b4>] scheduler_tick+0xa4/0xf0
  [203934.555529]  [<ffffffff81076230>] update_process_times+0x60/0x70
  [203934.585948]  [<ffffffff810d5be5>] tick_sched_handle.isra.17+0x25/0x60
  [203934.619005]  [<ffffffff810d5c61>] tick_sched_timer+0x41/0x60
  [203934.647109]  [<ffffffff8108e537>] __run_hrtimer+0x77/0x1d0
  [203934.674547]  [<ffffffff810d5c20>] ? tick_sched_handle.isra.17+0x60/0x60
  [203934.707593]  [<ffffffff8108ed3f>] hrtimer_interrupt+0xef/0x230
  [203934.737132]  [<ffffffff81043087>] local_apic_timer_interrupt+0x37/0x60
  [203934.769972]  [<ffffffff817287ff>] smp_apic_timer_interrupt+0x3f/0x60
  [203934.801850]  [<ffffffff8172719d>] apic_timer_interrupt+0x6d/0x80
  [203934.831941]  <EOI>  <NMI>  [<ffffffff8170ed33>] ? panic+0x196/0x1d7
  [203934.864271]  [<ffffffffa02bf8fd>] hpwdt_pretimeout+0xdd/0xdd [hpwdt]
  [203934.896292]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  [203934.923132]  [<ffffffff8171f108>] nmi_handle.isra.3+0x88/0x180
  [203934.952363]  [<ffffffff8171f3bd>] do_nmi+0x1bd/0x340
  [203934.977834]  [<ffffffff8171e571>] end_repeat_nmi+0x1e/0x2e
  [203935.005817]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203935.033184]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203935.060362]  [<ffffffff813dfd78>] ? intel_idle+0xd8/0x140
  [203935.087458]  <<EOE>>  [<ffffffff815c9570>] cpuidle_enter_state+0x40/0xc0
  [203935.123200]  [<ffffffff815c96a9>] cpuidle_idle_call+0xb9/0x1f0
  [203935.152205]  [<ffffffff8101ceae>] arch_cpu_idle+0xe/0x30
  [203935.177872]  [<ffffffff810beb85>] cpu_startup_entry+0xc5/0x290
  [203935.207919]  [<ffffffff81703f37>] rest_init+0x77/0x80
  [203935.233858]  [<ffffffff81d34f70>] start_kernel+0x438/0x443
  [203935.261678]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
  [203935.293191]  [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
  [203935.324069]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
  [203935.357138]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
  [203935.388293] ---[ end trace 1b2caf07f75276b7 ]---

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-image-3.13.0-24-generic 3.13.0-24.46
  ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
  Uname: Linux 3.13.0-24-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 May 12 10:13 seq
   crw-rw---- 1 root audio 116, 33 May 12 10:13 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory: 'iw'
  Date: Mon May 12 10:50:00 2014
  HibernationDevice: RESUME=/dev/mapper/system-swap
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: HP ProLiant DL380p Gen8
  PciMultimedia:
   
  ProcFB:
   
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/system-root ro console=tty0 console=tty1 console=ttyS0,115200n8 swapaccount=1 net.ifnames=1 biosdevname=0
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-24-generic N/A
   linux-backports-modules-3.13.0-24-generic  N/A
   linux-firmware                             1.127
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 02/10/2014
  dmi.bios.vendor: HP
  dmi.bios.version: P70
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP70:bd02/10/2014:svnHP:pnProLiantDL380pGen8:pvr:cvnHP:ct23:cvr:
  dmi.product.name: ProLiant DL380p Gen8
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1318551/+subscriptions

