kernel-packages team mailing list archive

[Bug 1315736] Re: Machine Check Exception

 

Hi,

I ran the memory test and it detected no errors.

I also switched to the mainline kernel. With the mainline kernel
(3.15.0-031500rc4-generic #201405042135 SMP) I have not yet seen an MCE
error or an unresponsive system; however, I can still see some errors
in dmesg:


[  840.160260] INFO: task fastq-join:2753 blocked for more than 120 seconds.
[  840.162350]       Not tainted 3.15.0-031500rc4-generic #201405042135
[  840.164324] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  840.166723] fastq-join      D 000000000000000d     0  2753   2752 0x00000002
[  840.166726]  ffff882e6c2e1bb8 0000000000000002 ffff882d7b74c438 ffff882e6c2e1fd8
[  840.166728]  0000000000014500 0000000000014500 ffff882ff79fcb60 ffff882ff7d89920
[  840.166730]  ffff882e6c2e1bb8 ffff88301f2d4e00 ffff882ff7d89920 ffffffff8115b2a0
[  840.166732] Call Trace:
[  840.166742]  [<ffffffff8115b2a0>] ? __lock_page+0x70/0x70
[  840.166747]  [<ffffffff8175fd99>] schedule+0x29/0x70
[  840.166749]  [<ffffffff8175fe6f>] io_schedule+0x8f/0xd0
[  840.166751]  [<ffffffff8115b2ae>] sleep_on_page+0xe/0x20
[  840.166753]  [<ffffffff81760532>] __wait_on_bit+0x62/0x90
[  840.166755]  [<ffffffff8115bd7b>] ? find_get_pages_tag+0xcb/0x170
[  840.166756]  [<ffffffff8115b410>] wait_on_page_bit+0x80/0x90
[  840.166761]  [<ffffffff810b3970>] ? wake_atomic_t_function+0x40/0x40
[  840.166762]  [<ffffffff8115b5e4>] filemap_fdatawait_range+0xf4/0x180
[  840.166765]  [<ffffffff8115d51d>] filemap_write_and_wait_range+0x4d/0x80
[  840.166777]  [<ffffffffa03e151f>] nfs4_file_fsync+0x5f/0xb0 [nfsv4]
[  840.166781]  [<ffffffff811fbf06>] vfs_fsync+0x26/0x40
[  840.166793]  [<ffffffffa03199ba>] nfs_file_flush+0x8a/0xd0 [nfs]
[  840.166796]  [<ffffffff811ca1ea>] filp_close+0x3a/0x90
[  840.166801]  [<ffffffff811e937a>] put_files_struct.part.12+0x7a/0xd0
[  840.166803]  [<ffffffff811e9725>] put_files_struct+0x15/0x20
[  840.166804]  [<ffffffff811e97f2>] exit_files+0x52/0x60
[  840.166808]  [<ffffffff8106e521>] do_exit+0x171/0x470
[  840.166811]  [<ffffffff81021e45>] ? syscall_trace_enter+0x165/0x280
[  840.166813]  [<ffffffff8106e8b4>] do_group_exit+0x44/0xa0
[  840.166814]  [<ffffffff8106e927>] SyS_exit_group+0x17/0x20
[  840.166817]  [<ffffffff8176d03f>] tracesys+0xe1/0xe6
[ 1461.722458] perf interrupt took too long (2501 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[30187.334010] ------------[ cut here ]------------
[30187.335401] kernel BUG at /home/apw/COD/linux/mm/memory.c:3924!
[30187.337183] invalid opcode: 0000 [#1] SMP 
[30187.338459] Modules linked in: rpcsec_gss_krb5 ip6t_REJECT nfsv4 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 nfsd dcdbas xt_conntrack auth_rpcgss ip6table_filter x86_pkg_temp_thermal ip6_tables intel_powerclamp nf_conntrack_netbios_ns nfs_acl nf_conntrack_broadcast nfs nf_nat_ftp coretemp nf_nat kvm_intel lockd nf_conntrack_ftp nf_conntrack kvm sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel fscache iptable_filter ip_tables aes_x86_64 x_tables lrw gf128mul glue_helper ablk_helper joydev cryptd mei_me sb_edac acpi_power_meter shpchp mei ipmi_si edac_core mac_hid lpc_ich wmi lp acpi_pad parport hid_generic ahci usbhid tg3 hid libahci ptp megaraid_sas pps_core
[30187.360807] CPU: 19 PID: 29197 Comm: java Not tainted 3.15.0-031500rc4-generic #201405042135
[30187.363359] Hardware name: Dell Inc. PowerEdge R720/0DCWD1, BIOS 2.2.2 01/16/2014
[30187.365620] task: ffff882ff7f0b240 ti: ffff882c86216000 task.ti: ffff882c86216000
[30187.367881] RIP: 0010:[<ffffffff81188c2d>]  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30187.370519] RSP: 0018:ffff882c86217d88  EFLAGS: 00010246
[30187.372115] RAX: 0000000000000100 RBX: 00000007ffa00038 RCX: 0000000000000000
[30187.374269] RDX: ffff882ff7f0b240 RSI: 0000000000000009 RDI: 80000001f5e009e6
[30187.376425] RBP: ffff882c86217dc8 R08: 0000000000000000 R09: 00000000000000a9
[30187.378579] R10: 0000000000000000 R11: 0000000000000002 R12: ffff881689222cf0
[30187.380799] R13: ffff882c268abb80 R14: ffff8817f5339fe8 R15: ffff882de1fb80f8
[30187.382954] FS:  00007f9cbbdf4700(0000) GS:ffff88301f320000(0000) knlGS:0000000000000000
[30187.385404] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30187.387131] CR2: 00000007b133f000 CR3: 00000021cc72f000 CR4: 00000000001407e0
[30187.389285] Stack:
[30187.389871]  ffff882c86217f18 ffffffff000000a9 0000000000000001 ffff882ff7f0b240
[30187.392211]  ffff882c268abb80 ffff881689222cf0 ffff881689222cf0 ffff882c268abb80
[30187.394552]  ffff882c86217e08 ffffffff81188d11 ffff882c000000a9 00000007ffa00038
[30187.396893] Call Trace:
[30187.397618]  [<ffffffff81188d11>] handle_mm_fault+0xb1/0x160
[30187.470200]  [<ffffffff81767b77>] ? __do_page_fault+0x307/0x550
[30187.543316]  [<ffffffff81767a0f>] __do_page_fault+0x19f/0x550
[30187.616801]  [<ffffffff8111a98c>] ? acct_account_cputime+0x1c/0x20
[30187.691316]  [<ffffffff810a39c9>] ? account_user_time+0x99/0xb0
[30187.766416]  [<ffffffff810a3ffd>] ? vtime_account_user+0x5d/0x70
[30187.841869]  [<ffffffff81767dfe>] do_page_fault+0x3e/0x80
[30187.917650]  [<ffffffff81764048>] page_fault+0x28/0x30
[30187.993868] Code: e9 4e fd ff ff 48 89 da 4c 89 fe 4c 89 ef 44 89 4d c8 e8 b7 fb ff ff 85 c0 44 8b 4d c8 0f 85 2b ff ff ff 49 8b 3f e9 6b fd ff ff <0f> 0b 4c 89 f2 48 89 d9 4c 89 e6 4c 89 ef 44 89 4d c8 e8 0c c3 
[30188.157025] RIP  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30188.239203]  RSP <ffff882c86217d88>
[30188.485372] ---[ end trace 6994cfef24b33d0f ]---
[30214.336866] ------------[ cut here ]------------
[30214.421000] kernel BUG at /home/apw/COD/linux/mm/memory.c:3924!
[30214.506368] invalid opcode: 0000 [#2] SMP 
[30214.590801] Modules linked in: rpcsec_gss_krb5 ip6t_REJECT nfsv4 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 nfsd dcdbas xt_conntrack auth_rpcgss ip6table_filter x86_pkg_temp_thermal ip6_tables intel_powerclamp nf_conntrack_netbios_ns nfs_acl nf_conntrack_broadcast nfs nf_nat_ftp coretemp nf_nat kvm_intel lockd nf_conntrack_ftp nf_conntrack kvm sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel fscache iptable_filter ip_tables aes_x86_64 x_tables lrw gf128mul glue_helper ablk_helper joydev cryptd mei_me sb_edac acpi_power_meter shpchp mei ipmi_si edac_core mac_hid lpc_ich wmi lp acpi_pad parport hid_generic ahci usbhid tg3 hid libahci ptp megaraid_sas pps_core
[30215.143610] CPU: 0 PID: 29215 Comm: java Tainted: G      D       3.15.0-031500rc4-generic #201405042135
[30215.241344] Hardware name: Dell Inc. PowerEdge R720/0DCWD1, BIOS 2.2.2 01/16/2014
[30215.338045] task: ffff882ff75b4b60 ti: ffff881dff4ac000 task.ti: ffff881dff4ac000
[30215.433526] RIP: 0010:[<ffffffff81188c2d>]  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30215.528311] RSP: 0018:ffff881dff4add88  EFLAGS: 00010246
[30215.620726] RAX: 0000000000000100 RBX: 00000007ffa00000 RCX: 0000000000000000
[30215.712051] RDX: ffff882ff75b4b60 RSI: 0000000000000009 RDI: 80000001f5e009e6
[30215.801275] RBP: ffff881dff4addc8 R08: 0000000000000000 R09: 00000000000000a9
[30215.888117] R10: 0000000000000000 R11: 0000000000000002 R12: ffff881689222cf0
[30215.973856] R13: ffff882c268abb80 R14: ffff8817f5339fe8 R15: ffff882de1fb80f8
[30216.059090] FS:  00007f9cbabe2700(0000) GS:ffff88181fa00000(0000) knlGS:0000000000000000
[30216.144509] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30216.228998] CR2: 0000000001f47000 CR3: 00000021cc72f000 CR4: 00000000001407f0
[30216.314018] Stack:
[30216.398344]  ffff881dff4adf18 01400000000000a9 0000000000000001 ffff882ff75b4b60
[30216.485583]  ffff882c268abb80 ffff881689222cf0 ffff881689222cf0 ffff882c268abb80
[30216.573117]  ffff881dff4ade08 ffffffff81188d11 ffff881d000000a9 00000007ffa00000
[30216.660731] Call Trace:
[30216.746542]  [<ffffffff81188d11>] handle_mm_fault+0xb1/0x160
[30216.833030]  [<ffffffff81767a0f>] __do_page_fault+0x19f/0x550
[30216.918779]  [<ffffffff8101c6b5>] ? native_sched_clock+0x35/0x90
[30217.003879]  [<ffffffff8111a98c>] ? acct_account_cputime+0x1c/0x20
[30217.088237]  [<ffffffff810a39c9>] ? account_user_time+0x99/0xb0
[30217.172220]  [<ffffffff810a3ffd>] ? vtime_account_user+0x5d/0x70
[30217.255882]  [<ffffffff81767dfe>] do_page_fault+0x3e/0x80
[30217.338589]  [<ffffffff81764048>] page_fault+0x28/0x30
[30217.420746] Code: e9 4e fd ff ff 48 89 da 4c 89 fe 4c 89 ef 44 89 4d c8 e8 b7 fb ff ff 85 c0 44 8b 4d c8 0f 85 2b ff ff ff 49 8b 3f e9 6b fd ff ff <0f> 0b 4c 89 f2 48 89 d9 4c 89 e6 4c 89 ef 44 89 4d c8 e8 0c c3 
[30217.595468] RIP  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30217.681793]  RSP <ffff881dff4add88>
[30217.767749] ------------[ cut here ]------------
[30217.767786] ---[ end trace 6994cfef24b33d10 ]---
[30217.940566] kernel BUG at /home/apw/COD/linux/mm/memory.c:3924!
[30218.027846] invalid opcode: 0000 [#3] SMP 
[30218.113693] Modules linked in: rpcsec_gss_krb5 ip6t_REJECT nfsv4 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 nfsd dcdbas xt_conntrack auth_rpcgss ip6table_filter x86_pkg_temp_thermal ip6_tables intel_powerclamp nf_conntrack_netbios_ns nfs_acl nf_conntrack_broadcast nfs nf_nat_ftp coretemp nf_nat kvm_intel lockd nf_conntrack_ftp nf_conntrack kvm sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel fscache iptable_filter ip_tables aes_x86_64 x_tables lrw gf128mul glue_helper ablk_helper joydev cryptd mei_me sb_edac acpi_power_meter shpchp mei ipmi_si edac_core mac_hid lpc_ich wmi lp acpi_pad parport hid_generic ahci usbhid tg3 hid libahci ptp megaraid_sas pps_core
[30218.672910] CPU: 4 PID: 29210 Comm: java Tainted: G      D       3.15.0-031500rc4-generic #201405042135
[30218.771443] Hardware name: Dell Inc. PowerEdge R720/0DCWD1, BIOS 2.2.2 01/16/2014
[30218.868978] task: ffff882ffa2fb240 ti: ffff882afca9e000 task.ti: ffff882afca9e000
[30218.965197] RIP: 0010:[<ffffffff81188c2d>]  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30219.060779] RSP: 0018:ffff882afca9fd88  EFLAGS: 00010246
[30219.154021] RAX: 0000000000000100 RBX: 00000007ff628038 RCX: 0000000000000000
[30219.246223] RDX: ffff882ffa2fb240 RSI: 0000000000000009 RDI: 80000019b52009e6
[30219.336404] RBP: ffff882afca9fdc8 R08: 0000000000000000 R09: 00000000000000a9
[30219.424148] R10: 0000000000000000 R11: 0000000000000002 R12: ffff881689222cf0
[30219.510632] R13: ffff882c268abb80 R14: ffff8817f5339fd8 R15: ffff882de1fb80f8
[30219.596557] FS:  00007f9cbb0e7700(0000) GS:ffff88181fa40000(0000) knlGS:0000000000000000
[30219.682615] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30219.767742] CR2: 00007fa770000010 CR3: 00000021cc72f000 CR4: 00000000001407e0
[30219.853304] Stack:
[30219.937978]  ffff882afca9fdf8 ffffffff000000a9 0000000000000000 ffff882ffa2fb240
[30220.025610]  ffff882c268abb80 ffff881689222cf0 ffff881689222cf0 ffff882c268abb80
[30220.113422]  ffff882afca9fe08 ffffffff81188d11 ffff882a000000a9 00000007ff628038
[30220.201316] Call Trace:
[30220.287297]  [<ffffffff81188d11>] handle_mm_fault+0xb1/0x160
[30220.373955]  [<ffffffff81767a0f>] __do_page_fault+0x19f/0x550
[30220.459693]  [<ffffffff8111a98c>] ? acct_account_cputime+0x1c/0x20
[30220.544982]  [<ffffffff810a39c9>] ? account_user_time+0x99/0xb0
[30220.629282]  [<ffffffff810a3ffd>] ? vtime_account_user+0x5d/0x70
[30220.713079]  [<ffffffff81767dfe>] do_page_fault+0x3e/0x80
[30220.796399]  [<ffffffff81764048>] page_fault+0x28/0x30
[30220.878609] Code: e9 4e fd ff ff 48 89 da 4c 89 fe 4c 89 ef 44 89 4d c8 e8 b7 fb ff ff 85 c0 44 8b 4d c8 0f 85 2b ff ff ff 49 8b 3f e9 6b fd ff ff <0f> 0b 4c 89 f2 48 89 d9 4c 89 e6 4c 89 ef 44 89 4d c8 e8 0c c3 
[30221.052825] RIP  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30221.139123]  RSP <ffff882afca9fd88>
[30221.224946] ------------[ cut here ]------------
[30221.224987] ---[ end trace 6994cfef24b33d11 ]---
[30221.397605] kernel BUG at /home/apw/COD/linux/mm/memory.c:3924!
[30221.484679] invalid opcode: 0000 [#4] SMP 
[30221.571987] Modules linked in: rpcsec_gss_krb5 ip6t_REJECT nfsv4 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 nfsd dcdbas xt_conntrack auth_rpcgss ip6table_filter x86_pkg_temp_thermal ip6_tables intel_powerclamp nf_conntrack_netbios_ns nfs_acl nf_conntrack_broadcast nfs nf_nat_ftp coretemp nf_nat kvm_intel lockd nf_conntrack_ftp nf_conntrack kvm sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel fscache iptable_filter ip_tables aes_x86_64 x_tables lrw gf128mul glue_helper ablk_helper joydev cryptd mei_me sb_edac acpi_power_meter shpchp mei ipmi_si edac_core mac_hid lpc_ich wmi lp acpi_pad parport hid_generic ahci usbhid tg3 hid libahci ptp megaraid_sas pps_core
[30222.134461] CPU: 36 PID: 29206 Comm: java Tainted: G      D       3.15.0-031500rc4-generic #201405042135
[30222.233604] Hardware name: Dell Inc. PowerEdge R720/0DCWD1, BIOS 2.2.2 01/16/2014
[30222.333043] task: ffff882ff821cb60 ti: ffff881b45ec8000 task.ti: ffff881b45ec8000
[30222.431478] RIP: 0010:[<ffffffff81188c2d>]  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30222.529349] RSP: 0018:ffff881b45ec9d88  EFLAGS: 00010246
[30222.624956] RAX: 0000000000000100 RBX: 00000007ff600000 RCX: 0000000000000000
[30222.719473] RDX: ffff882ff821cb60 RSI: 0000000000000009 RDI: 80000019b52009e6
[30222.811783] RBP: ffff881b45ec9dc8 R08: 0000000000000000 R09: 00000000000000a9
[30222.901854] R10: 0000000000000000 R11: 0000000000000002 R12: ffff881689222cf0
[30222.989761] R13: ffff882c268abb80 R14: ffff8817f5339fd8 R15: ffff882de1fb80f8
[30223.076551] FS:  00007f9cbb4eb700(0000) GS:ffff88181fc40000(0000) knlGS:0000000000000000
[30223.163766] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30223.249863] CR2: 00000007ff810030 CR3: 00000021cc72f000 CR4: 00000000001407e0
[30223.336014] Stack:
[30223.420742]  0000000000000001 01400000000000a9 000000000000000f ffff882ff821cb60
[30223.508043]  ffff882c268abb80 ffff881689222cf0 ffff881689222cf0 ffff882c268abb80
[30223.596168]  ffff881b45ec9e08 ffffffff81188d11 ffff881b000000a9 00000007ff600000
[30223.621243] ------------[ cut here ]------------
[30223.621247] WARNING: CPU: 12 PID: 29190 at /home/apw/COD/linux/kernel/watchdog.c:249 watchdog_overflow_callback+0x98/0xc0()
[30223.621248] Watchdog detected hard LOCKUP on cpu 12
[30223.621269] Modules linked in: rpcsec_gss_krb5 ip6t_REJECT nfsv4 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 nfsd dcdbas xt_conntrack auth_rpcgss ip6table_filter x86_pkg_temp_thermal ip6_tables intel_powerclamp nf_conntrack_netbios_ns nfs_acl nf_conntrack_broadcast nfs nf_nat_ftp coretemp nf_nat kvm_intel lockd nf_conntrack_ftp nf_conntrack kvm sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel fscache iptable_filter ip_tables aes_x86_64 x_tables lrw gf128mul glue_helper ablk_helper joydev cryptd mei_me sb_edac acpi_power_meter shpchp mei ipmi_si edac_core mac_hid lpc_ich wmi lp acpi_pad parport hid_generic ahci usbhid tg3 hid libahci ptp megaraid_sas pps_core
[30223.621271] CPU: 12 PID: 29190 Comm: java Tainted: G      D       3.15.0-031500rc4-generic #201405042135
[30223.621272] Hardware name: Dell Inc. PowerEdge R720/0DCWD1, BIOS 2.2.2 01/16/2014
[30223.621274]  00000000000000f9 ffff88181fac7ba8 ffffffff81756be4 0000000000000082
[30223.621275]  ffff88181fac7bf8 ffff88181fac7be8 ffffffff8106b89c 0000000000000000
[30223.621276]  ffff8817fa550000 0000000000000000 ffff88181fac7d18 0000000000000000
[30223.621277] Call Trace:
[30223.621282]  <NMI>  [<ffffffff81756be4>] dump_stack+0x46/0x58
[30223.621284]  [<ffffffff8106b89c>] warn_slowpath_common+0x8c/0xc0
[30223.621285]  [<ffffffff8106b986>] warn_slowpath_fmt+0x46/0x50
[30223.621287]  [<ffffffff811166d8>] watchdog_overflow_callback+0x98/0xc0
[30223.621290]  [<ffffffff811520c8>] __perf_event_overflow+0x98/0x230
[30223.621294]  [<ffffffff8102a2c8>] ? x86_perf_event_set_period+0xd8/0x150
[30223.621295]  [<ffffffff811529b4>] perf_event_overflow+0x14/0x20
[30223.621298]  [<ffffffff81031ce1>] intel_pmu_handle_irq+0x1c1/0x2b0
[30223.621300]  [<ffffffff811970a1>] ? unmap_kernel_range_noflush+0x11/0x20
[30223.621304]  [<ffffffff814319db>] ? ghes_copy_tofrom_phys+0x10b/0x200
[30223.621306]  [<ffffffff817657e4>] perf_event_nmi_handler+0x34/0x60
[30223.621308]  [<ffffffff81764f97>] nmi_handle.isra.6+0x87/0x140
[30223.621310]  [<ffffffff81432a10>] ? ghes_print_estatus.constprop.10+0x70/0x70
[30223.621312]  [<ffffffff81765138>] default_do_nmi+0x58/0x240
[30223.621313]  [<ffffffff817653b0>] do_nmi+0x90/0xd0
[30223.621315]  [<ffffffff817643b1>] end_repeat_nmi+0x1e/0x2e
[30223.621317]  [<ffffffff81764c9a>] ? oops_begin+0xca/0xf0
[30223.621318]  [<ffffffff81764c9a>] ? oops_begin+0xca/0xf0
[30223.621320]  [<ffffffff81764c9a>] ? oops_begin+0xca/0xf0
[30223.621322]  <<EOE>>  [<ffffffff8101775b>] die+0x2b/0x90
[30223.621324]  [<ffffffff8176450b>] do_trap+0xcb/0x170
[30223.621327]  [<ffffffff810145ec>] do_invalid_op+0xac/0x110
[30223.621328]  [<ffffffff81188c2d>] ? __handle_mm_fault+0x32d/0x360
[30223.621331]  [<ffffffff8176e65e>] invalid_op+0x1e/0x30
[30223.621333]  [<ffffffff81188c2d>] ? __handle_mm_fault+0x32d/0x360
[30223.621335]  [<ffffffff81188a35>] ? __handle_mm_fault+0x135/0x360
[30223.621336]  [<ffffffff81188d11>] handle_mm_fault+0xb1/0x160
[30223.621338]  [<ffffffff81767b77>] ? __do_page_fault+0x307/0x550
[30223.621340]  [<ffffffff81767a0f>] __do_page_fault+0x19f/0x550
[30223.621341]  [<ffffffff8111a98c>] ? acct_account_cputime+0x1c/0x20
[30223.621343]  [<ffffffff810a39c9>] ? account_user_time+0x99/0xb0
[30223.621345]  [<ffffffff810a3ffd>] ? vtime_account_user+0x5d/0x70
[30223.621347]  [<ffffffff81767dfe>] do_page_fault+0x3e/0x80
[30223.621348]  [<ffffffff81764048>] page_fault+0x28/0x30
[30223.621349] ---[ end trace 6994cfef24b33d12 ]---
[30228.109878] Call Trace:
[30228.172131]  [<ffffffff81188d11>] handle_mm_fault+0xb1/0x160
[30228.233612]  [<ffffffff81767a0f>] __do_page_fault+0x19f/0x550
[30228.292696]  [<ffffffff8111a98c>] ? acct_account_cputime+0x1c/0x20
[30228.349825]  [<ffffffff810a39c9>] ? account_user_time+0x99/0xb0
[30228.404491]  [<ffffffff810a3ffd>] ? vtime_account_user+0x5d/0x70
[30228.457003]  [<ffffffff81767dfe>] do_page_fault+0x3e/0x80
[30228.507096]  [<ffffffff81764048>] page_fault+0x28/0x30
[30228.555589] Code: e9 4e fd ff ff 48 89 da 4c 89 fe 4c 89 ef 44 89 4d c8 e8 b7 fb ff ff 85 c0 44 8b 4d c8 0f 85 2b ff ff ff 49 8b 3f e9 6b fd ff ff <0f> 0b 4c 89 f2 48 89 d9 4c 89 e6 4c 89 ef 44 89 4d c8 e8 0c c3 
[30228.661865] RIP  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30228.714324]  RSP <ffff881b45ec9d88>
[30228.765619] ------------[ cut here ]------------
[30228.765694] ---[ end trace 6994cfef24b33d13 ]---
[30228.767565] [sched_delayed] sched: RT throttling activated
[30228.919922] kernel BUG at /home/apw/COD/linux/mm/memory.c:3924!
[30228.972355] invalid opcode: 0000 [#5] SMP 
[30229.024761] Modules linked in: rpcsec_gss_krb5 ip6t_REJECT nfsv4 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 nfsd dcdbas xt_conntrack auth_rpcgss ip6table_filter x86_pkg_temp_thermal ip6_tables intel_powerclamp nf_conntrack_netbios_ns nfs_acl nf_conntrack_broadcast nfs nf_nat_ftp coretemp nf_nat kvm_intel lockd nf_conntrack_ftp nf_conntrack kvm sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel fscache iptable_filter ip_tables aes_x86_64 x_tables lrw gf128mul glue_helper ablk_helper joydev cryptd mei_me sb_edac acpi_power_meter shpchp mei ipmi_si edac_core mac_hid lpc_ich wmi lp acpi_pad parport hid_generic ahci usbhid tg3 hid libahci ptp megaraid_sas pps_core
[30229.397995] CPU: 12 PID: 29190 Comm: java Tainted: G      D W     3.15.0-031500rc4-generic #201405042135
[30229.466626] Hardware name: Dell Inc. PowerEdge R720/0DCWD1, BIOS 2.2.2 01/16/2014
[30229.535738] task: ffff8817ef451920 ti: ffff8817f5ec4000 task.ti: ffff8817f5ec4000
[30229.605409] RIP: 0010:[<ffffffff81188c2d>]  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30229.676996] RSP: 0000:ffff8817f5ec5d88  EFLAGS: 00010246
[30229.748796] RAX: 0000000000000100 RBX: 00000007ffa08020 RCX: 0000000000000000
[30229.821952] RDX: ffff8817ef451920 RSI: 0000000000000009 RDI: 80000001f5e009e6
[30229.896234] RBP: ffff8817f5ec5dc8 R08: 0000000000000000 R09: 00000000000000a9
[30229.971146] R10: 0000000000000000 R11: 0000000000000006 R12: ffff881689222cf0
[30230.046324] R13: ffff882c268abb80 R14: ffff8817f5339fe8 R15: ffff882de1fb80f8
[30230.121769] FS:  00007f9cc424d700(0000) GS:ffff88181fac0000(0000) knlGS:0000000000000000
[30230.198862] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30230.276008] CR2: 00007f2494001000 CR3: 00000021cc72f000 CR4: 00000000001407e0
[30230.354260] Stack:
[30230.431662]  ffff8817f5ec5f18 ffffffff000000a9 0000000000000001 ffff8817ef451920
[30230.512595]  ffff882c268abb80 ffff881689222cf0 ffff881689222cf0 ffff882c268abb80
[30230.594011]  ffff8817f5ec5e08 ffffffff81188d11 ffff8817000000a9 00000007ffa08020
[30230.675852] Call Trace:
[30230.756953]  [<ffffffff81188d11>] handle_mm_fault+0xb1/0x160
[30230.839405]  [<ffffffff81767b77>] ? __do_page_fault+0x307/0x550
[30230.922083]  [<ffffffff81767a0f>] __do_page_fault+0x19f/0x550
[30231.004898]  [<ffffffff8111a98c>] ? acct_account_cputime+0x1c/0x20
[30231.088139]  [<ffffffff810a39c9>] ? account_user_time+0x99/0xb0
[30231.171509]  [<ffffffff810a3ffd>] ? vtime_account_user+0x5d/0x70
[30231.254694]  [<ffffffff81767dfe>] do_page_fault+0x3e/0x80
[30231.337728]  [<ffffffff81764048>] page_fault+0x28/0x30
[30231.420087] Code: e9 4e fd ff ff 48 89 da 4c 89 fe 4c 89 ef 44 89 4d c8 e8 b7 fb ff ff 85 c0 44 8b 4d c8 0f 85 2b ff ff ff 49 8b 3f e9 6b fd ff ff <0f> 0b 4c 89 f2 48 89 d9 4c 89 e6 4c 89 ef 44 89 4d c8 e8 0c c3 
[30231.595689] RIP  [<ffffffff81188c2d>] __handle_mm_fault+0x32d/0x360
[30231.682545]  RSP <ffff8817f5ec5d88>
[30231.769104] ---[ end trace 6994cfef24b33d14 ]---
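
To see at a glance whether the oopses in a long dmesg capture like the one above share one origin, the BUG lines can be tallied by source location. This is a minimal sketch, assuming the log has been saved as plain text in the same "[timestamp] message" format; the function name `summarize_bugs` and the regular expressions are illustrative and not part of any kernel tool.

```python
import re
from collections import Counter

# Matches e.g. "[30187.335401] kernel BUG at /path/to/memory.c:3924!"
BUG_RE = re.compile(r"^\[\s*(\d+\.\d+)\] kernel BUG at (\S+):(\d+)!")
# Matches e.g. "[30187.337183] invalid opcode: 0000 [#1] SMP"
OOPS_RE = re.compile(r"^\[\s*\d+\.\d+\] invalid opcode: \d+ \[#(\d+)\]")


def summarize_bugs(lines):
    """Count BUG sites, record when each was first seen, and track the oops counter."""
    sites = Counter()
    first_seen = {}
    highest_oops = 0
    for line in lines:
        m = BUG_RE.match(line)
        if m:
            site = f"{m.group(2)}:{m.group(3)}"
            sites[site] += 1
            first_seen.setdefault(site, float(m.group(1)))
            continue
        m = OOPS_RE.match(line)
        if m:
            highest_oops = max(highest_oops, int(m.group(1)))
    return sites, first_seen, highest_oops


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize_bugs.py < dmesg.txt
    sites, first_seen, highest_oops = summarize_bugs(sys.stdin)
    for site, count in sites.most_common():
        print(f"{site}: {count} hit(s), first at {first_seen[site]:.6f}s")
    print(f"highest oops counter: #{highest_oops}")
```

Run against the output above, every BUG line should collapse to the single site mm/memory.c:3924, reached through __handle_mm_fault in each trace.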


** Changed in: linux (Ubuntu)
       Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1315736

Title:
  Machine Check Exception

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  Dell PowerEdge R720 on Ubuntu 14.04 shows MCE errors in dmesg. Dell
  support instructed me to run DSET and the BIOS hardware diagnostics.
  Neither tool showed any errors. Dell support said that a hardware
  error would have shown up in the Dell logs, and that the probable
  reason for the dmesg entries is a bug in the Ubuntu kernel's MCE
  reporting.

  So, is the following dmesg output caused by a kernel bug in Ubuntu
  14.04 server?

  [11562.171040] Please check user daemon is running.
  [94953.306404] sbridge: HANDLING MCE MEMORY ERROR
  [94953.306415] CPU 1: Machine Check Exception: 0 Bank 9: 8c00004b000800c0
  [94953.306416] TSC 0 ADDR 2dfa0e1000 MISC 90000800080168c PROCESSOR 0:306e4 TIME 1399142359 SOCKET 1 APIC 20
  [94953.306422] sbridge: HANDLING MCE MEMORY ERROR
  [94953.306423] CPU 1: Machine Check Exception: 0 Bank 10: 8c000050000800c1
  [94953.306424] TSC 0 ADDR 2dfa0e1000 MISC 90000000000208c PROCESSOR 0:306e4 TIME 1399142359 SOCKET 1 APIC 20
  [94953.532217] EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#1_Channel#0_DIMM#0 (channel:0 slot:0 page:0x2dfa0e1 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c0 socket:1 channel_mask:3 rank:0)
  [94953.532226] EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#1_Channel#1_DIMM#0 (channel:1 slot:0 page:0x2dfa0e1 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c1 socket:1 channel_mask:3 rank:0)
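
The two status words in the MCE lines above can be decoded by hand to check what the hardware actually reported. This is a minimal sketch, assuming the IA32_MCi_STATUS bit layout described in the Intel SDM (VAL in bit 63, UC in bit 61, corrected error count in bits 52:38 on processors that report it, MCA error code in bits 15:0) and the memory-controller compound error code format `000F 0000 1MMM CCCC`. The helper names are illustrative and not part of any kernel tool.

```python
# Hypothetical helpers; bit positions assumed from the Intel SDM
# description of the IA32_MCi_STATUS register.
MEM_OPS = {
    0b000: "generic undefined request",
    0b001: "memory read error",
    0b010: "memory write error",
    0b011: "address/command error",
    0b100: "memory scrubbing error",
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into its main fields."""
    fields = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "context_corrupt": bool((status >> 57) & 1),
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": status & 0xFFFF,
    }
    code = fields["mca_code"]
    # Memory controller errors: 000F 0000 1MMM CCCC (bit 12, F, is ignored here).
    if (code & 0xEF80) == 0x0080:
        channel = code & 0xF
        fields["memory_op"] = MEM_OPS.get((code >> 4) & 0x7, "reserved")
        fields["channel"] = None if channel == 0xF else channel
    return fields


def page_of(addr):
    """Page frame number for a physical address, assuming 4 KiB pages."""
    return addr >> 12


if __name__ == "__main__":
    for status in (0x8C00004B000800C0, 0x8C000050000800C1):
        print(hex(status), decode_mci_status(status))
    print(hex(page_of(0x2DFA0E1000)))
```

Under those assumptions both words decode as valid, corrected (UC clear) memory scrubbing errors with a count of 1, on channels 0 and 1, and the reported ADDR falls in page 0x2dfa0e1. That agrees with the two EDAC "CE memory scrubbing error" lines.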

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-image-3.13.0-24-generic 3.13.0-24.46
  ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
  Uname: Linux 3.13.0-24-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 touko  2 19:15 seq
   crw-rw---- 1 root audio 116, 33 touko  2 19:15 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory: 'iw'
  CurrentDmesg: Error: command ['sh', '-c', 'dmesg | comm -13 --nocheck-order /var/log/dmesg -'] failed with exit code 1: comm: /var/log/dmesg: Permission denied
  Date: Sat May  3 21:52:07 2014
  InstallationDate: Installed on 2014-02-26 (66 days ago)
  InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Alpha amd64 (20140219)
  MachineType: Dell Inc. PowerEdge R720
  PciMultimedia:
   
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=c03eb237-955a-4dee-bba1-deded53df372 ro
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  WifiSyslog:
   
  dmi.bios.date: 01/16/2014
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.2.2
  dmi.board.name: 0DCWD1
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A01
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr2.2.2:bd01/16/2014:svnDellInc.:pnPowerEdgeR720:pvr:rvnDellInc.:rn0DCWD1:rvrA01:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R720
  dmi.sys.vendor: Dell Inc.
  --- 
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 touko  2 19:15 seq
   crw-rw---- 1 root audio 116, 33 touko  2 19:15 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  CurrentDmesg:
   Error: command ['sh', '-c', 'dmesg | comm -13 --nocheck-order /var/log/dmesg -'] failed with exit code 1: comm: /var/log/dmesg: Permission denied
   dmesg: write failed: Broken pipe
  DistroRelease: Ubuntu 14.04
  InstallationDate: Installed on 2014-02-26 (66 days ago)
  InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Alpha amd64 (20140219)
  MachineType: Dell Inc. PowerEdge R720
  Package: linux (not installed)
  PciMultimedia:
   
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=c03eb237-955a-4dee-bba1-deded53df372 ro
  ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty
  Uname: Linux 3.13.0-24-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  WifiSyslog:
   
  _MarkForUpload: True
  dmi.bios.date: 01/16/2014
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.2.2
  dmi.board.name: 0DCWD1
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A01
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr2.2.2:bd01/16/2014:svnDellInc.:pnPowerEdgeR720:pvr:rvnDellInc.:rn0DCWD1:rvrA01:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R720
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/+subscriptions

