kernel-packages team mailing list archive

[Bug 588993] Re: mcelog does not work due to lack of kernel support

 

For the record, below is the first corrected error (CE) that mcelog has
ever reported on the hardware I have access to. I have seen no
uncorrected errors (UCEs), nor even repeating CEs. I have not tested
mce-inject or the apei/einj.ko kernel module.

mcelog: failed to prefill DIMM database from DMI data
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 0 
TIME 1422983269 Tue Feb  3 19:07:49 2015
MCG status:
MCi status:
Corrected error
Error enabled
MCA: Internal parity error
STATUS 90000040000f0005 MCGSTATUS 0
MCGCAP c09 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 60
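
The STATUS value above can be cross-checked by hand against the decoded lines mcelog printed. The following is a minimal sketch, assuming the architectural IA32_MCi_STATUS bit layout from the Intel SDM (VAL=63, OVER=62, UC=61, EN=60, corrected error count in bits 52:38, MCA error code in bits 15:0). The function name and the small code table are illustrative and are not taken from mcelog itself; the table covers only a few simple error codes.

```python
# Sketch: decode an IA32_MCi_STATUS value as printed by mcelog.
# Bit positions assume the architectural layout in the Intel SDM;
# model-specific fields are returned raw and not interpreted.

STATUS = 0x90000040000F0005  # value from the mcelog record above

# Partial table of "simple" MCA error codes (illustrative subset only).
SIMPLE_MCA_CODES = {
    0x0000: "No error",
    0x0001: "Unclassified",
    0x0002: "Microcode ROM parity error",
    0x0003: "External error",
    0x0004: "FRC error",
    0x0005: "Internal parity error",
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into its main fields."""

    def bit(n):
        return bool((status >> n) & 1)

    return {
        "valid": bit(63),
        "overflow": bit(62),
        "uncorrected": bit(61),
        "enabled": bit(60),
        "misc_valid": bit(59),
        "addr_valid": bit(58),
        "context_corrupt": bit(57),
        # Meaningful only when MCG_CAP reports CMCI support (bit 10).
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    fields = decode_mci_status(STATUS)
    for name, value in fields.items():
        print(f"{name}: {value}")
    print("MCA:", SIMPLE_MCA_CODES.get(fields["mca_code"], "not in table"))
```

With UC clear and EN set, this agrees with the "Corrected error" and "Error enabled" lines, and the low 16 bits (0x0005) match "Internal parity error".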

Syslog entries for mcelog startup on an up-to-date trusty system still report that page offlining is unsupported:
Feb 19 08:26:04 gx1 mcelog: failed to prefill DIMM database from DMI data
Feb 19 08:26:04 gx1 mcelog: Kernel does not support page offline interface
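
To see whether the running kernel exposes a page offline interface at all, one can look for the sysfs control files directly. This is a minimal sketch, assuming (not confirmed in this report) that the message refers to the soft_offline_page and hard_offline_page files under /sys/devices/system/memory, which as far as I know are present only on kernels built with memory failure handling. The function name is hypothetical.

```python
# Sketch: check for the sysfs page offline control files.
# Assumption: these are the files mcelog needs to be able to write to.
import os

OFFLINE_FILES = ("soft_offline_page", "hard_offline_page")


def page_offline_support(sysfs_memory_dir="/sys/devices/system/memory"):
    """Map each offline file name to whether it exists and is writable."""
    return {
        name: os.access(os.path.join(sysfs_memory_dir, name), os.W_OK)
        for name in OFFLINE_FILES
    }


if __name__ == "__main__":
    for name, ok in page_offline_support().items():
        state = "writable" if ok else "missing or not writable"
        print(f"{name}: {state}")
```

Run it as root, since the files are normally writable only by root; a "missing or not writable" result for an unprivileged user says nothing about kernel support.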


** Tags added: trusty

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/588993

Title:
  mcelog does not work due to lack of kernel support

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  mcelog does not seem to be happy with the current Ubuntu kernel. This
  is all I get in /var/log/syslog:

  # grep mce /var/log/syslog
  Jun  2 18:14:15 xxxxx mcelog: failed to prefill DIMM database from DMI data
  Jun  2 18:14:15 xxxxx mcelog: Kernel does not support page offline interface

  I can disable the memory database prefill option in
  /etc/mcelog/mcelog.conf, which silences the first error message.
  However, dmidecode seems to work just fine and produces
  reasonable-looking output, so I am not sure why the prefill option of
  mcelog fails. Also, mcelog never outputs anything, so I suspect it is
  not functional.
  I have a cluster of identical hardware and I can simulate a constant
  stream of MCE errors by misconfiguring memory settings in BIOS. The
  errors are successfully fetched and reported by mcelog on the nodes
  with CentOS 5 installed (kernel 2.6.18-164.15.1.el5). Nothing is
  reported under Ubuntu Server 10.04 LTS on this same hardware.

  # lsb_release -rd
  Description:    Ubuntu 10.04 LTS
  Release:        10.04

  # uname -a
  Linux xxxxx 2.6.32-22-server #33-Ubuntu SMP Wed Apr 28 14:34:48 UTC 2010 x86_64 GNU/Linux

  # apt-cache policy mcelog
  mcelog:
    Installed: 1.0~pre3-1
    Candidate: 1.0~pre3-1
    Version table:
   *** 1.0~pre3-1 0
          500 http://us.archive.ubuntu.com/ubuntu/ lucid/universe Packages
          100 /var/lib/dpkg/status
