kernel-packages team mailing list archive

[Bug 1398497] Re: HP Proliant Serverrs - DL360 and DL380 Gen8 - Precise Kernel Panic - General Protection Fault

 

Commits touching cmpxchg_double (the last function shown at the
instruction pointer):

# NOT PRESENT IN KERNEL 3.2

0aa9a13d80bae1bb24956f6e3e2662b7242e0b41 mm, slub: fix some indenting in cmpxchg_double_slab()
b1d6b40cbd0d6ff475b6a0a7a807a1e3bee7c033 s390/cmpxchg,percpu: implement cmpxchg_double()

*** d24ac77f71ded6a013bacb09f359eac0b0f29a80 slub: use __cmpxchg_double_slab() at interrupt disabled place
*** cdcd629869fabcd38ebd24a03b0a05ec1cbcafb0 x86: Fix and improve cmpxchg_double{,_local}()
|__> !!! fixes several problems related to cmpxchg and 64-bit !!! -> NEXT STEP?

# PRESENT IN KERNEL 3.2

1d07171c5e58e68a76a141970a3a5e816a414ce6 slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock
b789ef518b2a7231b0668c813f677cee528a9d3f slub: Add cmpxchg_double_slab()
3824abd1279ef75f791c43a6b1e3162ae0692b42 x86: Add support for cmpxchg_double
d4d84fef6d0366b585b7de13527a0faeca84d9ce slub: always align cpu_slab to honor cmpxchg_double requirement
d7c3f8cee81f4548de0513403b74131aee655576 percpu: Omit segment prefix in the UP case for cmpxchg_double
4fdccdfbb4652a7bbac8adbce7449eb093775118 slub: Add statistics for this_cmpxchg_double failures
b9ec40af0e18fb7d02106be148036c2ea490fdf9 percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
7c3343392172ba98d9d90a83edcc4c2e80897009 percpu: Generic support for this_cpu_cmpxchg_double()

We might need to backport commit
"cdcd629869fabcd38ebd24a03b0a05ec1cbcafb0" to 3.2:

commit cdcd629869fabcd38ebd24a03b0a05ec1cbcafb0
Author: Jan Beulich <JBeulich@xxxxxxxx>
Date:   Mon Jan 2 17:02:18 2012 +0000

    x86: Fix and improve cmpxchg_double{,_local}()
    
    Just like the per-CPU ones they had several
    problems/shortcomings:
    
    Only the first memory operand was mentioned in the asm()
    operands, and the 2x64-bit version didn't have a memory clobber
    while the 2x32-bit one did. The former allowed the compiler to
    not recognize the need to re-load the data in case it had it
    cached in some register, while the latter was overly
    destructive.
    
    The types of the local copies of the old and new values were
    incorrect (the types of the pointed-to variables should be used
    here, to make sure the respective old/new variable types are
    compatible).
    
    The __dummy/__junk variables were pointless, given that local
    copies of the inputs already existed (and can hence be used for
    discarded outputs).
    
    The 32-bit variant of cmpxchg_double_local() referenced
    cmpxchg16b_local().
    
    At once also:
    
     - change the return value type to what it really is: 'bool'
     - unify 32- and 64-bit variants
     - abstract out the common part of the 'normal' and 'local' variants
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Cc: Christoph Lameter <cl@xxxxxxxxx>
    Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
    Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
    Link: http://lkml.kernel.org/r/4F01F12A020000780006A19B@xxxxxxxxxxxxxxxxxxxx
    Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

in order to prevent this "CPU General Failure" from happening.
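
For reference, the contract that cmpxchg_double() is expected to honour
can be modelled in plain C. This is only a minimal sketch: struct
word_pair and cmpxchg_double_model() are hypothetical names, not kernel
code, and the model is deliberately NOT atomic (the real x86 macro relies
on the cmpxchg8b/cmpxchg16b instructions). It only illustrates the 'bool'
return value mentioned in the commit message and the all-or-nothing
update of the two adjacent words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: two adjacent machine words that are compared and
 * updated as a unit. */
struct word_pair {
    uintptr_t first;
    uintptr_t second;
};

/*
 * Hypothetical, NON-atomic reference model of a double compare-and-exchange.
 * Both words are replaced only if both still hold the expected old values.
 * Returns true when the exchange happened, false when memory was left alone.
 */
static bool cmpxchg_double_model(struct word_pair *p,
                                 uintptr_t old1, uintptr_t old2,
                                 uintptr_t new1, uintptr_t new2)
{
    assert(p != NULL);

    if (p->first != old1 || p->second != old2)
        return false;   /* at least one word changed: store nothing */

    p->first = new1;
    p->second = new2;
    return true;
}
```

A caller would typically loop: re-read both words, compute the new pair,
and retry while the function returns false. The compiler issue described
in the commit message matters exactly at that re-read, because a value
kept cached in a register would make every retry compare against stale
data.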

Thanks

Rafael Tinoco

** Tags added: cts precise

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Rafael David Tinoco (inaddy)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1398497

Title:
  HP Proliant Serverrs - DL360 and DL380 Gen8 - Precise Kernel Panic -
  General Protection Fault

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Precise:
  Incomplete

Bug description:
  The following situation was brought to my attention:

  """
  We massively upgraded our Ubuntu 12.04 servers (most of them are HP
  DL360p Gen8 or DL380 Gen8) to 3.2.0-67 kernel And in the last 2-3
  days we already had to reboot 5 of them because they completely hang

  Some of them had the following messages under syslog :
  kernel: [384707.675479] general protection fault: 0000 [#5666] SMP

  others had :
  kernel: [950725.612724] BUG: unable to handle kernel paging request

  All of them have this also :
  your BIOS is broken and requested that x2apic be disabled
  """

  Comments below

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1398497/+subscriptions

