kernel-packages team mailing list archive

[Bug 1370425] Re: kernel bug seen while try to use madvise system call with MADV_HWPOISON mode

 

I've submitted two patches for this:

http://patchwork.ozlabs.org/patch/392712/
http://patchwork.ozlabs.org/patch/392713/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1370425

Title:
  kernel bug seen while try to use madvise system call with
  MADV_HWPOISON mode

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  Problem Description
  ====================
  A kernel BUG is hit when using the madvise system call with MADV_HWPOISON.
   
  ---uname output---
  Linux u10thp 3.16.0-9-generic #14-Ubuntu SMP Fri Aug 15 15:03:36 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = Power 8 
   
  Steps to Reproduce
  ====================
  1.  Install an Ubuntu 14.10 guest on PowerKVM.
  2.  Set up hugepage backing for the guest VM.
  3.  Build and run the attached madv_poison.c to test the madvise system call in MADV_HWPOISON mode:
  	gcc -o madv_poison madv_poison.c
  	./madv_poison -C -i 1 		(1 - shm_test)
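
For readers without the attachment, the core of what a shared-memory poison test has to do can be sketched as below. This is a minimal sketch, not the attached madv_poison.c: the helper names map_dirty_shared_page and poison_page are hypothetical, and it assumes a Linux kernel built with CONFIG_MEMORY_FAILURE and a caller holding CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100 /* value used by the Linux UAPI headers */
#endif

/* Hypothetical helper: map one shared anonymous page and dirty it.
 * Returns NULL on failure; on success stores the page size in *len_out. */
char *map_dirty_shared_page(size_t *len_out)
{
    assert(len_out != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    p[0] = 1; /* write once so the page is populated and dirty */
    *len_out = (size_t)page;
    return p;
}

/* Hypothetical helper: ask the kernel to treat the page as corrupted.
 * Returns 0 on success, otherwise the errno value set by madvise(). */
int poison_page(void *p, size_t len)
{
    return madvise(p, len, MADV_HWPOISON) == 0 ? 0 : errno;
}
```

After poison_page(p, len) returns 0, a write to p[0] is the access that is expected to deliver SIGBUS. On the ppc64le guest in this report, the equivalent access ended in the kernel BUG at fault.c:180 instead.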

  Ubuntu 14.10 LE hits a kernel BUG:
  root@u10thp:~# ./madv_poison -C -i 1
  vm.memory_failure_early_kill = 0
  [pid 2301] start page-poisoning test
  [pid 2301] there are 1 shm_child
  [pid 2301] have spawned 1 processes
  [pid 2301] wait for Pid 2304
  [pid 2304] shm dirty poisoning page 0x3fffa7ce0000
  [ 7905.009001] Injecting memory failure for page 0xe6a7 at 0x3fffa7ce0000
  [ 7905.009359] MCE 0xe6a7: dirty LRU page recovery: Recovered
  [pid 2304] writing 2
  [ 7905.009901] ------------[ cut here ]------------
  [ 7905.010164] kernel BUG at /build/buildd/linux-3.16.0/arch/powerpc/mm/fault.c:180!
  [ 7905.010396] Oops: Exception in kernel mode, sig: 5 [#234]
  [ 7905.010438] SMP NR_CPUS=2048 NUMA pSeries
  [ 7905.010480] Modules linked in: pseries_rng rtc_generic ohci_pci
  [ 7905.010614] CPU: 0 PID: 2304 Comm: madv_poison Tainted: G      D       3.16.0-9-generic #14-Ubuntu
  [ 7905.010686] task: c0000000e0a92a60 ti: c0000000e09e8000 task.ti: c0000000e09e8000
  [ 7905.010746] NIP: c0000000009e3314 LR: c0000000009e2e54 CTR: 0000000000000000
  [ 7905.010864] REGS: c0000000e09eb990 TRAP: 0700   Tainted: G      D        (3.16.0-9-generic)
  [ 7905.010924] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28002882  XER: 00000000
  [ 7905.011125] CFAR: c0000000009e3170 SOFTE: 1 
  GPR00: c0000000009e2e54 c0000000e09ebc10 c0000000013742e0 0000000000000010 
  GPR04: c0000000e0b37ff8 00003fffa7ce0000 00000000000000a9 0000000000000000 
  GPR08: 0000000000000000 0000000000000010 c0000000e0a92a60 0000000000000020 
  GPR12: 0000000048002884 c00000000fe40000 0000000000000000 0000000000000000 
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
  GPR20: 00000000000000a9 0000000000000000 c0000000e0597a40 c0000000e022b060 
  GPR24: 0000000000000010 c0000000e022b000 c000000000009568 00003fffa7ce0000 
  GPR28: 0000000000000000 0000000000000000 0000000002000000 c0000000e09ebea0 
  [ 7905.012189] NIP [c0000000009e3314] do_page_fault+0x984/0x990
  [ 7905.012241] LR [c0000000009e2e54] do_page_fault+0x4c4/0x990
  [ 7905.012281] Call Trace:
  [ 7905.012361] [c0000000e09ebc10] [c0000000009e2e54] do_page_fault+0x4c4/0x990 (unreliable)
  [ 7905.012434] [c0000000e09ebe30] [c000000000009568] handle_page_fault+0x10/0x30
  [ 7905.012494] Instruction dump:
  [ 7905.012580] e92d0290 e8690460 38630060 4b7274d9 60000000 e93f0108 3bc00000 792a97e3 
  [ 7905.012683] 4082f77c 3bc00009 60000000 4bfff774 <0fe00000> 00000000 00000000 3c4c0099 
  [ 7905.012845] ---[ end trace a48a199a061eed79 ]---
  [ 7905.019084] 
  [pid 2301] Ins 0: Pid 2304: failed - shared memory test
  [pid 2301] 	!!! Page Poisoning Test is FAILED (1 failures found). !!!

  [pid 2301] page-poisoning test done!
  root@u10thp:~# 

  == Comment: #1 - Kalpana Shetty <kalshett@xxxxxxxxxx> -  ==
  The test code works on an x86 Ubuntu VM. If MADV_HWPOISON is not supported on Power, the kernel should return a "not supported" error, as it does on a PowerKVM / RHEL 7 VM, rather than hitting a kernel BUG.

  Intel/Ubuntu 14.04 VM (works as expected):
  root@u04vm14:~# ./madv_poison -C -i 1               (shm_test case)
  vm.memory_failure_early_kill = 0
  [pid 7325] start page-poisoning test
  [pid 7325] there are 1 shm_child
  [pid 7325] have spawned 1 processes
  [pid 7325] wait for Pid 7328
  [pid 7328] shm dirty poisoning page 0x7f60ca8ea000
  [pid 7328] writing 2
  [pid 7328] signal 7 code 4 addr 0x7f60ca8ea000
  [pid 7328] pass: recovered
  [pid 7325] Ins 0: Pid 7328: pass - shared memory test
  [pid 7325] 	!!! Page Poisoning Test got PASS. !!!

  [pid 7325] page-poisoning test done!
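
The "signal 7 code 4" line in the passing run can be read as follows: on x86 and powerpc Linux, signal 7 is SIGBUS, and an si_code of 4 is BUS_MCEERR_AR (a hardware memory error that requires action). A minimal sketch of that decoding is shown here; the enum and the function name classify_sigbus are hypothetical and not taken from the attached test program.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>

/* Fallbacks for C libraries that do not expose these si_code values. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

enum sigbus_cause {
    CAUSE_NOT_SIGBUS,          /* some other signal */
    CAUSE_MCE_ACTION_REQUIRED, /* poisoned page was accessed */
    CAUSE_MCE_ACTION_OPTIONAL, /* poison detected in the background */
    CAUSE_OTHER_SIGBUS         /* alignment error, bad mapping, ... */
};

/* Hypothetical helper: interpret the (signal, si_code) pair that a
 * SA_SIGINFO handler would receive in its siginfo_t argument. */
enum sigbus_cause classify_sigbus(int signo, int code)
{
    assert(signo > 0); /* valid signal numbers are positive */

    if (signo != SIGBUS)
        return CAUSE_NOT_SIGBUS;
    if (code == BUS_MCEERR_AR)
        return CAUSE_MCE_ACTION_REQUIRED;
    if (code == BUS_MCEERR_AO)
        return CAUSE_MCE_ACTION_OPTIONAL;
    return CAUSE_OTHER_SIGBUS;
}
```

A handler installed with sigaction() and SA_SIGINFO could pass its signal number and info->si_code to this function, with info->si_addr giving the poisoned address reported as "addr" in the log.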

  PowerKVM / RHEL 7 VM (reports that poison injection is unsupported):
  [root@rhel7-web-VM1 ~]# ./madv_poison -C -i 1
  sysctl: cannot stat /proc/sys/vm/memory_failure_early_kill: No such file or directory
  [pid 11512] start page-poisoning test
  [pid 11512] there are 1 shm_child
  [pid 11512] have spawned 1 processes
  [pid 11514] shm dirty poisoning page 0x3fff84d60000
  [pid 11512] wait for Pid 11514
  [pid 11514] failed: Kernel doesn't support poison injection ============================> unsupported error.
  [pid 11512] Ins 0: Pid 11514: failed - shared memory test
  [pid 11512] 	!!! Page Poisoning Test is FAILED (1 failures found). !!!

  [pid 11512] page-poisoning test done!
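
The RHEL 7 result is consistent with a kernel built without CONFIG_MEMORY_FAILURE: the memory_failure_early_kill sysctl is missing, and madvise() rejects MADV_HWPOISON with EINVAL on such kernels. How the attached test maps errno values to its messages is not shown in this report, so the following is only a sketch under that assumption; the enum and the function name classify_poison_errno are hypothetical.

```c
#include <assert.h>
#include <errno.h>

enum poison_status {
    POISON_OK,           /* madvise() succeeded */
    POISON_UNSUPPORTED,  /* EINVAL: kernel does not know MADV_HWPOISON */
    POISON_NO_PRIVILEGE, /* EPERM: caller lacks CAP_SYS_ADMIN */
    POISON_OTHER_ERROR   /* anything else */
};

/* Hypothetical helper: classify the errno left by
 * madvise(addr, len, MADV_HWPOISON); pass 0 when the call succeeded. */
enum poison_status classify_poison_errno(int err)
{
    assert(err >= 0); /* errno values are non-negative */

    switch (err) {
    case 0:
        return POISON_OK;
    case EINVAL:
        return POISON_UNSUPPORTED;
    case EPERM:
        return POISON_NO_PRIVILEGE;
    default:
        return POISON_OTHER_ERROR;
    }
}
```

Under this reading, the behavior requested in the comment above corresponds to the POISON_UNSUPPORTED path: a clean error return to user space rather than a kernel BUG.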

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1370425/+subscriptions