kernel-packages team mailing list archive

[Bug 1354024] Comment bridged from LTC Bugzilla

 

------- Comment From chavez@xxxxxxxxxx 2015-02-05 22:52 EDT-------
(In reply to comment #19)
> Can you re-test with the latest 3.16 kernel to see if this is still an issue?
> Thanks,

I ran the YCSB workload on MongoDB again after dist-upgrading to a later
kernel version and have not seen any issues.

uname on mongo and ycsb VMs:
root@mongosrv:~# uname -a
Linux mongosrv 3.16.0-17-generic #23-Ubuntu SMP Fri Sep 19 16:54:14 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

root@ycsb:~# uname -a
Linux ycsb 3.16.0-17-generic #23-Ubuntu SMP Fri Sep 19 16:54:14 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
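For comparing the re-tested kernel with the one in the original report, a minimal sketch that pulls the release out of a `uname -a` line. The helper name `kernel_abi` is hypothetical, and it assumes the Ubuntu `<version>-<ABI>-<flavour>` release layout seen on the guests, so it does not apply to the PowerKVM host's release string.

```python
def kernel_abi(uname_line):
    """Return (upstream_version, abi_number) from an Ubuntu `uname -a` line.

    Assumes the third field looks like "3.16.0-17-generic".
    """
    release = uname_line.split()[2]
    version, abi, _flavour = release.split("-", 2)
    return version, int(abi)


# The two guest kernels quoted in this bug: the re-test and the original report.
retested = ("Linux mongosrv 3.16.0-17-generic #23-Ubuntu SMP Fri Sep 19 "
            "16:54:14 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux")
reported = ("Linux u10vm15 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 "
            "02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux")

print(kernel_abi(reported), "->", kernel_abi(retested))
```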

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1354024

Title:
  Running YCSB workload on MongoDB on Ubuntu 14.10 VM resulted in kernel
  bug

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  == Comment: #0 - Kalpana Shetty <kalshett@xxxxxxxxxx> - 2014-08-05 23:53:28 ==
  ---Problem Description---
  Running YCSB workload on MongoDB on Ubuntu 14.10 VM resulted in kernel bug
   
  ---uname output---
  root@u10vm15:~# uname -a
  Linux u10vm15 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
   
  ---Additional Hardware Info---
  Power 8 - Tuleta 

   Machine Type = POWER 8 
   
  ---System Hang---
   The Ubuntu 14.10 LE guest needs to be restarted when this issue is seen.
   
   Steps to reproduce:
  - Install Ubuntu 14.10 (July 30th build) on 2 VMs
  - Run MongoDB 2.6.2 on one PowerKVM VM
  - Run YCSB 0.1.4 on the other VM
  - Create a 1 million record load on MongoDB using YCSB; allow it to run for 4 to 5 hours or so.

  Setup details:
  - MongoDB server on one VM (version: 2.6.2)
  - YCSB workload running on one VM (YCSB version - ycsb-0.1.4)

  uname on Host:
  [root@powerkvm5-lp1 ~]# uname -a
  Linux powerkvm5-lp1.austin.ibm.com 3.10.42-2004.pkvm2_1_1.8.ppc64 #1 SMP Fri Jul 18 11:20:03 CDT 2014 ppc64 ppc64 ppc64 GNU/Linux

  uname on Guest OS:
  root@u10vm15:~# uname -a
  Linux u10vm15 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

  [23001.071911] ------------[ cut here ]------------
  [23001.071922] kernel BUG at /build/buildd/linux-3.16.0/fs/dcache.c:1626!
  [23001.072917] Oops: Exception in kernel mode, sig: 5 [#1]
  [23001.073620] SMP NR_CPUS=2048 NUMA pSeries
  [23001.074149] Modules linked in: pseries_rng rtc_generic ohci_pci
  [23001.075162] CPU: 8 PID: 3384 Comm: updatedb.mlocat Not tainted 3.16.0-6-generic #11-Ubuntu
  [23001.076006] task: c000000006e00000 ti: c000000130364000 task.ti: c000000130364000
  [23001.076834] NIP: c0000000002abc68 LR: c0000000002abf90 CTR: c00000000001f880
  [23001.077650] REGS: c0000001303676d0 TRAP: 0700   Not tainted  (3.16.0-6-generic)
  [23001.078468] MSR: 8000000100029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004842  XER: 20000000
  [23001.080432] CFAR: c0000000002abf8c SOFTE: 1 
  [23001.080432] GPR00: c0000000002abf90 c000000130367950 c000000001346618 c000000005dd0000 
  [23001.080432] GPR04: 0000000000000000 0000000000001000 c000000005dcd170 0000000000000fcc 
  [23001.080432] GPR08: 0000000000001000 0000000000000001 8803dabf05ffffff 000000000016eb0c 
  [23001.080432] GPR12: 0000000000004400 c00000000fe41c00 0000000000000000 0000000000100000 
  [23001.080432] GPR16: 000000001001d660 0000010034e94ec0 0000000000000001 0000000053d94034 
  [23001.080432] GPR20: 0000000000000000 0000000000000001 00003fffcaa1efb8 0000010034e842e0 
  [23001.080432] GPR24: 0000000000000000 0000000000000000 0000010034e94ec0 ffffffffffffff9c 
  [23001.080432] GPR28: 0000000000000040 0000000000000000 c000000005dd0000 0000000000000000 
  [23001.091266] NIP [c0000000002abc68] d_instantiate+0x38/0xf0
  [23001.091837] LR [c0000000002abf90] d_splice_alias+0x60/0x1a0
  [23001.092404] Call Trace:
  [23001.092692] [c000000130367980] [c0000000002abf90] d_splice_alias+0x60/0x1a0
  [23001.093544] [c0000001303679c0] [c00000000034c5b4] ext4_lookup+0xc4/0x1c0
  [23001.094399] [c000000130367a50] [c000000000299944] lookup_real+0x64/0xc0
  [23001.095261] [c000000130367a90] [c00000000029a790] __lookup_hash+0x60/0x80
  [23001.096106] [c000000130367ae0] [c00000000029d610] lookup_slow+0x70/0x110
  [23001.096946] [c000000130367b20] [c00000000029ea08] path_lookupat+0x958/0x9a0
  [23001.097804] [c000000130367be0] [c00000000029eaa8] filename_lookup+0x58/0x140
  [23001.098648] [c000000130367c30] [c0000000002a2524] user_path_at_empty+0x84/0xe0
  [23001.099580] [c000000130367d20] [c0000000002937e4] vfs_fstatat+0x84/0x140
  [23001.100432] [c000000130367d80] [c000000000293eb4] SyS_newlstat+0x34/0x60
  [23001.101378] [c000000130367e30] [c00000000000a0fc] syscall_exit+0x0/0x7c
  [23001.102193] Instruction dump:
  [23001.102589] 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ffd1 7c7e1b78 7c9f2378 60000000 
  [23001.103945] 60000000 e93e00b8 3149ffff 7d2a4910 <0b090000> 2fbf0000 419e0060 387f0088 
  [23001.105276] ---[ end trace b20dd6fbb5b21932 ]---
  [23001.118598] 
  root@u10vm15:~# 

  After rebooting, I keep seeing the call traces below:
  Ubuntu Utopic Unicorn (development branch) u10vm15 hvc0

  u10vm15 login: root
  Password: 
  Last login: Wed Aug  6 00:02:18 IST 2014 on hvc0
  Welcome to Ubuntu Utopic Unicorn (development branch) (GNU/Linux 3.16.0-6-generic ppc64le)

   * Documentation:  https://help.ubuntu.com/
  [32950.678160] systemd-logind[1071]: Removed session c1.
  [32950.694697] systemd-logind[1071]: New session c2 of user root.
  [32950.703411] Unable to handle kernel paging request for data at address 0x2f0000000000000
  [32950.704886] Faulting instruction address: 0xc000000000260290
  [32950.706148] Oops: Kernel access of bad area, sig: 11 [#2]
  [32950.707098] SMP NR_CPUS=2048 NUMA pSeries
  [32950.708098] Modules linked in: pseries_rng rtc_generic ohci_pci
  [32950.709651] CPU: 8 PID: 342 Comm: cgmanager Tainted: G      D       3.16.0-6-generic #11-Ubuntu
  [32950.711433] task: c00000012e3854c0 ti: c00000012e410000 task.ti: c00000012e410000
  [32950.712938] NIP: c000000000260290 LR: c000000000260384 CTR: c0000000004099f0
  [32950.715490] REGS: c00000012e413970 TRAP: 0300   Tainted: G      D        (3.16.0-6-generic)
  [32950.718098] MSR: 8000000100009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28002448  XER: 00000000
  [32950.723688] CFAR: c000000000010a30 DAR: 02f0000000000000 DSISR: 40000000 SOFTE: 1 
  GPR00: c000000000260384 c00000012e413bf0 c000000001346618 0000000000000000 
  GPR04: 00000000000000d0 c0000000012a4ad0 0000000000000001 c000000001919db0 
  GPR08: 0000000000000e90 0000000000000000 0000000000b30000 00065581e050bfe3 
  GPR12: c0000000004099f0 c00000000fe41c00 fffffffffffffe80 fffffffffffffe90 
  GPR16: fffffffffffffea0 fffffffffffffeb0 fffffffffffffec0 fffffffffffffed0 
  GPR20: fffffffffffffee0 fffffffffffffef0 ffffffffffffff00 ffffffffffffff10 
  GPR24: ffffffffffffff20 00003ffff2b48de0 c00000013604c600 00003ffff2b48b18 
  GPR28: c0000000002adc8c 00000000000000d0 02f0000000000000 c00000013604c600 
  [32950.742460] NIP [c000000000260290] kmem_cache_alloc+0x90/0x2d0
  [32950.743997] LR [c000000000260384] kmem_cache_alloc+0x184/0x2d0
  [32950.745105] Call Trace:
  [32950.745637] [c00000012e413bf0] [c000000000260384] kmem_cache_alloc+0x184/0x2d0 (unreliable)
  [32950.747609] [c00000012e413c40] [c0000000002adc8c] __d_alloc+0x4c/0x1c0
  [32950.749004] [c00000012e413c80] [c00000000083bd58] sock_alloc_file+0x78/0x170
  [32950.750433] [c00000012e413ce0] [c000000000841244] SyS_accept4+0xd4/0x280
  [32950.751833] [c00000012e413dc0] [c000000000842c50] SyS_socketcall+0x3c0/0x400
  [32950.753239] [c00000012e413e30] [c00000000000a0fc] syscall_exit+0x0/0x7c
  [32950.754577] Instruction dump:
  [32950.755324] 7f5fd378 e94d0040 e93f0000 7ce95214 e9070008 7fc9502a e9270010 2fbe0000 
  [32950.757463] 41de0070 2fa90000 419e0068 e93f0022 <7f7e482a> 39200000 88cd02ba 992d02ba 
  [32950.759787] ---[ end trace b20dd6fbb5b21933 ]---
  [32950.775708] 
  [32950.813539] systemd-logind[1071]: cgmanager: Error pinging manager: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
  [32950.821827] systemd-logind[1071]: Failed to create cpuset:/user/0.user/c2.session: No such file or directory
  [32950.825834] init: cgmanager main process (342) killed by SEGV signal
  [32950.828432] systemd-logind[1071]: Failed to create devices:/user/0.user/c2.session: No such file or directory
  [32950.832066] init: cgmanager main process ended, respawning
  [32950.833739] systemd-logind[1071]: Failed to create freezer:/user/0.user/c2.session: No such file or directory
  [32950.836759] systemd-logind[1071]: Failed to create hugetlb:/user/0.user/c2.session: No such file or directory
  [32950.839464] systemd-logind[1071]: Failed to create memory:/user/0.user/c2.session: No such file or directory
  [32950.842165] systemd-logind[1071]: Failed to create perf_event:/user/0.user/c2.session: No such file or directory
  [32950.844964] systemd-logind[1071]: Failed to create net_cls:/user/0.user/c2.session: No such file or directory
  [32950.847865] systemd-logind[1071]: Failed to create net_prio:/user/0.user/c2.session: No such file or directory
  [32950.862660] Unable to handle kernel paging request for data at address 0x2f0000000000000
  [32950.864661] Faulting instruction address: 0xc000000000260290
  [32950.866115] Oops: Kernel access of bad area, sig: 11 [#3]
  [32950.867269] SMP NR_CPUS=2048 NUMA pSeries
  [32950.868437] Modules linked in: pseries_rng rtc_generic ohci_pci
  [32950.870457] CPU: 8 PID: 3550 Comm: sh Tainted: G      D       3.16.0-6-generic #11-Ubuntu
  [32950.872181] task: c00000012d70e910 ti: c00000012d7a0000 task.ti: c00000012d7a0000
  [32950.873907] NIP: c000000000260290 LR: c000000000260384 CTR: c000000000409c60
  [32950.875637] REGS: c00000012d7a3780 TRAP: 0300   Tainted: G      D        (3.16.0-6-generic)
  [32950.877373] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22002888  XER: 20000000
  [32950.881408] CFAR: c00000000006aec8 DAR: 02f0000000000000 DSISR: 40000000 SOFTE: 1 
  GPR00: c000000000260384 c00000012d7a3a00 c000000001346618 0000000000000000 
  GPR04: 00000000000000d0 0000000000000001 c00000012d7a3b40 c000000001919db0 
  GPR08: 0000000000000e90 0000000000000000 0000000000b30000 c000000007c8a02d 
  GPR12: 0000000000002200 c00000000fe41c00 0000000000000000 0000000056fbd0b8 
  GPR16: 0000000056fbffa8 0000000056fbfeb8 0000010013380328 00003fffc4b6fed6 
  GPR20: 0000000000000000 0000010013380340 0000000056fbfe60 0000000000000004 
  GPR24: c00000012eeca100 c00000012d538300 c00000013604c600 0000000000000001 
  GPR28: c0000000002adc8c 00000000000000d0 02f0000000000000 c00000013604c600 
  [32950.904699] NIP [c000000000260290] kmem_cache_alloc+0x90/0x2d0
  [32950.905897] LR [c000000000260384] kmem_cache_alloc+0x184/0x2d0
  [32950.907011] Call Trace:
  [32950.907437] [c00000012d7a3a00] [c000000000260384] kmem_cache_alloc+0x184/0x2d0 (unreliable)
  [32950.909026] [c00000012d7a3a50] [c0000000002adc8c] __d_alloc+0x4c/0x1c0
  [32950.910130] [c00000012d7a3a90] [c0000000002ade38] d_alloc+0x38/0xd0
  [32950.911170] [c00000012d7a3ad0] [c00000000029a6fc] lookup_dcache+0x10c/0x140
  [32950.912159] [c00000012d7a3b20] [c00000000029a774] __lookup_hash+0x44/0x80
  [32950.913141] [c00000012d7a3b70] [c00000000029d610] lookup_slow+0x70/0x110
  [32950.914120] [c00000012d7a3bb0] [c00000000029ea08] path_lookupat+0x958/0x9a0
  [32950.915095] [c00000012d7a3c70] [c00000000029eaa8] filename_lookup+0x58/0x140
  [32950.916069] [c00000012d7a3cc0] [c0000000002a2524] user_path_at_empty+0x84/0xe0
  [32950.917208] [c00000012d7a3db0] [c000000000289908] SyS_faccessat+0xc8/0x2f0
  [32950.918183] [c00000012d7a3e30] [c00000000000a0fc] syscall_exit+0x0/0x7c
  [32950.919157] Instruction dump:
  [32950.919643] 7f5fd378 e94d0040 e93f0000 7ce95214 e9070008 7fc9502a e9270010 2fbe0000 
  [32950.921265] 41de0070 2fa90000 419e0068 e93f0022 <7f7e482a> 39200000 88cd02ba 992d02ba 
  [32950.923055] ---[ end trace b20dd6fbb5b21934 ]---
  [32950.939809] 
  [32950.940617] init: cgmanager main process (3550) killed by SEGV signal
  [32950.942078] init: cgmanager main process ended, respawning
  root@u10vm15:~#
  [32955.627058] Unable to handle kernel paging request for data at address 0x82cf8c206002008
  [32955.628340] Faulting instruction address: 0xc00000000035eca8
  [32955.629471] Oops: Kernel access of bad area, sig: 11 [#4]
  [32955.630336] SMP NR_CPUS=2048 NUMA pSeries
  [32955.631340] Modules linked in: pseries_rng rtc_generic ohci_pci
  [32955.633486] CPU: 1 PID: 217 Comm: jbd2/sda2-8 Tainted: G      D       3.16.0-6-generic #11-Ubuntu
  [32955.635062] task: c00000012d983f90 ti: c00000012da0c000 task.ti: c00000012da0c000
  [32955.636627] NIP: c00000000035eca8 LR: c00000000035ec84 CTR: c00000000035ec00
  [32955.638053] REGS: c00000012da0f7e0 TRAP: 0300   Tainted: G      D        (3.16.0-6-generic)
  [32955.639479] MSR: 8000000100009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24022048  XER: 00000000
  [32955.642493] CFAR: c00000000013bd44 DAR: 082cf8c206002008 DSISR: 42000000 SOFTE: 1 
  GPR00: c00000000035ec84 c00000012da0fa60 c000000001346618 c00000012d4c5a88 
  GPR04: c000000005de0000 0000000001f3a8b9 0030fc0000000000 00001df91effac68 
  GPR08: 0000000000000ccd 082bb8eb06002000 082cf8c206002000 00001b76fc8cf2f7 
  GPR12: c00000000035ec00 c00000000fe40380 6db6db6db6db6db7 0000000000000000 
  GPR16: 0000000000000000 0000000000000001 0000000000080000 0000000000000020 
  GPR20: c00000012fd6e824 0000000000000040 0000000000000008 0000000000400000 
  GPR24: 0000000000000000 0000000000000000 0000000000000000 c00000012d4c4800 
  GPR28: c00000012d4c5a88 c0000000052b4cd0 c00000012d4c5800 c0000000052b4c00 
  [32955.654852] NIP [c00000000035eca8] ext4_journal_commit_callback+0xa8/0x170
  [32955.655489] LR [c00000000035ec84] ext4_journal_commit_callback+0x84/0x170
  [32955.656137] Call Trace:
  [32955.656382] [c00000012da0fa60] [c00000000035ec84] ext4_journal_commit_callback+0x84/0x170 (unreliable)
  [32955.657670] [c00000012da0fac0] [c00000000039bb5c] jbd2_journal_commit_transaction+0x171c/0x1ea0
  [32955.658712] [c00000012da0fcf0] [c0000000003a348c] kjournald2+0xec/0x300
  [32955.659516] [c00000012da0fd80] [c0000000000cbc30] kthread+0x110/0x130
  [32955.660317] [c00000012da0fe30] [c00000000000a3e8] ret_from_kernel_thread+0x5c/0x74
  [32955.661207] Instruction dump:
  [32955.661688] 7f83e378 3b000000 48658ac9 60000000 e89f00d0 3b400000 7fbd2040 419e0070 
  [32955.663050] 60000000 60420000 e9240008 e9440000 <f92a0008> f9490000 f8840000 f8840008 
  [32955.664260] ---[ end trace b20dd6fbb5b21935 ]---
  [32955.677577]
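To get a quick overview of the oopses in console output like the above, a minimal sketch that lists each oops number, reason, signal, task name and faulting symbol. The function name `summarize_oopses` and the regular expressions are assumptions written against the line formats seen in this log; they are not a general dmesg parser.

```python
import re

# "Oops: <reason>, sig: N [#K]" header line of each oops.
OOPS_RE = re.compile(r"Oops: (?P<reason>.+?), sig: (?P<sig>\d+) \[#(?P<num>\d+)\]")
# "NIP [addr] symbol+0xoff/0xlen" line naming the faulting function.
NIP_RE = re.compile(r"NIP \[[0-9a-f]+\] (?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")
# "CPU: ... Comm: name" line naming the running task.
COMM_RE = re.compile(r"Comm: (?P<comm>\S+)")


def summarize_oopses(log_text):
    """Return one dict per oops found in log_text, in order of appearance."""
    results = []
    current = None
    for line in log_text.splitlines():
        m = OOPS_RE.search(line)
        if m:
            current = {"num": int(m["num"]), "reason": m["reason"],
                       "sig": int(m["sig"]), "comm": None, "nip": None}
            results.append(current)
            continue
        if current is None:
            continue  # ignore lines before the first oops header
        m = COMM_RE.search(line)
        if m and current["comm"] is None:
            current["comm"] = m["comm"]
            continue
        m = NIP_RE.search(line)
        if m and current["nip"] is None:
            current["nip"] = m["sym"]
    return results


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
[23001.072917] Oops: Exception in kernel mode, sig: 5 [#1]
[23001.075162] CPU: 8 PID: 3384 Comm: updatedb.mlocat Not tainted 3.16.0-6-generic #11-Ubuntu
[23001.091266] NIP [c0000000002abc68] d_instantiate+0x38/0xf0
[32950.706148] Oops: Kernel access of bad area, sig: 11 [#2]
[32950.709651] CPU: 8 PID: 342 Comm: cgmanager Tainted: G      D       3.16.0-6-generic #11-Ubuntu
[32950.742460] NIP [c000000000260290] kmem_cache_alloc+0x90/0x2d0
"""

summary = summarize_oopses(SAMPLE)
for entry in summary:
    print(entry)
```

Run against the full log, this would be expected to show the first oops in d_instantiate and the later ones in kmem_cache_alloc and ext4_journal_commit_callback, which makes it easier to see that the post-reboot traces differ from the original BUG.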

  == Comment: #1 - Kalpana Shetty <kalshett@xxxxxxxxxx> - 2014-08-05 23:54:12 ==
  Setup details:
  - MongoDB server on one VM (version: 2.6.2)
  - YCSB workload running on one VM (YCSB version - ycsb-0.1.4)

  
  [root@powerkvm5-lp1 ~]# uname -a
  Linux powerkvm5-lp1.austin.ibm.com 3.10.42-2004.pkvm2_1_1.8.ppc64 #1 SMP Fri Jul 18 11:20:03 CDT 2014 ppc64 ppc64 ppc64 GNU/Linux

  Guest OS: Ubuntu 14.10 
  [root@powerkvm5-lp1 ~]# virsh list --all
   Id    Name                           State
  ----------------------------------------------------
   3     kal_u10_ycsb                   running
   5     kal_u10_mongosrv               running

  uname on Guest OS:
  root@u10vm15:~# uname -a
  Linux u10vm15 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

  root@u10vm15:~# lscpu
  Architecture:          ppc64le
  Byte Order:            Little Endian
  CPU(s):                16
  On-line CPU(s) list:   0-15
  Thread(s) per core:    8
  Core(s) per socket:    2
  Socket(s):             1
  NUMA node(s):          1
  Model:                 IBM pSeries (emulated by qemu)
  L1d cache:             64K
  L1i cache:             32K
  NUMA node0 CPU(s):     0-15

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1354024/+subscriptions