
kernel-packages team mailing list archive

[Bug 1467955] Re: Precise BUG: soft lockup in flush_tlb_others_ipi

 

** Description changed:

- The following stack trace (with kernel dump) was brought to me:
+ The following stack trace (with kernel dump) was brought to me. This
+ crash appears to happen at least once a day in a KVM + Ceph backend
+ environment.
  
  """
- [1796904.032010] BUG: soft lockup - CPU#0 stuck for 23s! [java:6383] 
- [1796904.036004] Modules linked in: isofs psmouse virtio_balloon serio_raw acpiphp floppy 
- [1796904.036004] CPU 0 
- [1796904.036004] Modules linked in: isofs psmouse virtio_balloon serio_raw acpiphp floppy 
- [1796904.036004] 
- [1796904.036004] Pid: 6383, comm: java Not tainted 3.2.0-76-virtual #111-Ubuntu OpenStack Foundation OpenStack Nova 
- [1796904.036004] RIP: 0010:[<ffffffff81046922>] [<ffffffff81046922>] flush_tlb_others_ipi+0x122/0x130 
- [1796904.036004] RSP: 0018:ffff880065791d58 EFLAGS: 00000202 
- [1796904.036004] RAX: 0000000000000002 RBX: ffffea0003470bf0 RCX: 0000000000000002 
- [1796904.036004] RDX: 0000000000000002 RSI: 0000000000000040 RDI: 0000000000000296 
- [1796904.036004] RBP: ffff880065791d88 R08: ffffffff81e0c0a0 R09: 0000000000000040 
- [1796904.036004] R10: ffffea0003471240 R11: 0000000000000000 R12: ffff880065791e20 
- [1796904.036004] R13: ffff880059e96f20 R14: ffff880116249848 R15: 00ff880065791d78 
- [1796904.036004] FS: 00007f83612d2700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000 
- [1796904.036004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
- [1796904.036004] CR2: 00007f83be381420 CR3: 0000000118be0000 CR4: 00000000000006f0 
- [1796904.036004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 
- [1796904.036004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 
- [1796917.981999] Process java (pid: 6383, threadinfo ffff880065790000, task ffff880053c0dbc0) 
- [1796917.981999] Stack: 
- [1796917.981999] 00007f83612ccfff ffff880059e96f20 ffff880116200e00 ffff8801162010d0 
- [1796917.981999] 00007f83612cd000 ffff880116200e00 ffff880065791d98 ffffffff81046aae 
- [1796917.981999] ffff880065791db8 ffffffff81046b7b 00007f83611d5000 ffff880065791e20 
- [1796917.981999] Call Trace: 
- [1796917.982394] ata2: lost interrupt (Status 0x58) 
- [1796917.981999] [<ffffffff81046aae>] native_flush_tlb_others+0xe/0x10 
- [1796917.981999] [<ffffffff81046b7b>] flush_tlb_mm+0x5b/0xa0 
- [1796917.981999] [<ffffffff8113ba06>] tlb_flush_mmu+0x46/0x90 
- [1796917.981999] [<ffffffff8113ba64>] tlb_finish_mmu+0x14/0x40 
- [1796917.981999] [<ffffffff8113e3a7>] zap_page_range+0xb7/0xd0 
- [1796917.981999] [<ffffffff8113a85d>] madvise_vma+0xfd/0x140 
- [1796917.981999] [<ffffffff8107b917>] ? __set_task_blocked+0x37/0x80 
- [1796917.981999] [<ffffffff81095b27>] ? getnstimeofday+0x57/0xe0 
- [1796917.981999] [<ffffffff8113aa7e>] sys_madvise+0x1de/0x280 
- [1796917.981999] [<ffffffff81666b82>] system_call_fastpath+0x16/0x1b 
- [1796917.981999] Code: 41 8d b6 cf 00 00 00 49 8d 7d 18 ff 90 d0 00 00 00 49 83 bc 24 98 c0 e0 81 00 0f 84 74 ff ff ff 66 0f 1f 84 00 00 00 00 00 f3 90 <49> 83 7d 18 00 75 f7 e9 5d ff ff ff 66 90 55 48 89 e5 66 66 66 
- [1796917.981999] Call Trace: 
- [1796917.981999] [<ffffffff81046aae>] native_flush_tlb_others+0xe/0x10 
- [1796917.981999] [<ffffffff81046b7b>] flush_tlb_mm+0x5b/0xa0 
- [1796917.981999] [<ffffffff8113ba06>] tlb_flush_mmu+0x46/0x90 
- [1796917.981999] [<ffffffff8113ba64>] tlb_finish_mmu+0x14/0x40 
- [1796917.981999] [<ffffffff8113e3a7>] zap_page_range+0xb7/0xd0 
- [1796917.981999] [<ffffffff8113a85d>] madvise_vma+0xfd/0x140 
- [1796917.981999] [<ffffffff8107b917>] ? __set_task_blocked+0x37/0x80 
- [1796917.981999] [<ffffffff81095b27>] ? getnstimeofday+0x57/0xe0 
- [1796917.981999] [<ffffffff8113aa7e>] sys_madvise+0x1de/0x280 
- [1796917.981999] [<ffffffff81666b82>] system_call_fastpath+0x16/0x1b 
+ [1796904.032010] BUG: soft lockup - CPU#0 stuck for 23s! [java:6383]
+ [1796904.036004] Modules linked in: isofs psmouse virtio_balloon serio_raw acpiphp floppy
+ [1796904.036004] CPU 0
+ [1796904.036004] Modules linked in: isofs psmouse virtio_balloon serio_raw acpiphp floppy
+ [1796904.036004]
+ [1796904.036004] Pid: 6383, comm: java Not tainted 3.2.0-76-virtual #111-Ubuntu OpenStack Foundation OpenStack Nova
+ [1796904.036004] RIP: 0010:[<ffffffff81046922>] [<ffffffff81046922>] flush_tlb_others_ipi+0x122/0x130
+ [1796904.036004] RSP: 0018:ffff880065791d58 EFLAGS: 00000202
+ [1796904.036004] RAX: 0000000000000002 RBX: ffffea0003470bf0 RCX: 0000000000000002
+ [1796904.036004] RDX: 0000000000000002 RSI: 0000000000000040 RDI: 0000000000000296
+ [1796904.036004] RBP: ffff880065791d88 R08: ffffffff81e0c0a0 R09: 0000000000000040
+ [1796904.036004] R10: ffffea0003471240 R11: 0000000000000000 R12: ffff880065791e20
+ [1796904.036004] R13: ffff880059e96f20 R14: ffff880116249848 R15: 00ff880065791d78
+ [1796904.036004] FS: 00007f83612d2700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
+ [1796904.036004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+ [1796904.036004] CR2: 00007f83be381420 CR3: 0000000118be0000 CR4: 00000000000006f0
+ [1796904.036004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
+ [1796904.036004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
+ [1796917.981999] Process java (pid: 6383, threadinfo ffff880065790000, task ffff880053c0dbc0)
+ [1796917.981999] Stack:
+ [1796917.981999] 00007f83612ccfff ffff880059e96f20 ffff880116200e00 ffff8801162010d0
+ [1796917.981999] 00007f83612cd000 ffff880116200e00 ffff880065791d98 ffffffff81046aae
+ [1796917.981999] ffff880065791db8 ffffffff81046b7b 00007f83611d5000 ffff880065791e20
+ [1796917.981999] Call Trace:
+ [1796917.982394] ata2: lost interrupt (Status 0x58)
+ [1796917.981999] [<ffffffff81046aae>] native_flush_tlb_others+0xe/0x10
+ [1796917.981999] [<ffffffff81046b7b>] flush_tlb_mm+0x5b/0xa0
+ [1796917.981999] [<ffffffff8113ba06>] tlb_flush_mmu+0x46/0x90
+ [1796917.981999] [<ffffffff8113ba64>] tlb_finish_mmu+0x14/0x40
+ [1796917.981999] [<ffffffff8113e3a7>] zap_page_range+0xb7/0xd0
+ [1796917.981999] [<ffffffff8113a85d>] madvise_vma+0xfd/0x140
+ [1796917.981999] [<ffffffff8107b917>] ? __set_task_blocked+0x37/0x80
+ [1796917.981999] [<ffffffff81095b27>] ? getnstimeofday+0x57/0xe0
+ [1796917.981999] [<ffffffff8113aa7e>] sys_madvise+0x1de/0x280
+ [1796917.981999] [<ffffffff81666b82>] system_call_fastpath+0x16/0x1b
+ [1796917.981999] Code: 41 8d b6 cf 00 00 00 49 8d 7d 18 ff 90 d0 00 00 00 49 83 bc 24 98 c0 e0 81 00 0f 84 74 ff ff ff 66 0f 1f 84 00 00 00 00 00 f3 90 <49> 83 7d 18 00 75 f7 e9 5d ff ff ff 66 90 55 48 89 e5 66 66 66
+ [1796917.981999] Call Trace:
+ [1796917.981999] [<ffffffff81046aae>] native_flush_tlb_others+0xe/0x10
+ [1796917.981999] [<ffffffff81046b7b>] flush_tlb_mm+0x5b/0xa0
+ [1796917.981999] [<ffffffff8113ba06>] tlb_flush_mmu+0x46/0x90
+ [1796917.981999] [<ffffffff8113ba64>] tlb_finish_mmu+0x14/0x40
+ [1796917.981999] [<ffffffff8113e3a7>] zap_page_range+0xb7/0xd0
+ [1796917.981999] [<ffffffff8113a85d>] madvise_vma+0xfd/0x140
+ [1796917.981999] [<ffffffff8107b917>] ? __set_task_blocked+0x37/0x80
+ [1796917.981999] [<ffffffff81095b27>] ? getnstimeofday+0x57/0xe0
+ [1796917.981999] [<ffffffff8113aa7e>] sys_madvise+0x1de/0x280
+ [1796917.981999] [<ffffffff81666b82>] system_call_fastpath+0x16/0x1b
  [1796917.992066] ata2: drained 65536 bytes to clear DRQ
  """
  
  Analysis below...

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1467955

Title:
  Precise BUG: soft lockup in flush_tlb_others_ipi

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Precise:
  In Progress

Bug description:
  The following stack trace (with kernel dump) was brought to me. This
  crash appears to happen at least once a day in a KVM + Ceph backend
  environment.

  """
  [1796904.032010] BUG: soft lockup - CPU#0 stuck for 23s! [java:6383]
  [1796904.036004] Modules linked in: isofs psmouse virtio_balloon serio_raw acpiphp floppy
  [1796904.036004] CPU 0
  [1796904.036004] Modules linked in: isofs psmouse virtio_balloon serio_raw acpiphp floppy
  [1796904.036004]
  [1796904.036004] Pid: 6383, comm: java Not tainted 3.2.0-76-virtual #111-Ubuntu OpenStack Foundation OpenStack Nova
  [1796904.036004] RIP: 0010:[<ffffffff81046922>] [<ffffffff81046922>] flush_tlb_others_ipi+0x122/0x130
  [1796904.036004] RSP: 0018:ffff880065791d58 EFLAGS: 00000202
  [1796904.036004] RAX: 0000000000000002 RBX: ffffea0003470bf0 RCX: 0000000000000002
  [1796904.036004] RDX: 0000000000000002 RSI: 0000000000000040 RDI: 0000000000000296
  [1796904.036004] RBP: ffff880065791d88 R08: ffffffff81e0c0a0 R09: 0000000000000040
  [1796904.036004] R10: ffffea0003471240 R11: 0000000000000000 R12: ffff880065791e20
  [1796904.036004] R13: ffff880059e96f20 R14: ffff880116249848 R15: 00ff880065791d78
  [1796904.036004] FS: 00007f83612d2700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
  [1796904.036004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [1796904.036004] CR2: 00007f83be381420 CR3: 0000000118be0000 CR4: 00000000000006f0
  [1796904.036004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [1796904.036004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  [1796917.981999] Process java (pid: 6383, threadinfo ffff880065790000, task ffff880053c0dbc0)
  [1796917.981999] Stack:
  [1796917.981999] 00007f83612ccfff ffff880059e96f20 ffff880116200e00 ffff8801162010d0
  [1796917.981999] 00007f83612cd000 ffff880116200e00 ffff880065791d98 ffffffff81046aae
  [1796917.981999] ffff880065791db8 ffffffff81046b7b 00007f83611d5000 ffff880065791e20
  [1796917.981999] Call Trace:
  [1796917.982394] ata2: lost interrupt (Status 0x58)
  [1796917.981999] [<ffffffff81046aae>] native_flush_tlb_others+0xe/0x10
  [1796917.981999] [<ffffffff81046b7b>] flush_tlb_mm+0x5b/0xa0
  [1796917.981999] [<ffffffff8113ba06>] tlb_flush_mmu+0x46/0x90
  [1796917.981999] [<ffffffff8113ba64>] tlb_finish_mmu+0x14/0x40
  [1796917.981999] [<ffffffff8113e3a7>] zap_page_range+0xb7/0xd0
  [1796917.981999] [<ffffffff8113a85d>] madvise_vma+0xfd/0x140
  [1796917.981999] [<ffffffff8107b917>] ? __set_task_blocked+0x37/0x80
  [1796917.981999] [<ffffffff81095b27>] ? getnstimeofday+0x57/0xe0
  [1796917.981999] [<ffffffff8113aa7e>] sys_madvise+0x1de/0x280
  [1796917.981999] [<ffffffff81666b82>] system_call_fastpath+0x16/0x1b
  [1796917.981999] Code: 41 8d b6 cf 00 00 00 49 8d 7d 18 ff 90 d0 00 00 00 49 83 bc 24 98 c0 e0 81 00 0f 84 74 ff ff ff 66 0f 1f 84 00 00 00 00 00 f3 90 <49> 83 7d 18 00 75 f7 e9 5d ff ff ff 66 90 55 48 89 e5 66 66 66
  [1796917.981999] Call Trace:
  [1796917.981999] [<ffffffff81046aae>] native_flush_tlb_others+0xe/0x10
  [1796917.981999] [<ffffffff81046b7b>] flush_tlb_mm+0x5b/0xa0
  [1796917.981999] [<ffffffff8113ba06>] tlb_flush_mmu+0x46/0x90
  [1796917.981999] [<ffffffff8113ba64>] tlb_finish_mmu+0x14/0x40
  [1796917.981999] [<ffffffff8113e3a7>] zap_page_range+0xb7/0xd0
  [1796917.981999] [<ffffffff8113a85d>] madvise_vma+0xfd/0x140
  [1796917.981999] [<ffffffff8107b917>] ? __set_task_blocked+0x37/0x80
  [1796917.981999] [<ffffffff81095b27>] ? getnstimeofday+0x57/0xe0
  [1796917.981999] [<ffffffff8113aa7e>] sys_madvise+0x1de/0x280
  [1796917.981999] [<ffffffff81666b82>] system_call_fastpath+0x16/0x1b
  [1796917.992066] ata2: drained 65536 bytes to clear DRQ
  """
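
For anyone comparing several reports of this kind, the key fields of the
trace above (stuck CPU, duration, task, RIP symbol, and the reliable
call-trace frames) can be pulled out mechanically. What follows is only a
minimal sketch, not part of the original report: the helper name
parse_soft_lockup and both regular expressions are assumptions made for
illustration, written against the 3.2-era log layout quoted above. It skips
frames marked with "?" (which the kernel prints for unreliable entries) and
stops at the "Code:" line, on the assumption that the second "Call Trace:"
merely repeats the first, as it does in this log.

```python
import re

# Both patterns assume the 3.2-era x86_64 layout quoted in this report.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]{16}>\]\s+(?P<guess>\? )?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_soft_lockup(log_text):
    """Hypothetical helper: summarize the first soft-lockup report in log_text."""
    info = {"cpu": None, "secs": None, "comm": None, "pid": None,
            "rip": None, "frames": []}
    for line in log_text.splitlines():
        header = LOCKUP_RE.search(line)
        if header:
            info["cpu"] = int(header.group("cpu"))
            info["secs"] = int(header.group("secs"))
            info["comm"] = header.group("comm")
            info["pid"] = int(header.group("pid"))
            continue
        # Assumption: everything after "Code:" is a repeat of the first trace.
        if " Code: " in line:
            break
        frame = FRAME_RE.search(line)
        if frame is None:
            continue  # registers, stack words, interleaved driver messages
        if "RIP:" in line:
            info["rip"] = frame.group("sym")
        elif frame.group("guess") is None:
            # Keep only frames the kernel did not flag with "?".
            info["frames"].append(frame.group("sym"))
    return info
```

Feeding the log above to this helper would be expected to report
flush_tlb_others_ipi as the RIP symbol, with the madvise path
(sys_madvise down to native_flush_tlb_others) as the reliable frames.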

  Analysis below...

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1467955/+subscriptions

