← Back to team overview

kernel-packages team mailing list archive

[Bug 1574814] Re: ThunderX: soft lockup in cursor_timer_handler() Edit

 

I captured a perf data file, starting just before the NMI kicked in - in
this case, it again occurred on CPU 11. I found that CPU 11 had spent a
lot of its time in cursor_timer_handler():

-   16.92%  swapper  [kernel.kallsyms]      [k] _raw_spin_unlock_irqrestore              ▒
   - _raw_spin_unlock_irqrestore                                                         ▒
      + 16.87% mod_timer                                                                 ▒
      + 0.05% cursor_timer_handler                                                       ▒
-   12.15%  swapper  [kernel.kallsyms]      [k] queue_work_on                            ▒
   - queue_work_on                                                                       ▒
      + 12.00% cursor_timer_handler                                                      ▒
      + 0.15% call_timer_fn                                                              ▒
+   10.98%  swapper  [kernel.kallsyms]      [k] run_timer_softirq                        ▒
-    2.23%  swapper  [kernel.kallsyms]      [k] mod_timer                                ▒
   - mod_timer                                                                           ▒
      + 1.97% cursor_timer_handler                                                       ▒
      + 0.26% call_timer_fn  

Looking at the profile of the other CPUs, I found one that was
interesting - CPU #12 - which appears to be the one actually running the
cursor update code CPU 11 is scheduling:

-   42.18%  kworker/u96:2  [kernel.kallsyms]  [k] ww_mutex_unlock                        ▒
   - ww_mutex_unlock                                                                     ▒
      - 40.70% ast_dirty_update                                                          ▒
           ast_imageblit                                                                 ▒
           soft_cursor                                                                   ▒
           bit_cursor                                                                    ▒
           fb_flashcursor                                                                ▒
           process_one_work                                                              ▒
           worker_thread                                                                 ▒
           kthread                                                                       ▒
           ret_from_fork                                                                 ▒
      + 1.48% ast_imageblit                                                              ▒
-   40.15%  kworker/u96:2  [kernel.kallsyms]  [k] __memcpy_toio                          ▒
   - __memcpy_toio                                                                       ▒
      + 31.54% ast_dirty_update                                                          ▒
      + 8.61% ast_imageblit  

I wonder if this path is blocking, preventing the timer handler on CPU
#11 from rescheduling. Indeed, when I time how long the handler takes
from start to finish, I am seeing occasional times around ~.1s.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1574814

Title:
  ThunderX: soft lockup in cursor_timer_handler() Edit

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I booted a Cavium ThunderX crb1s 2.0 system using the netboot mini iso via virtual media:
    http://ports.ubuntu.com/ubuntu-ports/dists/xenial/main/installer-arm64/20101020ubuntu451/images/netboot/mini.iso

  During boot I observed the following lockup on the serial console:

  [ 28.128327] usb 1-1.1: reset high-speed USB device number 3 using xhci_hcd
  [ 84.912299] NMI watchdog: BUG: soft lockup - CPU#14 stuck for 23s! [swapper/14:0]
  [ 84.922718] Modules linked in: hid_generic(E) usbhid(E) hid(E) usb_storage(E) mdio_thunder(E) nicvf(E) ast(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ttm(E) drm(E) nicpf(E) thunder_bgx(E) mdio_cavium(E)
  [ 84.922749]
  [ 84.922754] CPU: 14 PID: 0 Comm: swapper/14 Tainted: G E 4.4.0-21-generic #37-Ubuntu
  [ 84.922757] Hardware name: Cavium ThunderX CN88XX board (DT)
  [ 84.922761] task: ffff801f6c9d4100 ti: ffff801f6c9e8000 task.ti: ffff801f6c9e8000
  [ 84.922771] PC is at cursor_timer_handler+0x30/0x58
  [ 84.922775] LR is at cursor_timer_handler+0x30/0x58
  [ 84.922778] pc : [<ffff8000004ec4f0>] lr : [<ffff8000004ec4f0>] pstate: 00400145
  [ 84.922781] sp : ffff801f6c9ebc20
  [ 84.922784] x29: ffff801f6c9ebc20 x28: ffff8000f94398d8
  [ 84.922789] x27: ffff801f6c9ebd00 x26: ffff801f7b3bebb8
  [ 84.922793] x25: ffff801f6c9e8000 x24: ffff800000e5ec00
  [ 84.922798] x23: ffff801f667d9800 x22: ffff8000004ec4c0
  [ 84.922802] x21: 0000000000000100 x20: ffff8000f94398d8
  [ 84.922807] x19: ffff8000f9439800 x18: 0000ffffc76a5358
  [ 84.922811] x17: 0000ffff97bbd2a8 x16: ffff8000002a5040
  [ 84.922816] x15: 000000003e4cf1e0 x14: 0000000000000008
  [ 84.922820] x13: 0000000000000000 x12: 003d090000000000
  [ 84.922824] x11: 00000000003d0900 x10: ffff80000090f200
  [ 84.922829] x9 : 00003d0900000000 x8 : 000000000000000e
  [ 84.922833] x7 : ffff801f7b3c5008 x6 : 00000000ffffffff
  [ 84.922837] x5 : 0000000000000000 x4 : 0000000000000001
  [ 84.922842] x3 : 0000000000000000 x2 : ffff801f6c899e05
  [ 84.922846] x1 : ffff801f667d99e0 x0 : 0000000000000000
  [ 84.922850]
  [ 101.008387] usb 1-1.1: reset high-speed USB device number 3 using xhci_hcd
  [ 101.180375] usb 1-1.1: reset high-speed USB device number 3 using xhci_hcd
  [ 101.342677] random: nonblocking pool is initialized

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1574814/+subscriptions


References