← Back to team overview

canonical-ubuntu-qa team mailing list archive

[Bug 2026722] Re: cpuset_hotplug in ubuntu_ltp_controllers hang with X-aws VM instances

 

MR to fail this directly on this ESM kernel:
https://code.launchpad.net/~canonical-kernel-team/+git/autotest-client-tests/+merge/446583

-- 
You received this bug notification because you are a member of Canonical
Platform QA Team, which is subscribed to ubuntu-kernel-tests.
https://bugs.launchpad.net/bugs/2026722

Title:
  cpuset_hotplug in ubuntu_ltp_controllers hang with X-aws VM instances

Status in ubuntu-kernel-tests:
  New

Bug description:
  Issue found on X-aws 4.4.0-1158.173

  The cpuset_hotplug test will timeout with the 45 minutes on VMs
  (passed on AWS bare-metals) because of task hang.

  Test output:
  <<<test_start>>>
  tag=cpuset_hotplug stime=1688982088
  cmdline="cpuset_hotplug_test.sh"
  contacts=""
  analysis=exit
  <<<test_output>>>
  cpuset_hotplug 1 TINFO: CPUs are numbered continuously starting at 0 (0-1)
  cpuset_hotplug 1 TINFO: Nodes are numbered continuously starting at 0 (0)
  cpuset_hotplug 1 TPASS: Cpuset vs CPU hotplug test succeeded.
  cpuset_hotplug 3 TPASS: Cpuset vs CPU hotplug test succeeded.
  cpuset_hotplug 5 TPASS: Cpuset vs CPU hotplug test succeeded.

  This issue can be reproduced with X-aws-1128, dmesg output:
  [ 1083.816153] LTP: starting cpuset_hotplug (cpuset_hotplug_test.sh)
  [ 1083.948273] Broke affinity for irq 4
  [ 1083.948396] Broke affinity for irq 61
  [ 1083.948402] Broke affinity for irq 62
  [ 1083.948408] Broke affinity for irq 63
  [ 1083.948413] Broke affinity for irq 64
  [ 1083.949824] smpboot: CPU 1 is now offline
  [ 1085.000502] installing Xen timer for CPU 1
  [ 1085.020261] x86: Booting SMP configuration:
  [ 1085.020263] smpboot: Booting Node 0 Processor 1 APIC 0x1
  [ 1085.021673] Skipped synchronization checks as TSC is reliable.
  [ 1085.021693] cpu 1 spinlock event irq 59
  [ 1085.069370] smpboot: CPU 1 is now offline
  [ 1085.081932] installing Xen timer for CPU 1
  [ 1085.100218] x86: Booting SMP configuration:
  [ 1085.100220] smpboot: Booting Node 0 Processor 1 APIC 0x1
  [ 1085.101588] Skipped synchronization checks as TSC is reliable.
  [ 1085.101608] cpu 1 spinlock event irq 59
  [ 1086.405418] smpboot: CPU 1 is now offline
  [ 1087.467625] installing Xen timer for CPU 1
  [ 1087.484259] x86: Booting SMP configuration:
  [ 1087.484261] smpboot: Booting Node 0 Processor 1 APIC 0x1
  [ 1087.485700] Skipped synchronization checks as TSC is reliable.
  [ 1087.485721] cpu 1 spinlock event irq 59
  [ 1320.308124] INFO: task systemd:1 blocked for more than 120 seconds.
  [ 1320.311249]       Not tainted 4.4.0-1128-aws #142-Ubuntu
  [ 1320.313959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1320.318032] systemd         D ffff8800ea4f7d28     0     1      0 0x00000000
  [ 1320.318037]  ffff8800ea4f7d28 ffff8800372adb40 ffffffff81e13500 ffff8800ea4f8000
  [ 1320.318042]  ffff8800ea4f8000 ffffffff81e605e4 ffff8800ea4f8000 00000000ffffffff
  [ 1320.318045]  ffffffff81e605e8 ffff8800ea4f7d40 ffffffff81844555 ffffffff81e605e0
  [ 1320.318048] Call Trace:
  [ 1320.318057]  [<ffffffff81844555>] schedule+0x35/0x80
  [ 1320.318061]  [<ffffffff818448ae>] schedule_preempt_disabled+0xe/0x10
  [ 1320.318065]  [<ffffffff81846767>] __mutex_lock_slowpath+0xb7/0x130
  [ 1320.318068]  [<ffffffff818467ff>] mutex_lock+0x1f/0x30
  [ 1320.318074]  [<ffffffff8111f1bc>] proc_cgroup_show+0x4c/0x2e0
  [ 1320.318078]  [<ffffffff81289ac5>] proc_single_show+0x55/0x90
  [ 1320.318083]  [<ffffffff8123e662>] seq_read+0xf2/0x3d0
  [ 1320.318088]  [<ffffffff81354d86>] ? security_file_permission+0xa6/0xc0
  [ 1320.318093]  [<ffffffff8121820b>] __vfs_read+0x1b/0x40
  [ 1320.318096]  [<ffffffff81218986>] vfs_read+0x86/0x130
  [ 1320.318100]  [<ffffffff812196dc>] SyS_read+0x5c/0xe0
  [ 1320.318104]  [<ffffffff8184905b>] entry_SYSCALL_64_fastpath+0x22/0xd0
  [ 1320.318118] INFO: task kworker/0:3:473 blocked for more than 120 seconds.
  [ 1320.321669]       Not tainted 4.4.0-1128-aws #142-Ubuntu
  [ 1320.324364] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1320.328026] kworker/0:3     D ffff8800364cfc68     0   473      2 0x00000000
  [ 1320.328031] Workqueue: cgroup_destroy css_killed_work_fn
  [ 1320.328033]  ffff8800364cfc68 ffff8800d3ff4cc0 ffff880036d94600 ffff880036f9f000
  [ 1320.328035]  ffff8800364d0000 ffffffff81e42fa4 ffff880036f9f000 00000000ffffffff
  [ 1320.328037]  ffffffff81e42fa8 ffff8800364cfc80 ffffffff81844555 ffffffff81e42fa0
  [ 1320.328039] Call Trace:
  [ 1320.328043]  [<ffffffff81844555>] schedule+0x35/0x80
  [ 1320.328045]  [<ffffffff818448ae>] schedule_preempt_disabled+0xe/0x10
  [ 1320.328047]  [<ffffffff81846767>] __mutex_lock_slowpath+0xb7/0x130
  [ 1320.328052]  [<ffffffff818467ff>] mutex_lock+0x1f/0x30
  [ 1320.328072]  [<ffffffff8108474c>] get_online_cpus+0x2c/0x50
  [ 1320.328078]  [<ffffffff81033ace>] arch_jump_label_transform+0x2e/0x120
  [ 1320.328081]  [<ffffffff81844011>] ? __schedule+0x301/0x810
  [ 1320.328085]  [<ffffffff81193a96>] __jump_label_update+0x76/0x90
  [ 1320.328088]  [<ffffffff81193b38>] jump_label_update+0x88/0x90
  [ 1320.328092]  [<ffffffff81193de1>] __static_key_slow_dec+0x41/0x90
  [ 1320.328095]  [<ffffffff81193e52>] static_key_slow_dec+0x22/0x60
  [ 1320.328098]  [<ffffffff8112263f>] cpuset_css_offline+0x3f/0x70
  [ 1320.328102]  [<ffffffff811195ac>] css_killed_work_fn+0x3c/0xd0
  [ 1320.328106]  [<ffffffff8109ec7b>] process_one_work+0x16b/0x4e0
  [ 1320.328109]  [<ffffffff8109f03e>] worker_thread+0x4e/0x580
  [ 1320.328116]  [<ffffffff8109eff0>] ? process_one_work+0x4e0/0x4e0
  [ 1320.328119]  [<ffffffff810a55e7>] kthread+0xe7/0x100
  [ 1320.328122]  [<ffffffff81844011>] ? __schedule+0x301/0x810
  [ 1320.328128]  [<ffffffff810a5500>] ? kthread_create_on_node+0x1e0/0x1e0
  [ 1320.328131]  [<ffffffff818494e5>] ret_from_fork+0x55/0x80
  [ 1320.328134]  [<ffffffff810a5500>] ? kthread_create_on_node+0x1e0/0x1e0
  [ 1320.328142] INFO: task irqbalance:1189 blocked for more than 120 seconds.
  [ 1320.331605]       Not tainted 4.4.0-1128-aws #142-Ubuntu
  [ 1320.334291] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1320.338224] irqbalance      D ffff8800e8b07ce8     0  1189      1 0x00000000
  [ 1320.338228]  ffff8800e8b07ce8 ffff8800e8b07dd0 ffffffff81e13500 ffff8800369dc600
  [ 1320.338232]  ffff8800e8b08000 ffff8800eb64f7cc ffff8800369dc600 00000000ffffffff
  [ 1320.338235]  ffff8800eb64f7d0 ffff8800e8b07d00 ffffffff81844555 ffff8800eb64f7c8
  [ 1320.338238] Call Trace:
  [ 1320.338242]  [<ffffffff81844555>] schedule+0x35/0x80
  [ 1320.338249]  [<ffffffff818448ae>] schedule_preempt_disabled+0xe/0x10
  [ 1320.338256]  [<ffffffff81846767>] __mutex_lock_slowpath+0xb7/0x130
  [ 1320.338259]  [<ffffffff818467ff>] mutex_lock+0x1f/0x30
  [ 1320.338264]  [<ffffffff81554500>] online_show+0x20/0x60
  [ 1320.338269]  [<ffffffff81554a63>] dev_attr_show+0x23/0x60
  [ 1320.338272]  [<ffffffff818467f2>] ? mutex_lock+0x12/0x30
  [ 1320.338279]  [<ffffffff8129aeaf>] sysfs_kf_seq_show+0xbf/0x140
  [ 1320.338283]  [<ffffffff812993a9>] kernfs_seq_show+0x29/0x30
  [ 1320.338285]  [<ffffffff8123e662>] seq_read+0xf2/0x3d0
  [ 1320.338289]  [<ffffffff81299b5d>] kernfs_fop_read+0x10d/0x160
  [ 1320.338292]  [<ffffffff81354d86>] ? security_file_permission+0xa6/0xc0
  [ 1320.338296]  [<ffffffff8121820b>] __vfs_read+0x1b/0x40
  [ 1320.338299]  [<ffffffff81218986>] vfs_read+0x86/0x130
  [ 1320.338303]  [<ffffffff812196dc>] SyS_read+0x5c/0xe0
  [ 1320.338306]  [<ffffffff8184905b>] entry_SYSCALL_64_fastpath+0x22/0xd0
  [ 1320.338317] INFO: task kworker/0:2:1448 blocked for more than 120 seconds.
  [ 1320.341877]       Not tainted 4.4.0-1128-aws #142-Ubuntu
  [ 1320.344538] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1320.348296] kworker/0:2     D ffff8800bf097d38     0  1448      2 0x00000000
  [ 1320.348305] Workqueue: events vmstat_shepherd
  [ 1320.348307]  ffff8800bf097d38 ffffffff8141c925 ffffffff81e13500 ffff88008aa32a00
  [ 1320.348310]  ffff8800bf098000 ffffffff81e42fa4 ffff88008aa32a00 00000000ffffffff
  [ 1320.348313]  ffffffff81e42fa8 ffff8800bf097d50 ffffffff81844555 ffffffff81e42fa0
  [ 1320.348316] Call Trace:
  [ 1320.348321]  [<ffffffff8141c925>] ? find_next_bit+0x15/0x20
  [ 1320.348329]  [<ffffffff81844555>] schedule+0x35/0x80
  [ 1320.348332]  [<ffffffff818448ae>] schedule_preempt_disabled+0xe/0x10
  [ 1320.348336]  [<ffffffff81846767>] __mutex_lock_slowpath+0xb7/0x130
  [ 1320.348342]  [<ffffffff818467ff>] mutex_lock+0x1f/0x30
  [ 1320.348345]  [<ffffffff8108474c>] get_online_cpus+0x2c/0x50
  [ 1320.348350]  [<ffffffff811b65c1>] vmstat_shepherd+0x21/0xd0
  [ 1320.348354]  [<ffffffff8109ec7b>] process_one_work+0x16b/0x4e0
  [ 1320.348362]  [<ffffffff8109f03e>] worker_thread+0x4e/0x580
  [ 1320.348366]  [<ffffffff8109eff0>] ? process_one_work+0x4e0/0x4e0
  [ 1320.348369]  [<ffffffff810a55e7>] kthread+0xe7/0x100
  [ 1320.348372]  [<ffffffff81844011>] ? __schedule+0x301/0x810
  [ 1320.348375]  [<ffffffff810a5500>] ? kthread_create_on_node+0x1e0/0x1e0
  [ 1320.348378]  [<ffffffff818494e5>] ret_from_fork+0x55/0x80
  [ 1320.348381]  [<ffffffff810a5500>] ? kthread_create_on_node+0x1e0/0x1e0
  [ 1320.348386] INFO: task echo:1878 blocked for more than 120 seconds.
  [ 1320.351463]       Not tainted 4.4.0-1128-aws #142-Ubuntu
  [ 1320.354182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1320.358193] echo            D ffff88008aaf3c28     0  1878   1761 0x00000000
  [ 1320.358197]  ffff88008aaf3c28 000000000000001b ffffffff81e13500 ffff880036a83800
  [ 1320.358201]  ffff88008aaf4000 ffffffff81e6aec4 ffff880036a83800 00000000ffffffff
  [ 1320.358204]  ffffffff81e6aec8 ffff88008aaf3c40 ffffffff81844555 ffffffff81e6aec0
  [ 1320.358207] Call Trace:
  [ 1320.358212]  [<ffffffff81844555>] schedule+0x35/0x80
  [ 1320.358218]  [<ffffffff818448ae>] schedule_preempt_disabled+0xe/0x10
  [ 1320.358226]  [<ffffffff81846767>] __mutex_lock_slowpath+0xb7/0x130
  [ 1320.358230]  [<ffffffff810ddf79>] ? vprintk_default+0x29/0x40
  [ 1320.358233]  [<ffffffff818467ff>] mutex_lock+0x1f/0x30
  [ 1320.358239]  [<ffffffff81193d70>] static_key_slow_inc+0x70/0xa0
  [ 1320.358243]  [<ffffffff810ab19d>] sched_cpu_active+0xdd/0xf0
  [ 1320.358247]  [<ffffffff810a659c>] notifier_call_chain+0x4c/0x70
  [ 1320.358251]  [<ffffffff810a66fe>] __raw_notifier_call_chain+0xe/0x10
  [ 1320.358258]  [<ffffffff81084913>] cpu_notify+0x23/0x40
  [ 1320.358262]  [<ffffffff810850c3>] _cpu_up+0x183/0x1b0
  [ 1320.358265]  [<ffffffff810851a2>] cpu_up+0xb2/0xd0
  [ 1320.358268]  [<ffffffff8155c5ed>] cpu_subsys_online+0x3d/0x90
  [ 1320.358271]  [<ffffffff81556f12>] device_online+0x62/0x90
  [ 1320.358274]  [<ffffffff81556fc9>] online_store+0x89/0xa0
  [ 1320.358277]  [<ffffffff81553e8b>] dev_attr_store+0x1b/0x30
  [ 1320.358281]  [<ffffffff8129a89c>] sysfs_kf_write+0x3c/0x50
  [ 1320.358284]  [<ffffffff8129a403>] kernfs_fop_write+0x143/0x180
  [ 1320.358288]  [<ffffffff8121833b>] __vfs_write+0x1b/0x40
  [ 1320.358291]  [<ffffffff81218ad9>] vfs_write+0xa9/0x1a0
  [ 1320.358295]  [<ffffffff812197bc>] SyS_write+0x5c/0xe0
  [ 1320.358297]  [<ffffffff8184905b>] entry_SYSCALL_64_fastpath+0x22/0xd0

  This failure will cause the following cpuset_memory to fail. And leads
  to test interruption.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2026722/+subscriptions



References