canonical-ubuntu-qa team mailing list archive

[Bug 2044334] [NEW] Unable to put CPU back online on AWS x1e.xlarge instance with kernel 6.2+

 

Public bug reported:

Issue found on AWS x1e.xlarge instance with:
* M-aws 6.5.0-1011.11
* L-aws 6.2.0-1007.7 
* J-aws-6.5.0-1008.8~22.04.1
* J-aws-6.2.0-1005.5~22.04.1 

J-aws-5.15 looks OK, and I don't see this failure on other instances in
our pool.

A CPU can be taken offline, but it cannot be brought back online.

There are 4 CPUs on this instance.
$ uname -a
Linux ip-172-31-2-102 6.5.0-1011-aws #11~22.04.1-Ubuntu SMP Mon Nov 20 18:38:58 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ grep CONFIG_HOTPLUG_CPU /boot/config-6.5.0-1011-aws 
CONFIG_HOTPLUG_CPU=y
$ cat /sys/devices/system/cpu/cpu3/online
1
$ echo 0| sudo tee /sys/devices/system/cpu/cpu3/online
0
$ echo 1| sudo tee /sys/devices/system/cpu/cpu3/online
1
tee: /sys/devices/system/cpu/cpu3/online: Input/output error
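
The manual steps above could be wrapped in a small helper to check each CPU in turn. This is only a sketch: cycle_cpu and SYSFS_CPU are hypothetical names that are not part of any test suite, and writing to the real sysfs files needs root.

```shell
#!/bin/sh
# Sketch: offline one CPU through sysfs, then try to bring it back online.
# SYSFS_CPU and cycle_cpu are hypothetical names used for illustration only.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"

cycle_cpu() {
    ctl="$SYSFS_CPU/cpu$1/online"
    # Some CPUs (often cpu0) have no writable "online" file; skip those.
    if [ ! -w "$ctl" ]; then
        echo "cpu$1: no writable online file, skipped"
        return 0
    fi
    # Take the CPU offline.
    if ! echo 0 > "$ctl"; then
        echo "cpu$1: offline FAILED"
        return 1
    fi
    # Bring it back; this is the write that fails in this bug.
    if ! echo 1 > "$ctl"; then
        echo "cpu$1: online FAILED"
        return 1
    fi
    # Confirm the kernel reports the CPU as online again.
    if [ "$(cat "$ctl")" = "1" ]; then
        echo "cpu$1: OK"
    else
        echo "cpu$1: online state not restored"
        return 1
    fi
}
```

As root, something like `for n in 1 2 3; do cycle_cpu "$n"; done` would exercise the hotpluggable CPUs on this 4-CPU instance. Be aware that on an affected kernel a CPU that fails to come back stays offline.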

Output from 
# Offline cpu3 - OK
Nov 23 06:21:06 ip-172-31-2-102 kernel: [ 1124.449748] smpboot: CPU 3 is now offline
# Online cpu3 - Failed
Nov 23 06:21:14 ip-172-31-2-102 kernel: [ 1132.310197] installing Xen timer for CPU 3
Nov 23 06:21:14 ip-172-31-2-102 kernel: [ 1132.310424] smpboot: Booting Node 0 Processor 3 APIC 0x3
Nov 23 06:21:24 ip-172-31-2-102 kernel: [ 1142.312481] CPU3 failed to report alive state
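
To spot this failure in a longer log, the "failed to report alive state" message shown above can be filtered out. A minimal sketch, assuming the message format matches the line above; failed_cpus is a hypothetical helper name:

```shell
# Sketch: print the numbers of CPUs that the kernel log reports as
# "failed to report alive state". Reads log text on stdin.
# failed_cpus is a hypothetical helper name.
failed_cpus() {
    sed -n 's/.*CPU\([0-9][0-9]*\) failed to report alive state.*/\1/p' | sort -un
}
```

It could be fed from the kernel log, for example `sudo dmesg | failed_cpus` or `journalctl -k | failed_cpus`.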

This is affecting ubuntu_kernel_selftests/cpu-hotplug:cpu-on-off-test.sh
and ubuntu_ltp/cpuhotplug:cpuhotplug02, cpuhotplug03, cpuhotplug04 and
cpuhotplug06.

** Affects: ubuntu-kernel-tests
     Importance: Undecided
         Status: New


** Tags: 6.2 6.5 aws ubuntu-kernel-selftests ubuntu-ltp

-- 
You received this bug notification because you are a member of Canonical
Platform QA Team, which is subscribed to ubuntu-kernel-tests.
https://bugs.launchpad.net/bugs/2044334
