yahoo-eng-team team mailing list archive

[Bug 2039940] [NEW] test_resize_volume_backed_server_confirm which fails randomly with Kernel panic

 

Public bug reported:

tempest.api.compute.servers.test_server_actions.ServerActionsTestOtherA.test_resize_volume_backed_server_confirm
fails randomly with:

Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 136, in _get_ssh_connection
    ssh.connect(self.host, port=self.port, username=self.username,
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/paramiko/client.py", line 386, in connect
    sock.connect(addr)
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/lib/decorators.py", line 106, in wrapper
    raise exc
  File "/opt/stack/tempest/tempest/lib/decorators.py", line 98, in wrapper
    return f(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 70, in wrapper
    return f(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/api/compute/servers/test_server_actions.py", line 510, in test_resize_volume_backed_server_confirm
    linux_client.validate_authentication()
  File "/opt/stack/tempest/tempest/lib/common/utils/linux/remote_client.py", line 31, in wrapper
    return function(self, *args, **kwargs)
  File "/opt/stack/tempest/tempest/lib/common/utils/linux/remote_client.py", line 123, in validate_authentication
    self.ssh_client.test_connection_auth()
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 245, in test_connection_auth
    connection = self._get_ssh_connection()
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 155, in _get_ssh_connection
    raise exceptions.SSHTimeout(host=self.host,
tempest.lib.exceptions.SSHTimeout: Connection to the 172.24.5.65 via SSH timed out.
User: cirros, Password: None

The guest console log shows a kernel panic:
info: initramfs: up at 7.36
[    8.403290] virtio_blk virtio2: [vda] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB)
[    8.440219] GPT:Primary header thinks Alt. header is not at the end of the disk.
[    8.440726] GPT:229375 != 2097151
[    8.440967] GPT:Alternate GPT header not at the end of the disk.
[    8.441252] GPT:229375 != 2097151
[    8.441503] GPT: Use GNU Parted to correct GPT errors.
[    8.974064] virtio_gpu virtio0: [drm] drm_plane_enable_fb_damage_clips() not called
[    9.068224] random: crng init done
currently loaded modules: 8021q 8139cp 8390 9pnet 9pnet_virtio ahci cec drm drm_kms_helper e1000 e1000e failover fb_sys_fops garp hid hid_generic ip6_udp_tunnel ip_tables isofs libahci libcrc32c llc mii mrp ne2k_pci net_failover nls_ascii nls_iso8859_1 nls_utf8 pcnet32 qemu_fw_cfg rc_core sctp stp syscopyarea sysfillrect sysimgblt udp_tunnel usbhid virtio_blk virtio_dma_buf virtio_gpu virtio_input virtio_net virtio_rng virtio_scsi x_tables 
info: initramfs loading root from /dev/vda1
/sbin/init: can't load library 'libtirpc.so.3'
[   11.288963] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00001000
[   11.290203] CPU: 0 PID: 1 Comm: init Not tainted 5.15.0-71-generic #78-Ubuntu
[   11.290952] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.15.0-1 04/01/2014
[   11.291870] Call Trace:
[   11.292973]  <TASK>
[   11.293458]  show_stack+0x52/0x5c
[   11.294280]  dump_stack_lvl+0x4a/0x63
[   11.294720]  dump_stack+0x10/0x16
[   11.295179]  panic+0x15c/0x334
[   11.295587]  ? exit_to_user_mode_prepare+0x37/0xb0
[   11.296118]  do_exit.cold+0x15/0xa0
[   11.296460]  __x64_sys_exit+0x1b/0x20
[   11.296880]  do_syscall_64+0x5c/0xc0
[   11.297283]  ? ksys_write+0x67/0xf0
[   11.297672]  ? exit_to_user_mode_prepare+0x37/0xb0
[   11.298172]  ? syscall_exit_to_user_mode+0x27/0x50
[   11.298683]  ? __x64_sys_write+0x19/0x20
[   11.299151]  ? do_syscall_64+0x69/0xc0
[   11.299644]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   11.300611] RIP: 0033:0x7f147a37555e
[   11.301938] Code: 05 d7 2a 00 00 4c 89 f9 bf 02 00 00 00 48 8d 35 fb 0d 00 00 48 8b 10 31 c0 e8 50 d2 ff ff bf 10 00 00 00 b8 3c 00 00 00 0f 05 <48> 8d 15 f3 2a 00 00 f7 d8 89 02 48 83 ec 20 49 8b 8c 24 b8 00 00
[   11.312012] RSP: 002b:00007fff85488500 EFLAGS: 00000207 ORIG_RAX: 000000000000003c
[   11.318360] RAX: ffffffffffffffda RBX: 00007fff854897b0 RCX: 00007f147a37555e
[   11.324215] RDX: 0000000000000002 RSI: 0000000000001000 RDI: 0000000000000010
[   11.331344] RBP: 00007fff85489790 R08: 00007f147a36e000 R09: 00007f147a36e01a
[   11.338406] R10: 0000000000000001 R11: 0000000000000207 R12: 00007f147a36f040
[   11.347090] R13: 00000000004bae50 R14: 0000000000000000 R15: 0000000000403d66
[   11.354220]  </TASK>
[   11.362227] Kernel Offset: 0x36400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   11.369248] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00001000 ]---
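
The signature in the console log above can be matched mechanically when triaging other failed runs. The following is a minimal sketch, not part of tempest: the function names are hypothetical, the patterns are copied from the log quoted in this report, and the exit-status decoding assumes the usual wait-status layout in which the exit status sits in bits 8-15 (which agrees with RDI = 0x10 in the register dump).

```python
import re

# Patterns copied from the console log quoted in this report.
MISSING_LIB = re.compile(r"can't load library '([^']+)'")
PANIC = re.compile(r"Kernel panic - not syncing: (.+)")
EXITCODE = re.compile(r"exitcode=0x([0-9a-fA-F]+)")


def classify_console_log(text):
    """Return (missing_library, panic_reason); each is None if absent."""
    lib = MISSING_LIB.search(text)
    panic = PANIC.search(text)
    return (
        lib.group(1) if lib else None,
        panic.group(1).strip() if panic else None,
    )


def exit_status_from_panic(reason):
    """Decode the exit status of init from the panic reason.

    Assumption: exitcode uses the wait-status layout, so the status
    passed to exit() is stored in bits 8-15.
    """
    match = EXITCODE.search(reason)
    if match is None:
        return None
    return (int(match.group(1), 16) >> 8) & 0xFF


sample = """\
info: initramfs loading root from /dev/vda1
/sbin/init: can't load library 'libtirpc.so.3'
[   11.288963] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00001000
"""

library, reason = classify_console_log(sample)
print(library, "|", reason, "|", exit_status_from_panic(reason))
```

Under that assumption, exitcode=0x00001000 decodes to exit status 16, so init exited on its own after the library failed to load rather than being killed by a signal.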

According to OpenSearch [1], there are 16 hits in the last 12 days across
multiple jobs on the master and stable/2023.2 branches.

Jobs:
tempest-integrated-networking 31.3%
cinder-tempest-plugin-lvm-lio-barbican-fips 18.8%
cinder-tempest-plugin-lvm-lio-barbican 12.5%
tempest-integrated-storage 12.5%
nova-ceph-multistore 6.3%

Branches:
master 93.8%
stable/2023.2 6.3%
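
The percentages above can be converted back to approximate hit counts. This is a small sketch that assumes every percentage is a share of the same 16 hits mentioned in this report; the variable and function names are illustrative only.

```python
TOTAL_HITS = 16  # from the report: 16 hits in the last 12 days

# Percentages as listed in this report.
job_pct = {
    "tempest-integrated-networking": 31.3,
    "cinder-tempest-plugin-lvm-lio-barbican-fips": 18.8,
    "cinder-tempest-plugin-lvm-lio-barbican": 12.5,
    "tempest-integrated-storage": 12.5,
    "nova-ceph-multistore": 6.3,
}
branch_pct = {"master": 93.8, "stable/2023.2": 6.3}


def to_hits(pct_by_name, total=TOTAL_HITS):
    """Convert percentage shares to whole hit counts (rounded)."""
    return {name: round(pct * total / 100) for name, pct in pct_by_name.items()}


job_hits = to_hits(job_pct)
branch_hits = to_hits(branch_pct)

# Hits that belong to jobs not shown in the list above.
unlisted_job_hits = TOTAL_HITS - sum(job_hits.values())

print(job_hits)
print(branch_hits)
print("hits in unlisted jobs:", unlisted_job_hits)
```

Under that assumption the branch split is 15 hits on master and 1 on stable/2023.2, and the five listed jobs account for 13 of the 16 hits, so the job list appears to show only the top entries.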

Found an old issue in a different test,
https://bugs.launchpad.net/tempest/+bug/1888224, but that one was
reported for arm64.

Example builds:
- http://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ff8/898137/1/check/nova-ceph-multistore/ff880e4/testr_results.html
- https://d941f4f9ec28b33784a2-4d240979ebaa10fc274be1cd05b244a9.ssl.cf5.rackcdn.com/893025/3/check/tempest-integrated-networking/7ed5444/testr_results.html


[1] https://opensearch.logs.openstack.org/_dashboards/app/discover/?security_tenant=global#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-30d,to:now))&_a=(columns:!(_source),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'94869730-aea8-11ec-9e6a-83741af3fdcd',key:filename,negate:!f,params:(query:job-output.txt),type:phrase),query:(match_phrase:(filename:job-output.txt)))),index:'94869730-aea8-11ec-9e6a-83741af3fdcd',interval:auto,query:(language:kuery,query:'message:%22load%20library%20!'libtirpc.so.3!'%22'),sort:!())

** Affects: neutron
     Importance: Undecided
         Status: New

** Affects: tempest
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2039940

Title:
  test_resize_volume_backed_server_confirm which fails randomly with
  Kernel panic

Status in neutron:
  New
Status in tempest:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2039940/+subscriptions