yahoo-eng-team team mailing list archive

[Bug 2029335] Re: [centos-9-stream] jobs fails as nova-compute stuck at libvirt connect since systemd-252-16.el9

Marking as Invalid for Ironic: the bug impacted Ironic, but the fix was made in devstack.

** Changed in: ironic
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2029335

Title:
  [centos-9-stream] jobs fails as nova-compute stuck at libvirt connect
  since systemd-252-16.el9

Status in devstack:
  Fix Released
Status in Ironic:
  Invalid
Status in neutron:
  New

Bug description:
  CentOS 9-stream jobs fail as below:-
  2023-08-01 04:38:23.344538 | controller | + functions:wait_for_compute:494           :   echo 'Didn'\''t find service registered by hostname after 60 seconds'
  2023-08-01 04:38:23.344586 | controller | Didn't find service registered by hostname after 60 seconds
  2023-08-01 04:38:23.347431 | controller | + functions:wait_for_compute:495           :   openstack --os-cloud devstack-admin --os-region RegionOne compute service list
  2023-08-01 04:38:24.703419 | controller | +--------------------------------------+----------------+--------------+----------+---------+-------+----------------------------+
  2023-08-01 04:38:24.703477 | controller | | ID                                   | Binary         | Host         | Zone     | Status  | State | Updated At                 |
  2023-08-01 04:38:24.703483 | controller | +--------------------------------------+----------------+--------------+----------+---------+-------+----------------------------+
  2023-08-01 04:38:24.703488 | controller | | f00443c2-4813-4f38-b13e-8694c6cabe58 | nova-conductor | np0034822320 | internal | enabled | up    | 2023-08-01T04:38:23.000000 |
  2023-08-01 04:38:24.703492 | controller | | ed6b12c6-25ee-46a8-b3ae-2ccf523ae39e | nova-scheduler | np0034822320 | internal | enabled | up    | 2023-08-01T04:38:15.000000 |
  2023-08-01 04:38:24.703497 | controller | | de8e03bc-140f-4ee9-ba3a-dc3f11807ec6 | nova-conductor | np0034822320 | internal | enabled | up    | 2023-08-01T04:38:21.000000 |
  2023-08-01 04:38:24.703501 | controller | +--------------------------------------+----------------+--------------+----------+---------+-------+----------------------------+
  2023-08-01 04:38:24.915144 | controller | + functions:wait_for_compute:497           :   return 124
  2023-08-01 04:38:24.918022 | controller | + lib/nova:is_nova_ready:1                 :   exit_trap
  2023-08-01 04:38:24.920886 | controller | + ./stack.sh:exit_trap:550                 :   local r=124
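
  The trace above shows a poll-until-timeout check giving up after 60
  seconds and returning 124. A minimal sketch of that pattern follows; it
  is not devstack's actual wait_for_compute, and the wait_until name and
  its arguments are hypothetical.

```shell
# Hypothetical helper, not devstack code: run a check command once per
# second until it succeeds or the timeout (in seconds) has elapsed.
# Returns 0 on success and 124 on timeout, like the return code in the log.
wait_until() {
    wait_until_timeout="$1"
    shift
    wait_until_elapsed=0
    until "$@"; do
        if [ "$wait_until_elapsed" -ge "$wait_until_timeout" ]; then
            return 124
        fi
        sleep 1
        wait_until_elapsed=$((wait_until_elapsed + 1))
    done
    return 0
}
```

  With a check command that succeeds once nova-compute shows up in the
  service list, a call such as `wait_until 60 <check command>` would time
  out in the same way as the job above when nova-compute never registers.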

  This happens because nova-compute gets stuck at startup while
  connecting to libvirt (qemu:///system).

  Example logs:-
  - https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e50/periodic/opendev.org/openstack/neutron/master/neutron-ovn-tempest-ovs-master-centos-9-stream/e5086a4/controller/logs/devstacklog.txt
  - https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e50/periodic/opendev.org/openstack/neutron/master/neutron-ovn-tempest-ovs-master-centos-9-stream/e5086a4/controller/logs/screen-n-cpu.txt
  - https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e50/periodic/opendev.org/openstack/neutron/master/neutron-ovn-tempest-ovs-master-centos-9-stream/e5086a4/controller/logs/services.txt
  libvirtd status shows:- Main PID: 52485 (code=exited, status=0/SUCCESS)

  This is a regression introduced in systemd by commit
  https://github.com/systemd/systemd/commit/ff32060f2ed37b68dc26256b05e2e69013b0ecfe
  which is included in systemd-252-16.el9 in CentOS 9-stream.

  It's reverted in systemd as part of
  https://github.com/systemd/systemd/pull/28000

  A known issue has been filed for RHEL 9 to consider backporting the
  systemd revert: https://bugzilla.redhat.com/show_bug.cgi?id=2225667

  The workaround for now can be any of:-
  - Downgrade systemd to a good version, i.e. one older than systemd-252-16.el9 (e.g. systemd-252-15) - to avoid the regression
  - Restart libvirtd - to trigger respawn of processes
  - Kill dnsmasq processes - to trigger respawn of processes
  - Configure libvirtd to not use --timeout 120 so the process doesn't exit
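
  The first three workarounds can be sketched as shell helpers. This is
  only an illustration, not the change that was merged in devstack: the
  function names and the DRY_RUN switch are hypothetical, and the sketch
  assumes a host where libvirtd is managed by systemd and where dnf,
  systemctl and pkill are available. The --timeout change is left out
  because where that option is set depends on the libvirt packaging.

```shell
# Hypothetical helpers. By default they only print the command they would
# run; set DRY_RUN=0 to actually execute it (requires sudo).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Workaround: roll systemd back to the previous packaged build.
# Assumes an older build is still available in the enabled repositories.
downgrade_systemd() {
    run sudo dnf -y downgrade systemd
}

# Workaround: restart libvirtd to trigger a respawn of its processes.
restart_libvirtd() {
    run sudo systemctl restart libvirtd
}

# Workaround: kill the dnsmasq processes to trigger a respawn.
kill_dnsmasq() {
    run sudo pkill dnsmasq
}
```

  Calling, for example, `restart_libvirtd` with DRY_RUN unset prints the
  command instead of running it, which makes it easy to review before
  applying it on a CI node.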

  Builds:- https://zuul.openstack.org/builds?job_name=tempest-full-centos-9-stream&job_name=devstack-platform-centos-9-stream&job_name=neutron-ovn-tempest-ovs-master-centos-9-stream&job_name=neutron-ovn-tempest-ovs-release-fips&branch=master&skip=0

To manage notifications about this bug go to:
https://bugs.launchpad.net/devstack/+bug/2029335/+subscriptions