yahoo-eng-team team mailing list archive

[Bug 2109457] [NEW] nova compute service won't start in a hyper-converged environment when there is an existing kvm unit on a nova compute node

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Shunde Zhang <2109457@xxxxxxxxxxxxxxxxxx>
Date: Mon, 28 Apr 2025 02:45:38 -0000
Reply-to: Bug 2109457 <2109457@xxxxxxxxxxxxxxxxxx>
Sender: noreply@xxxxxxxxxxxxx

Public bug reported:

When deploying hyper-converged OpenStack 2024.1 with Juju, the nova
compute service fails to start if a control plane service unit is also
deployed on the nova compute node as a KVM VM. It starts normally if the
control plane unit is deployed as an LXD container.

The reason is that nova compute checks whether any VMs already exist on the host [1].
If Juju starts a VM (e.g. when there is a "to kvm:X" placement) and that VM is managed by Juju, nova compute does not recognise it and fails the sanity check in [1].

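A workaround is to bypass the check temporarily so that nova compute can
start for the first time. The commands below back up manager.py, raise
the threshold in the check, restart the service, and then restore the
original file.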

$ sudo cp -p /usr/lib/python3/dist-packages/nova/compute/manager.py /usr/lib/python3/dist-packages/nova/compute/manager.py.bak
$ sudo sed -i 's|if len(instances_on_hv) > 0|if len(instances_on_hv) > 999|g' /usr/lib/python3/dist-packages/nova/compute/manager.py
$ sudo systemctl restart nova-compute
$ sudo mv -f /usr/lib/python3/dist-packages/nova/compute/manager.py.bak /usr/lib/python3/dist-packages/nova/compute/manager.py
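
To make the effect of the sed expression concrete, here is a small Python sketch that applies the same substitution to one sample line. The sample line is an assumption modelled on the pattern in the sed command, not a verbatim copy of manager.py, and the helper name is hypothetical.

```python
# Sample line is an assumption based on the sed pattern, not copied from nova.
SAMPLE_LINE = "        if len(instances_on_hv) > 0:"

# The search and replacement strings used by the sed command.
OLD = "if len(instances_on_hv) > 0"
NEW = "if len(instances_on_hv) > 999"


def apply_workaround(line: str) -> str:
    """Return the line with the instance threshold raised from 0 to 999."""
    return line.replace(OLD, NEW)


patched = apply_workaround(SAMPLE_LINE)
print(patched)
```

With the threshold at 999, a host that runs only a few Juju-managed VMs passes the check. Restoring the backup afterwards means the change does not persist on disk.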

This check only runs for new nova compute services. Once the nova
compute service has a record in the db, the check does not run again,
so there is no need to reapply this workaround on existing nova compute
nodes.
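
The behaviour described in this report can be modelled roughly as follows. This is a simplified sketch and not nova's actual code: only the name instances_on_hv comes from the sed command in the workaround, while the functions, the exception class, and the service_record_exists flag are hypothetical.

```python
class InvalidConfiguration(Exception):
    """Hypothetical error type that aborts service start-up."""


def sanity_check_new_host(instances_on_hv, service_record_exists):
    """Model of the start-up check as described in the report.

    instances_on_hv: identifiers of the VMs the hypervisor reports.
    service_record_exists: True if the compute service is already in the db.
    """
    if service_record_exists:
        # Existing services skip the check entirely.
        return
    if len(instances_on_hv) > 0:
        # A new service that finds VMs on the host refuses to start.
        raise InvalidConfiguration(
            "hypervisor has existing instances but the service is new")


def starts_ok(instances_on_hv, service_record_exists):
    """Return True if the modelled service would start."""
    try:
        sanity_check_new_host(instances_on_hv, service_record_exists)
    except InvalidConfiguration:
        return False
    return True


# New service, one Juju-managed KVM guest already on the host.
print(starts_ok(["juju-kvm-guest"], False))
# New service on an empty hypervisor.
print(starts_ok([], False))
# Service already registered, same guest present.
print(starts_ok(["juju-kvm-guest"], True))
```

In this model only the first case fails, which matches the report: the Juju-managed VM blocks the very first start, and later restarts are unaffected.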

Please check whether the code in [1] needs to be revised to support a
hyper-converged environment.

[1]
https://github.com/openstack/nova/blob/stable/2024.1/nova/compute/manager.py#L1574

** Affects: nova
Importance: Undecided
Status: New


--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2109457


Follow ups

[Bug 2109457] Re: nova compute service won't start in a hyper-converged environment when there is an existing kvm unit on a nova compute node
From: Launchpad Bug Tracker, 2025-07-14