Message #26507
[Bug 1406093] [NEW] Error message lacks when compute node management IP differs with its config address.
Public bug reported:
Summary:
When the management IP address of a compute node differs from the "my_ip" option in its nova.conf, nova-conductor prefers the "my_ip" value and stores it in the "host_ip" column of the "compute_nodes" table in the nova database. This causes the compute node to fail, but the failure is hard to notice: "nova service-list" still reports the node as up, and nothing points to the cause; even the log says nothing meaningful.
Scenario:
This problem happens easily when you duplicate a compute node from a snapshot.
Example:
1) We have two compute nodes named ly-compute1 and ly-compute2. The "my_ip" of ly-compute1 is misconfigured: it is set to 10.0.0.35, an address that does not exist.
*********************************************************************
ly-compute1 (Compute Node 1)
*********************************************************************
Management IP:
Ethernet(eth1)
IP Address: 10.0.0.31
nova.conf:
[DEFAULT]
my_ip = 10.0.0.35
*********************************************************************
ly-compute2 (Compute Node 2)
*********************************************************************
Management IP:
Ethernet(eth1)
IP Address: 10.0.0.32
nova.conf:
[DEFAULT]
my_ip = 10.0.0.32
*********************************************************************
2) Here is an excerpt of the "compute_nodes" table in the nova database, which shows that nova has recorded the wrong my_ip value for ly-compute1.
*********************************************************************
nova database:
hypervisor_hostname   deleted   host_ip
ly-compute1           0         10.0.0.35
ly-compute2           0         10.0.0.32
*********************************************************************
3) However, "nova service-list" reports that everything is fine, as shown below.
The dashboard fails all VNC connections to VMs running on ly-compute1,
but nothing tells me what is going on, and nova-compute.log
on ly-compute1 doesn't say anything meaningful either.
C:\Windows\system32>nova service-list
+----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
| 1 | nova-cert | ly-controller | internal | enabled | up | 2014-12-28T12:08:47.000000 | - |
| 2 | nova-consoleauth | ly-controller | internal | enabled | up | 2014-12-28T12:08:53.000000 | - |
| 3 | nova-scheduler | ly-controller | internal | enabled | up | 2014-12-28T12:08:53.000000 | - |
| 4 | nova-conductor | ly-controller | internal | enabled | up | 2014-12-28T12:08:52.000000 | - |
| 5 | nova-compute | ly-compute1 | nova | enabled | up | 2014-12-28T12:08:52.000000 | None |
| 6 | nova-network | ly-compute1 | internal | enabled | up | 2014-12-28T12:08:52.000000 | - |
| 7 | nova-network | ly-compute2 | internal | enabled | up | 2014-12-28T12:08:53.000000 | - |
| 8 | nova-compute | ly-compute2 | nova | enabled | up | 2014-12-28T12:08:47.000000 | None |
+----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
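Since "nova service-list" does not reveal the mismatch, one way to spot it from the outside is to compare the "host_ip" values in the database with the management addresses the nodes are supposed to have. The following is only a minimal sketch under stated assumptions: the function name find_host_ip_mismatches is hypothetical, and the rows and the expected addresses are typed in by hand from the example above rather than queried from a real deployment.

```python
import ipaddress


def find_host_ip_mismatches(db_rows, management_ips):
    """Return (hostname, recorded_ip, expected_ip) for every mismatch.

    db_rows: iterable of (hypervisor_hostname, host_ip) pairs, as they
             appear in the compute_nodes table.
    management_ips: dict mapping hostname to its real management IP.
    Both inputs are assumptions supplied by the operator.
    """
    mismatches = []
    for hostname, host_ip in db_rows:
        expected = management_ips.get(hostname)
        if expected is None:
            # No reference address known for this host; nothing to compare.
            continue
        # Compare parsed addresses so formatting differences do not matter.
        if ipaddress.ip_address(host_ip) != ipaddress.ip_address(expected):
            mismatches.append((hostname, host_ip, expected))
    return mismatches


# Values copied from the example in this report.
db_rows = [("ly-compute1", "10.0.0.35"), ("ly-compute2", "10.0.0.32")]
management_ips = {"ly-compute1": "10.0.0.31", "ly-compute2": "10.0.0.32"}

print(find_host_ip_mismatches(db_rows, management_ips))
# prints [('ly-compute1', '10.0.0.35', '10.0.0.31')]
```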
Request:
I think a clearer error message should be presented for this misconfiguration. For example, nova could check whether the management IP of a compute node matches its my_ip, or simply mark the nova-compute service as down, so that the admin can locate the fault with a command such as "nova service-list".
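To illustrate the kind of check requested here, the sketch below reads my_ip from the [DEFAULT] section of a nova.conf-style text and tests whether that address is assigned to a local interface by trying to bind a UDP socket to it. This is not nova code and not how nova validates its configuration; read_my_ip, is_local_address and check_my_ip are hypothetical names, and the sketch assumes the file can be parsed with Python's standard configparser.

```python
import configparser
import socket


def read_my_ip(conf_text):
    """Return the my_ip value from [DEFAULT], or None if it is not set."""
    # interpolation=None keeps '%' characters literal; strict=False
    # tolerates duplicated options, which real config files may contain.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.defaults().get("my_ip")


def is_local_address(ip):
    """Return True if a socket can be bound to ip on this host."""
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            # Port 0 lets the OS pick a free port; only the address matters.
            sock.bind((ip, 0))
        except OSError:
            # Typically EADDRNOTAVAIL when no interface owns the address.
            return False
    return True


def check_my_ip(conf_text):
    """Return an error message for a bad my_ip, or None if it looks fine."""
    my_ip = read_my_ip(conf_text)
    if my_ip is None:
        return "my_ip is not set in [DEFAULT]"
    if not is_local_address(my_ip):
        return "my_ip %s is not assigned to any local interface" % my_ip
    return None


# On ly-compute1 this configuration would be expected to produce an error
# message, because 10.0.0.35 is not configured on the node.
print(check_my_ip("[DEFAULT]\nmy_ip = 10.0.0.35\n"))
```

The bind test only shows that the address belongs to some local interface, not that it is the intended management interface, and a wildcard address such as 0.0.0.0 would pass it. A real check would need to take both into account.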
** Affects: nova
Importance: Undecided
Status: New
** Tags: conductor db
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1406093
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1406093/+subscriptions