yahoo-eng-team team mailing list archive

[Bug 1406093] [NEW] Error message lacks when compute node management IP differs with its config address.

 

Public bug reported:

Summary: 
When the management IP address of a compute node differs from the "my_ip" setting in its nova.conf, nova-conductor prefers the "my_ip" value and stores it in the "host_ip" column of the "compute_nodes" table in the nova database. This breaks the compute node, but the failure is hard to notice: "nova service-list" reports the node as healthy, nothing points to the cause, and even the log says nothing meaningful.

Scenario: 
This problem occurs easily when you duplicate a compute node from a snapshot.

Example:
1) We have two compute nodes named ly-compute1 and ly-compute2. The "my_ip" of ly-compute1 is misconfigured: it is set to 10.0.0.35, an address that does not exist.
*********************************************************************
ly-compute1 (Compute Node 1)
*********************************************************************
Management IP:
Ethernet(eth1)
IP Address: 10.0.0.31

nova.conf:
[DEFAULT]
my_ip = 10.0.0.35
*********************************************************************
ly-compute2 (Compute Node 2)
*********************************************************************
Management IP:
Ethernet(eth1)
IP Address: 10.0.0.32

nova.conf:
[DEFAULT]
my_ip = 10.0.0.32
*********************************************************************

2) Here is part of the "compute_nodes" table in the nova database, which shows that nova has recorded the wrong "my_ip" value for ly-compute1.
*********************************************************************
nova database:
hypervisor_hostname   deleted   host_ip
ly-compute1           0         10.0.0.35
ly-compute2           0         10.0.0.32
*********************************************************************

3) However, "nova service-list" reports that everything is OK, as shown
below. The dashboard fails all VNC connections to VMs running on
ly-compute1, but nothing tells me what is going on; nova-compute.log
on ly-compute1 does not say anything meaningful either.

C:\Windows\system32>nova service-list
+----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
| 1  | nova-cert        | ly-controller | internal | enabled | up    | 2014-12-28T12:08:47.000000 | -               |
| 2  | nova-consoleauth | ly-controller | internal | enabled | up    | 2014-12-28T12:08:53.000000 | -               |
| 3  | nova-scheduler   | ly-controller | internal | enabled | up    | 2014-12-28T12:08:53.000000 | -               |
| 4  | nova-conductor   | ly-controller | internal | enabled | up    | 2014-12-28T12:08:52.000000 | -               |
| 5  | nova-compute     | ly-compute1   | nova     | enabled | up    | 2014-12-28T12:08:52.000000 | None            |
| 6  | nova-network     | ly-compute1   | internal | enabled | up    | 2014-12-28T12:08:52.000000 | -               |
| 7  | nova-network     | ly-compute2   | internal | enabled | up    | 2014-12-28T12:08:53.000000 | -               |
| 8  | nova-compute     | ly-compute2   | nova     | enabled | up    | 2014-12-28T12:08:47.000000 | None            |
+----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+

Request:
I think a clearer error message should be presented for this misconfiguration, for example by adding a check that the management IP of a compute node matches its "my_ip", or by marking the nova-compute service as down, so that the admin can locate the fault with a command such as "nova service-list".
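
The kind of check requested above could look roughly like the following. This is only a sketch using the Python standard library, not nova code: the helper names is_local_address and check_my_ip are hypothetical, and it assumes that "my_ip is usable" can be approximated by "the address can be bound on this host".

```python
import configparser
import ipaddress
import socket


def is_local_address(ip):
    """Return True if `ip` can be bound on this host (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    family = socket.AF_INET6 if addr.version == 6 else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            # Port 0 lets the OS pick any free port; binding normally fails
            # when the address is not assigned to a local interface.
            sock.bind((str(addr), 0))
        except OSError:
            return False
    return True


def check_my_ip(conf_text, is_local=is_local_address):
    """Return an error message if my_ip is not a local address, else None."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    my_ip = parser.defaults().get("my_ip")  # options under [DEFAULT]
    if my_ip is None:
        return None  # option not set; nothing to compare
    if not is_local(my_ip):
        return "my_ip = %s is not assigned to any interface on this host" % my_ip
    return None
```

On ly-compute1, passing the contents of nova.conf to check_my_ip would be expected to return a message naming 10.0.0.35, which could then be logged as an error at service startup. Note that the bind test can give a false "OK" on Linux hosts where net.ipv4.ip_nonlocal_bind is enabled, so a real fix would probably need to enumerate interface addresses instead.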

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: conductor db

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1406093

Title:
  Error message lacks when compute node management IP differs with its
  config address.

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1406093/+subscriptions

