yahoo-eng-team team mailing list archive
Message #17103
[Bug 1175483] Re: use socket.getfqdn() to replace gethostname() for param host
See discussion at https://review.openstack.org/#/c/28015/
** Changed in: nova
Status: In Progress => Invalid
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1175483
Title:
use socket.getfqdn() to replace gethostname() for param host
Status in OpenStack Compute (Nova):
Invalid
Bug description:
We have an OpenStack cluster running nova-network in multi-host
mode; the version is Folsom.
Yesterday our servers shut down because of a power failure and booted
again after the power was restored.
We then found that one compute node, ymy-r1-9, did not bring up the bridge br100, and /var/lib/nova/networks/br100.conf was empty. After I restarted nova-network, br100's IP had changed from 192.168.2.14 to 192.168.2.29.
Under normal circumstances it should not change.
We looked at
nova/db/sqlalchemy/api.py:network_get_associated_fixed_ips. It has a
parameter named host; if it is not None, the function returns all
the fixed IPs of the VMs on that host. We have 8 VMs running on that host,
but it actually returned None.
After checking the database, we found that the hostname had changed:
mysql> select * from services;
+---------------------+---------------------+------------+---------+----+---------------------+------------------+-------------+--------------+----------+-------------------+
| created_at | updated_at | deleted_at | deleted | id | host | binary | topic | report_count | disabled | availability_zone |
+---------------------+---------------------+------------+---------+----+---------------------+------------------+-------------+--------------+----------+-------------------+
| 2013-04-22 11:43:13 | 2013-04-29 09:14:59 | NULL | 0 | 1 | ymy-r1-9 | nova-consoleauth | consoleauth | 62059 | 0 | nova |
| 2013-04-22 11:43:15 | 2013-04-29 09:14:59 | NULL | 0 | 2 | ymy-r1-9 | nova-scheduler | scheduler | 62058 | 0 | nova |
| 2013-04-22 11:43:21 | 2013-04-29 09:14:59 | NULL | 0 | 3 | ymy-r1-9 | nova-network | network | 61926 | 0 | nova |
| 2013-04-22 11:43:26 | 2013-04-29 09:14:59 | NULL | 0 | 4 | ymy-r1-9 | nova-cert | cert | 62060 | 0 | nova |
| 2013-04-22 06:26:08 | 2013-05-02 07:11:14 | NULL | 0 | 5 | ymy-r1-7.production.com | nova-compute | compute | 86015 | 0 | nova |
| 2013-04-22 06:26:09 | 2013-05-02 07:11:12 | NULL | 0 | 6 | ymy-r1-7.production.com | nova-network | network | 86009 | 0 | nova |
| 2013-04-22 06:32:44 | 2013-05-02 07:11:18 | NULL | 0 | 7 | ymy-r1-8 | nova-compute | compute | 85855 | 0 | nova |
| 2013-04-22 06:32:46 | 2013-05-02 07:11:14 | NULL | 0 | 8 | ymy-r1-8 | nova-network | network | 85856 | 0 | nova |
| 2013-04-22 06:55:25 | 2013-04-29 09:14:59 | NULL | 0 | 9 | ymy-r1-9 | nova-console | console | 60956 | 0 | nova |
| 2013-04-23 07:07:42 | 2013-04-29 09:15:00 | NULL | 0 | 10 | ymy-r1-9 | nova-compute | compute | 52116 | 0 | nova |
| 2013-04-29 09:28:57 | 2013-05-02 07:11:12 | NULL | 0 | 11 | ymy-r1-9.production.com | nova-consoleauth | consoleauth | 24905 | 0 | nova |
| 2013-04-29 09:28:57 | 2013-05-02 07:11:14 | NULL | 0 | 12 | ymy-r1-9.production.com | nova-cert | cert | 24906 | 0 | nova |
| 2013-04-29 09:28:57 | 2013-05-02 07:11:13 | NULL | 0 | 13 | ymy-r1-9.production.com | nova-scheduler | scheduler | 24907 | 0 | nova |
| 2013-04-29 09:28:57 | 2013-05-02 07:11:12 | NULL | 0 | 14 | ymy-r1-9.production.com | nova-compute | compute | 24886 | 0 | nova |
| 2013-04-29 09:28:59 | 2013-05-02 07:11:13 | NULL | 0 | 15 | ymy-r1-9.production.com | nova-network | network | 24863 | 0 | nova |
+---------------------+---------------------+------------+---------+----+---------------------+------------------+-------------+--------------+----------+-------------------+
The hostname has changed from ymy-r1-9 to ymy-r1-9.production.com.
Now if we search the host by name 'ymy-r1-9':
mysql> select * from instances where host='ymy-r1-9';
....
37 rows in set (0.00 sec)
But if we search by the host name 'ymy-r1-9.production.com':
mysql> select * from instances where host='ymy-r1-9.production.com';
Empty set (0.00 sec)
That is why nova-network stopped working.
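The mismatch above can be reproduced in isolation. The following is a minimal sketch, not nova code: it assumes a simplified, hypothetical instances table with only an id and a host column, held in an in-memory SQLite database, and shows that an exact string match on host finds nothing once the node starts reporting its FQDN.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, hypothetical
    'instances' table (NOT the real nova schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, host TEXT)")
    # 37 rows recorded under the short hostname, as in the report above.
    conn.executemany(
        "INSERT INTO instances (host) VALUES (?)",
        [("ymy-r1-9",)] * 37,
    )
    conn.commit()
    return conn


def count_instances_for_host(conn, host):
    """Count rows whose host column is exactly equal to the given name."""
    row = conn.execute(
        "SELECT COUNT(*) FROM instances WHERE host = ?", (host,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    for name in ("ymy-r1-9", "ymy-r1-9.production.com"):
        print(name, count_instances_for_host(conn, name))
```

Because the comparison is an exact string match, the short name and the FQDN are treated as two unrelated hosts, even though they refer to the same machine.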
After finding the cause, we were puzzled about why the host value would
change. This parameter is defined in flags.py:
cfg.StrOpt('host',
           default=socket.gethostname(),
           help='Name of this node. This can be an opaque identifier. '
                'It is not necessarily a hostname, FQDN, or IP address. '
                'However, the node name must be valid within '
                'an AMQP key, and if using ZeroMQ, a valid '
                'hostname, FQDN, or IP address'),
It uses the socket.gethostname() method to get the hostname.
I read the docs at http://docs.python.org/2/library/socket.html, which say:
Note: gethostname() doesn’t always return the fully qualified domain name; use getfqdn()
I tried this on three servers: one returned the short hostname, and the others returned the FQDN.
So I think it should use socket.getfqdn() to avoid this problem.
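To check how a given server behaves, the two calls can be compared directly. This is a small sketch using only the standard socket module; the helper name describe_host_identity is hypothetical and chosen for illustration. The result depends on the machine's hostname and resolver configuration, so no particular output is assumed.

```python
import socket


def describe_host_identity():
    """Return what gethostname() and getfqdn() report on this machine.

    The two values may or may not match, depending on how the hostname
    and name resolution are configured on the server.
    """
    short_name = socket.gethostname()
    fqdn = socket.getfqdn()
    return {
        "gethostname": short_name,
        "getfqdn": fqdn,
        "differ": short_name != fqdn,
    }


if __name__ == "__main__":
    info = describe_host_identity()
    print("gethostname():", info["gethostname"])
    print("getfqdn():    ", info["getfqdn"])
    print("values differ:", info["differ"])
```

Running this on each node before and after a reboot would show whether the value used as the default for host is stable on that machine.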
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1175483/+subscriptions