
yahoo-eng-team team mailing list archive

[Bug 1328367] [NEW] Do not set vm error state when raise MigrationError

 

Public bug reported:

Control Node: 101.0.0.20 (also runs the compute service, but it is not used)
Compute Node: 101.0.0.30

nova version:
2014.1.b2-847-ga891e04

In the control node's nova.conf:
allow_resize_to_same_host = True
and in the compute node's nova.conf:
allow_resize_to_same_host = False
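
The mismatch between the two files can be checked mechanically. The following is a minimal sketch, not part of Nova: it assumes the option sits in the [DEFAULT] section (the report shows no section header) and that an unset option means False. The helper names read_allow_resize and settings_mismatch are made up for illustration.

```python
import configparser


def read_allow_resize(conf_text):
    """Return allow_resize_to_same_host from a nova.conf-style string."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Assumption: the option lives in [DEFAULT]; treat "unset" as False.
    return parser.getboolean(
        "DEFAULT", "allow_resize_to_same_host", fallback=False)


def settings_mismatch(controller_text, compute_text):
    """True when the two nodes disagree about same-host resize."""
    return read_allow_resize(controller_text) != read_allow_resize(compute_text)


# The two settings quoted in this report, wrapped in an assumed section header.
CONTROLLER_CONF = "[DEFAULT]\nallow_resize_to_same_host = True\n"
COMPUTE_CONF = "[DEFAULT]\nallow_resize_to_same_host = False\n"

print(settings_mismatch(CONTROLLER_CONF, COMPUTE_CONF))
```

With the values from this report the two nodes disagree, which is the precondition for the failure described below.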

Steps to reproduce:
1. Boot an instance on the compute node:
nova boot --image 51c4a908-c028-4ce2-bbd1-8b0e15d8d829 --flavor 84 --nic net-id=308840da-6440-4599-923a-2edd290971d3 --availability-zone nova:compute.localdomain migrate_test

2. Resize it to flavor 1:
nova resize migrate_test 1

3. The resize fails and the instance is set to the ERROR state.

#nova list
+--------------------------------------+--------------+--------+-------------+-------------+-------------------+
| a1424990-182a-4bc2-8c17-aa4808a49472 | migrate_test | ERROR  | resize_prep | Running     | private=20.0.0.15 |
+--------------------------------------+--------------+--------+-------------+-------------+-------------------+

#nova show
....
| config_drive                         |                                                                                                                                                               |
| created                              | 2014-06-09T09:31:35Z                                                                                                                                          |
| fault                                | {"message": "<class 'nova.exception.MigrationError'>", "code": 500, "details": "  File \"/opt/stack/nova/nova/compute/manager.py\", line 3104, in prep_resize |
|                                      |     node)                                                                                                                                                     |
|                                      |   File \"/opt/stack/nova/nova/compute/manager.py\", line 3058, in _prep_resize                                                                                |
|                                      |     raise exception.MigrationError(msg)                                                                                                                       |
|                                      | ", "created": "2014-06-10T03:54:39Z"}                                                                                                                         |
| flavor                               | m1.micro (84)                                                                                                                                                 |
| hostId                               | f73013b029032929598a4a54586e4469c2c7cd676c147f6601f73c58
....

Error log on the compute node:

2014-06-10 11:54:48.372 ERROR nova.compute.manager [req-6a4ac25a-7d24-40c6-9f8d-435b4adb6fff admin admin] [instance: a1424990-182a-4bc2-8c17-aa4808a49472] Setting instance vm_state to ERROR
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] Traceback (most recent call last):
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]   File "/opt/stack/nova/nova/compute/manager.py", line 5231, in _error_out_instance_on_exception
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]     yield
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]   File "/opt/stack/nova/nova/compute/manager.py", line 3111, in prep_resize
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]     filter_properties)
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]   File "/opt/stack/nova/nova/compute/manager.py", line 3104, in prep_resize
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]     node)
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]   File "/opt/stack/nova/nova/compute/manager.py", line 3058, in _prep_resize
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]     raise exception.MigrationError(msg)
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] MigrationError: destination same as source!
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]

Cause of the bug:
1. nova-scheduler is allowed to schedule the resize onto the same compute node (because of the controller's nova.conf).

2. But nova-compute is not allowed to resize on the same host (because of the compute node's nova.conf).

3. As a result:
a) On the compute side, _prep_resize() sets the instance to the error state:
....
self._set_instance_error_state(context, instance['uuid'])
...
and then raises the exception.

b) The compute node tries to reschedule the resize, which fails again:
....
self._reschedule_resize_or_reraise(context, image, instance,
     exc_info, instance_type, reservations, request_spec,
     filter_properties)
...
c) The compute node stores the instance fault info:
....
compute_utils.add_instance_fault_from_exc(context, self.conductor_api,
    instance, exc_info[0], exc_info=exc_info)
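
The sequence above can be reproduced with a small toy model. This is only a sketch of the reported behaviour, not Nova's code: the Instance class, scheduler_pick_host, compute_prep_resize and resize are hypothetical stand-ins, and MigrationError here is a local class rather than nova.exception.MigrationError.

```python
class MigrationError(Exception):
    """Stand-in for nova.exception.MigrationError."""


class Instance:
    def __init__(self, host, vm_state="active"):
        self.host = host
        self.vm_state = vm_state
        self.fault = None


def scheduler_pick_host(instance, hosts, allow_same_host):
    """Scheduler side: keep the source host as a candidate only if allowed."""
    candidates = [h for h in hosts if allow_same_host or h != instance.host]
    return candidates[0] if candidates else None


def compute_prep_resize(instance, this_host, allow_same_host):
    """Compute side: reject a same-host resize when its own option is False."""
    if instance.host == this_host and not allow_same_host:
        instance.vm_state = "error"  # step a) in the list above
        raise MigrationError("destination same as source!")


def resize(instance, hosts, scheduler_allows, compute_allows):
    dest = scheduler_pick_host(instance, hosts, scheduler_allows)
    if dest is None:
        return instance  # no candidate host: nothing reaches the compute node
    try:
        compute_prep_resize(instance, dest, compute_allows)
    except MigrationError as exc:
        # Steps b) and c): there is no other host to reschedule to, so only
        # the fault is recorded and the instance stays in the error state.
        instance.fault = str(exc)
    return instance


# One usable compute host, scheduler says yes, compute says no.
inst = resize(Instance("compute.localdomain"), ["compute.localdomain"],
              scheduler_allows=True, compute_allows=False)
print(inst.vm_state, "|", inst.fault)
```

When both sides agree (both True or both False) the model never reaches the error branch, which matches the observation that the ERROR state only appears when the two nova.conf files disagree.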

Additional notes:
Whatever scheduler filters are in use, an instance should not be set to the ERROR status just because the scheduler cannot find an appropriate host for the resize.
Also, a VM in the ERROR state cannot be operated on unless its state is changed in the database.
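
One possible shape for the requested behaviour is sketched below. It is an illustration under assumptions, not a proposed patch: keep_state_on_migration_error is a hypothetical helper, the Instance class is a stand-in, and the real fix in Nova may look quite different. The idea is to restore the previous vm_state and clear the task state when a MigrationError is raised, instead of leaving the instance in ERROR.

```python
from contextlib import contextmanager


class MigrationError(Exception):
    """Stand-in for nova.exception.MigrationError."""


class Instance:
    def __init__(self, vm_state="active", task_state=None):
        self.vm_state = vm_state
        self.task_state = task_state


@contextmanager
def keep_state_on_migration_error(instance):
    """Hypothetical helper: undo state changes if the resize cannot start."""
    original_vm_state = instance.vm_state
    try:
        yield
    except MigrationError:
        # Put the instance back the way it was, then let the caller
        # see the error so it can still be reported to the user.
        instance.vm_state = original_vm_state
        instance.task_state = None
        raise


def prep_resize(instance):
    """Simulates the failing same-host check from this report."""
    instance.task_state = "resize_prep"
    raise MigrationError("destination same as source!")


instance = Instance()
try:
    with keep_state_on_migration_error(instance):
        prep_resize(instance)
except MigrationError as exc:
    print(instance.vm_state, instance.task_state, exc)
```

The exception still propagates, so the failure remains visible, but the instance stays usable without editing the database.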

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1328367

Title:
  Do not set vm error state when raise MigrationError

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1328367/+subscriptions

