Message #62041
[Bug 1670319] [NEW] Reschedule of failed instance doesn't happen when scheduler places two instances on the same ironic node
Public bug reported:
There is a known bug, https://bugs.launchpad.net/tripleo/+bug/1341420, caused by the design of nova's scheduling and resource claims. In short, the scheduler may place different instances on the same ironic node, and the second instance will always fail because the resource claim is made on the nova-compute side.
The second instance should be rescheduled once it fails, but that does not happen.
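The race described above can be illustrated with a small, self-contained sketch. This is not nova code: the names Node, ClaimError, schedule and claim are hypothetical, and the scheduler's view is modeled as a plain dict that is not refreshed between two requests. It only shows why both requests select the same node and why the second claim fails.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class ClaimError(Exception):
    """Raised when a node is already claimed by another instance."""


@dataclass
class Node:
    uuid: str
    claimed_by: Optional[str] = None


def schedule(scheduler_view: Dict[str, bool]) -> str:
    """Return the first node the scheduler believes is free.

    The view is only a snapshot, so it may be stale.
    """
    for uuid, free in scheduler_view.items():
        if free:
            return uuid
    raise LookupError("no free node in the scheduler's view")


def claim(node: Node, instance: str) -> None:
    """Claim the node on the compute side; fail if it is already taken."""
    if node.claimed_by is not None:
        raise ClaimError(f"{node.uuid} already claimed by {node.claimed_by}")
    node.claimed_by = instance


node = Node("c4d5e326-7ad3-4c25-bfe5-3cab211a723e")
# Scheduler snapshot: not updated between the two requests (assumption).
view = {node.uuid: True}

first_target = schedule(view)
second_target = schedule(view)  # same node, because nothing was claimed yet

claim(node, "instance-1")
try:
    claim(node, "instance-2")
    second_failed = False
except ClaimError:
    second_failed = True
```

In this model the only way for the second instance to succeed is to send it back to the scheduler after the failed claim, which is the reschedule this report says is missing.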
1. First instance is placed on c4d5e326-7ad3-4c25-bfe5-3cab211a723e
http://logs.openstack.org/71/441271/1/gate/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial/b0037ad/logs/screen-n-sch.txt.gz#_2017-03-06_09_08_08_343
2017-03-06 09:08:08.343 20337 DEBUG nova.scheduler.filter_scheduler [req-d7c167ea-4bd9-40fe-bfa6-452695a40fa9 tempest-ServersTestJSON-207710543 tempest-ServersTestJSON-207710543] Selected host: WeighedHost [host: (ubuntu-xenial-2-node-osic-cloud1-s3500-7711232-456798, c4d5e326-7ad3-4c25-bfe5-3cab211a723e) ram: 384MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0] _schedule /opt/stack/new/nova/nova/scheduler/filter_scheduler.py:126
2. Second instance is placed on c4d5e326-7ad3-4c25-bfe5-3cab211a723e
http://logs.openstack.org/71/441271/1/gate/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial/b0037ad/logs/screen-n-sch.txt.gz#_2017-03-06_09_08_08_421
2017-03-06 09:08:08.421 20337 DEBUG nova.scheduler.filter_scheduler [req-f903ab7f-7525-4567-82f7-8bf2f2b53c86 tempest-ServerActionsTestJSON-1730451988 tempest-ServerActionsTestJSON-1730451988] Selected host: WeighedHost [host: (ubuntu-xenial-2-node-osic-cloud1-s3500-7711232-456798, c4d5e326-7ad3-4c25-bfe5-3cab211a723e) ram: 384MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0] _schedule
3. nova-compute does not reschedule the failed instance
http://logs.openstack.org/71/441271/1/gate/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial/b0037ad/logs/subnode-2/screen-n-cpu.txt.gz#_2017-03-06_09_08_09_137
2017-03-06 09:08:09.137 31801 DEBUG nova.compute.manager [req-f903ab7f-7525-4567-82f7-8bf2f2b53c86 tempest-ServerActionsTestJSON-1730451988 tempest-ServerActionsTestJSON-1730451988] [instance: bef43a32-f310-4ef4-8264-c7bc064856b1] Retry info not present, will not reschedule _do_build_and_run_instance /opt/stack/new/nova/nova/compute/manager.py:1788
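The "Retry info not present, will not reschedule" message in step 3 suggests the reschedule decision depends on whether retry information accompanies the build request. The sketch below is a simplified model of that decision, not the actual code in nova/compute/manager.py: the function names, the "retry" key and the shape of its value ("num_attempts", "hosts") are assumptions made for illustration.

```python
from typing import Any, Dict


def should_reschedule(filter_properties: Dict[str, Any]) -> bool:
    """Reschedule only if the request carries retry information (assumed key)."""
    return bool(filter_properties.get("retry"))


def handle_build_failure(instance_uuid: str,
                         filter_properties: Dict[str, Any]) -> str:
    """Return the outcome for an instance whose build failed on this host."""
    if not should_reschedule(filter_properties):
        # Mirrors the logged behaviour: no retry info, so no reschedule.
        return "error"
    # With retry info present, the request would go back to the scheduler.
    return "rescheduled"


instance = "bef43a32-f310-4ef4-8264-c7bc064856b1"

# No retry info, as in the log above: the instance is not rescheduled.
outcome_without_retry = handle_build_failure(instance, {})

# Hypothetical request that does carry retry info.
outcome_with_retry = handle_build_failure(
    instance, {"retry": {"num_attempts": 1, "hosts": []}}
)
```

Under this model the failing request in the log corresponds to the first call, where the retry information is missing and the instance ends up in an error state instead of being sent back to the scheduler.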
** Affects: nova
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1670319