yahoo-eng-team team mailing list archive
Message #60471
[Bug 1651704] Re: Errors when starting introspection are silently ignored
Reviewed: https://review.openstack.org/418423
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=c7b01eba55e5d133ccc19451cf4727170a5dbdd0
Submitter: Jenkins
Branch: master
commit c7b01eba55e5d133ccc19451cf4727170a5dbdd0
Author: Dougal Matthews <dougal@xxxxxxxxxx>
Date: Tue Jan 10 14:35:36 2017 +0000
Fail the baremetal workflows when sending a "FAILED" message
When a Mistral workflow executes a second workflow (a sub-workflow
execution), the parent workflow can't easily determine whether the
sub-workflow failed. This is because the failure is communicated via a
Zaqar message only, and when a workflow ends by successfully sending a
Zaqar message it appears to have been successful. This problem surfaces
because parent workflows should have an "on-error" attribute, but it is
never triggered, as the sub-workflow doesn't error.
This change marks the workflow as failed if the message has the status
"FAILED". Now when a sub-workflow fails, the task that called it should
have the on-error triggered. Previously it would always go to
on-success.
Closes-Bug: #1651704
Change-Id: I60444ec692351c44753649b59b7c1d7c4b61fa8e
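The behaviour change described in the commit message above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not code from tripleo-common: the actual fix lives in the Mistral workflow definitions, and the function name and the message shape (a dict with a "status" key) are assumptions made for this example.

```python
# Illustrative sketch only; next_transition and the message layout
# are hypothetical and do not come from tripleo-common or Mistral.

def next_transition(message, fail_on_failed_status=True):
    """Return the transition a parent task would take for a sub-workflow.

    message: dict carrying the final status the sub-workflow reported.
    fail_on_failed_status: False models the old behaviour, where sending
    the message successfully was enough for the workflow to succeed.
    """
    if fail_on_failed_status and message.get("status") == "FAILED":
        # New behaviour: the sub-workflow is marked as failed, so the
        # calling task's on-error branch is triggered.
        return "on-error"
    # Old behaviour (and the success case): the task goes to on-success.
    return "on-success"


failed = {"status": "FAILED"}
print(next_transition(failed, fail_on_failed_status=False))
print(next_transition(failed))
```

With the old behaviour the failed message still leads to on-success; with the change it leads to on-error.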
** Changed in: tripleo
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1651704
Title:
Errors when starting introspection are silently ignored
Status in Ironic Inspector:
Incomplete
Status in OpenStack Compute (nova):
Invalid
Status in tripleo:
Fix Released
Status in ironic-inspector package in Ubuntu:
Invalid
Bug description:
Running tripleo using tripleo-quickstart with the minimal profile
(step_introspect: true) on the master branch, the overcloud deploy
fails with the error:
ResourceInError: resources.Controller: Went to status ERROR due to
"Message: No valid host was found. There are not enough hosts
available., Code: 500"
Looking at nova-scheduler.log, the following errors are found:
https://ci.centos.org/artifacts/rdo/jenkins-tripleo-quickstart-promote-master-delorean-minimal-806/undercloud/var/log/nova/nova-scheduler.log.gz
2016-12-21 06:45:56.822 17759 DEBUG nova.scheduler.host_manager [req-f889dbc0-1096-4f92-80fc-3c5bdcb1ad29 4f103e0230074c2488b7359bc079d323 f21dbfb3b2c840059ec2a0bba03b7385 - - -] Update host state from compute node: ComputeNode(cpu_allocation_ratio=16.0,cpu_info='',created_at=2016-12-21T06:38:28Z,current_workload=0,deleted=False,deleted_at=None,disk_allocation_ratio=1.0,disk_available_least=0,free_disk_gb=0,free_ram_mb=0,host='undercloud',host_ip=192.168.23.46,hypervisor_hostname='c6f8f4ba-9c7c-4c87-b95a-67a5861b7bec',hypervisor_type='ironic',hypervisor_version=1,id=2,local_gb=0,local_gb_used=0,memory_mb=0,memory_mb_used=0,metrics='[]',numa_topology=None,pci_device_pools=PciDevicePoolList,ram_allocation_ratio=1.0,running_vms=0,service_id=None,stats={boot_option='local',cpu_aes='true',cpu_arch='x86_64',cpu_hugepages='true',cpu_hugepages_1g='true',cpu_vt='true',profile='control'},supported_hv_specs=[HVSpec],updated_at=2016-12-21T06:45:38Z,uuid=ac2742da-39fb-4ca4-9f78-8e04f703c7a6,vcpus=0,vcpus_used=0) _locked_update /usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py:168
2016-12-21 06:47:48.893 17759 DEBUG nova.scheduler.filters.ram_filter [req-2aece1c8-6d3e-457b-92d7-a3177680c82e 4f103e0230074c2488b7359bc079d323 f21dbfb3b2c840059ec2a0bba03b7385 - - -] (undercloud, c6f8f4ba-9c7c-4c87-b95a-67a5861b7bec) ram: 0MB disk: 0MB io_ops: 0 instances: 0 does not have 8192 MB usable ram before overcommit, it only has 0 MB. host_passes /usr/lib/python2.7/site-packages/nova/scheduler/filters/ram_filter.py:45
2016-12-21 06:47:48.894 17759 INFO nova.filters [req-2aece1c8-6d3e-457b-92d7-a3177680c82e 4f103e0230074c2488b7359bc079d323 f21dbfb3b2c840059ec2a0bba03b7385 - - -] Filter RamFilter returned 0 hosts
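To make the log excerpt above easier to follow, here is a rough Python sketch of the check that rejected the host. It is a simplification written for this thread, not nova's actual RamFilter code: the function name is hypothetical, and it only models the "usable ram before overcommit" comparison named in the DEBUG message, using the values from the log (memory_mb=0 reported for the node, 8192 MB requested).

```python
# Simplified illustration; host_has_enough_ram is a hypothetical name
# and this is not the implementation in nova/scheduler/filters/ram_filter.py.

def host_has_enough_ram(total_usable_ram_mb, requested_ram_mb):
    """Return True if the host's total RAM, before overcommit, covers the request."""
    return total_usable_ram_mb >= requested_ram_mb


# Values taken from the log: the node reports memory_mb=0 and the
# requested flavor needs 8192 MB.
reported_memory_mb = 0
requested_ram_mb = 8192

if not host_has_enough_ram(reported_memory_mb, requested_ram_mb):
    print("host rejected: %d MB usable, %d MB requested"
          % (reported_memory_mb, requested_ram_mb))
```

Because the node reports 0 MB of memory, the comparison can never succeed, which is consistent with the filter returning 0 hosts.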
My guess is that node introspection is failing to get proper node
information.
Full logs can be found in
https://ci.centos.org/artifacts/rdo/jenkins-tripleo-quickstart-promote-master-delorean-minimal-806/undercloud/
We have hit this issue twice in the last runs.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic-inspector/+bug/1651704/+subscriptions