yahoo-eng-team team mailing list archive

[Bug 1730556] [NEW] all periodic_task died due to uncaught exceptions

Public bug reported:

Description
===========
All periodic tasks of nova-compute stop working in our customers' environment.
report_state still works.

Steps to reproduce
==================
The problem appeared after an error occurred in our virtualization product.
At that time, get_available_nodes of the driver could not complete and threw an exception that does not inherit from Exception.

Analysis
=============
I found that the coroutine has disappeared and that run_periodic_tasks in periodic_task.py does not catch BaseException.
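
To illustrate the analysis, here is a minimal sketch of the failure mode. It is not the oslo.service code: DriverFailure, run_tasks_once, and the task functions are hypothetical names, and the loop is only assumed to resemble a task runner that guards each task with `except Exception`. An exception derived directly from BaseException passes through that guard and ends the whole loop, so later tasks never run.

```python
class DriverFailure(BaseException):
    """Hypothetical driver error that does NOT inherit from Exception."""


def run_tasks_once(tasks):
    """Run each (name, task) pair, swallowing only Exception subclasses.

    Returns the names of the tasks that completed without error.
    """
    completed = []
    for name, task in tasks:
        try:
            task()
        except Exception:
            # Ordinary errors are swallowed and the loop continues.
            continue
        completed.append(name)
    return completed


def ok():
    pass


def fails_normally():
    raise ValueError("ordinary error")


def fails_badly():
    raise DriverFailure("not an Exception subclass")


tasks = [("a", ok), ("b", fails_normally), ("c", fails_badly), ("d", ok)]

try:
    run_tasks_once(tasks)
    escaped = False
except DriverFailure:
    # The BaseException subclass left the loop; task "d" never ran.
    escaped = True
```

In a real service there is no outer `try` around the loop, so the exception would propagate up and terminate the coroutine that hosts every periodic task.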

Solution
=============
run_periodic_tasks should catch all exceptions.
This is a major problem: all periodic tasks died, even those that work well and don't throw exceptions,
because they all belong to the same coroutine.
We should keep nova-compute robust.
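
One possible shape for such a fix is sketched below. This is an assumption about how the proposal could look, not a patch against oslo.service: run_tasks_isolated, DriverFailure, and the task names are hypothetical. Each task is guarded with `except BaseException` so that one failing task cannot stop the others, while KeyboardInterrupt and SystemExit are re-raised so the process can still shut down.

```python
import logging

LOG = logging.getLogger(__name__)


class DriverFailure(BaseException):
    """Hypothetical driver error that does NOT inherit from Exception."""


def run_tasks_isolated(tasks):
    """Run every (name, task) pair; return the names of tasks that failed."""
    failed = []
    for name, task in tasks:
        try:
            task()
        except (KeyboardInterrupt, SystemExit):
            # Never swallow requests to stop the process.
            raise
        except BaseException:
            # Log the failure and move on to the next task.
            LOG.exception("Error during %s", name)
            failed.append(name)
    return failed


ran = []


def bad_task():
    raise DriverFailure("not an Exception subclass")


tasks = [
    ("first", lambda: ran.append("first")),
    ("bad", bad_task),
    ("last", lambda: ran.append("last")),
]
failed = run_tasks_isolated(tasks)
```

Catching BaseException this broadly is a trade-off: it keeps the remaining tasks alive, but it can also hide signals a framework uses to stop a coroutine, so any real change would need to decide which exception types must still propagate.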

Environment
===========
1. Exact version of OpenStack you are running. See the following
  list for all releases: http://docs.openstack.org/releases/
   openstack-nova-common-16.0.1-1.el7.noarch
   python2-novaclient-9.1.0-1.noarch
   python-nova-16.0.1-1.el7.noarch
   openstack-nova-compute-16.0.1-1.noarch

2. Which hypervisor did you use?
   Our own virtualization product

3. Which storage type did you use?
   None

4. Which networking type did you use?
   Neutron with our SDN

** Affects: oslo.service
     Importance: Undecided
     Assignee: xhzhf (guoyongxhzhf)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => xhzhf (guoyongxhzhf)

** Project changed: nova => oslo.service

** Description changed:

  Description
  ===========
- all periodic tasks of nova-compute don't work
+ all periodic tasks of nova-compute don't work in environment of our customers
  report_state works.
  
  Steps to reproduce
  ==================
  after some error of our virtualization product happened.
  At that time, get_available_nodes of the driver can not fulfil and throw a exception that does not inherit from Exception
- 
- Expected result
- ===============
- all periodic task should keep going
- 
- Actual result
- =============
- all periodic tasks stop
  
  Analysis
  =============
  I found that the coroutine has disappeared and run_periodic_tasks in periodic_task.py does not catch BaseException
  
  Solution
  =============
  run_periodic_tasks should catch all exception.
  This is a major problem. All periodic_task died even those tasks which work well and don't throw exception.
  Because they belong to a same coroutine.
  We should keep nova-compute robust.
  
  Environment
  ===========
  1. Exact version of OpenStack you are running. See the following
-   list for all releases: http://docs.openstack.org/releases/
-    openstack-nova-common-16.0.1-1.el7.noarch
-    python2-novaclient-9.1.0-1.noarch
-    python-nova-16.0.1-1.el7.noarch
-    openstack-nova-compute-16.0.1-1.noarch
+   list for all releases: http://docs.openstack.org/releases/
+    openstack-nova-common-16.0.1-1.el7.noarch
+    python2-novaclient-9.1.0-1.noarch
+    python-nova-16.0.1-1.el7.noarch
+    openstack-nova-compute-16.0.1-1.noarch
  
  2. Which hypervisor did you use?
-    Our own virtualization product
+    Our own virtualization product
  
  2. Which storage type did you use?
-    None
+    None
  
  3. Which networking type did you use?
-    Neutron with our sdn
+    Neutron with our sdn

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1730556

Title:
  all periodic_task died due to uncaught exceptions

Status in oslo.service:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/oslo.service/+bug/1730556/+subscriptions