yahoo-eng-team team mailing list archive

[Bug 1840967] [NEW] nova-next job does not fail when 'nova-manage db purge' fails

 

Public bug reported:

I happened upon this while working on another patch to add more testing
to our post_test_hook.sh script. Excerpt from the log [1]:

+ /usr/local/bin/nova-manage db purge --all --verbose --all-cells
+ RET=3
+ [[ 3 -eq 0 ]]
+ echo Purge failed with result 3
Purge failed with result 3
+ return 3
+ set -e
+ set +x
WARNING: setting legacy OS_TENANT_NAME to support cli tools.
+ /opt/stack/nova/gate/post_test_hook.sh:main:54 :   echo 'Verifying that instances were archived from all cells'
Verifying that instances were archived from all cells
++ /opt/stack/nova/gate/post_test_hook.sh:main:55 :   openstack server list --deleted --all-projects -c ID -f value
+ /opt/stack/nova/gate/post_test_hook.sh:main:55 :   deleted_servers='e4727a33-796e-4173-b369-24d7ee45d7fd
b213a354-0830-4cc3-abf7-e9dd068cefa9
33569d93-d7b6-4a92-825e-f36e972722db
521e4a84-c313-433e-8cc7-6d66c821d78c

Because of a bug in my WIP patch, the purge command failed with result
3, but the job kept running and did not fail at that point. This is
because the 'nova-manage db purge' command runs before the 'set -e'
command, which makes the script exit as soon as any command returns a
non-zero value.

So, we need to move the purge command after 'set -e'. Note that we
should *not* move the archive command, though: during its intermediate
runs it is expected to return 1, and we don't want to fail the job when
that happens. The archive_deleted_rows function already exits explicitly
in the case of actual failures.
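
To illustrate the ordering problem, here is a minimal sketch. It is not
the real post_test_hook.sh: fake_purge is a hypothetical stand-in for
'nova-manage db purge' that always exits with 3, and the function names
purge_db, purge_before_set_e and purge_after_set_e are assumptions made
up for this example. Only the "Purge failed with result" message is
taken from the log above.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for 'nova-manage db purge' failing with result 3.
fake_purge() {
    return 3
}

# Report the purge result and propagate a non-zero result to the caller,
# similar in shape to what the log excerpt shows.
purge_db() {
    ret=0
    fake_purge || ret=$?
    if [ "$ret" -eq 0 ]; then
        echo "Purge successful"
    else
        echo "Purge failed with result $ret"
        return "$ret"
    fi
}

# Ordering described in this report: the purge runs while errexit is off,
# so its non-zero return value is ignored and the script carries on.
# The body runs in a subshell so the 'set' calls do not leak to the caller.
purge_before_set_e() (
    set +e
    purge_db
    set -e
    echo "job continued"
)

# Proposed ordering: the purge runs after 'set -e', so the non-zero
# return value ends the subshell before the next command runs.
purge_after_set_e() (
    set -e
    purge_db
    echo "job continued"
)
```

Calling purge_before_set_e prints the failure message followed by "job
continued" and returns 0, while purge_after_set_e stops after the
failure message and returns 3. One caveat: bash ignores 'set -e' inside
a function that is called as part of an 'if', '&&' or '||' list, so the
functions should be called as plain commands when trying this out.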

[1] https://object-storage-ca-ymq-1.vexxhost.net/v1/86bbbcfa8ad043109d2d7af530225c72/logs_40/672840/8/check/nova-next/9d13cfb/ara-report/result/d13f888f-d187-4c3b-b5ab-9326f611e534/

** Affects: nova
     Importance: Undecided
     Assignee: melanie witt (melwitt)
         Status: New


** Tags: testing

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1840967

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1840967/+subscriptions

