
yahoo-eng-team team mailing list archive

[Bug 1774527] [NEW] Too many errors can trigger compute failed_builds to get incremented

 

*** This bug is a security vulnerability ***

Private security bug reported:

Let's analyze what can cause a compute manager's failed_builds counter
to get incremented, and point out that some of these causes should not
increment it (which can then have the 'nice' effect of auto-disabling a
nova-compute service).

self._do_build_and_run_instance returns a result code; its catch-all
exception handler also sets the result code to failed. Whenever the
result is failed, the failed_builds counter gets incremented.
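
As a rough illustration of the mechanism described above, here is a
simplified model. It is not nova's actual code: the class name, the
ACTIVE/FAILED constants, the disable_threshold value, and the reset on
success are all assumptions made for this sketch.

```python
import base64

# Simplified model of the behaviour described above; not nova's real code.
ACTIVE = "active"
FAILED = "failed"


class ComputeManagerSketch:
    """Hypothetical stand-in for the compute manager's build accounting."""

    def __init__(self, disable_threshold=3):
        # disable_threshold is an assumed value chosen for illustration.
        self.disable_threshold = disable_threshold
        self.failed_builds = 0
        self.service_disabled = False

    def _do_build_and_run_instance(self, build):
        # Any exception at all is turned into a FAILED result code.
        try:
            build()
            return ACTIVE
        except Exception:
            return FAILED

    def build_and_run_instance(self, build):
        result = self._do_build_and_run_instance(build)
        if result == FAILED:
            self.failed_builds += 1
            if self.failed_builds >= self.disable_threshold:
                self.service_disabled = True
        else:
            # Assumption of this sketch: a success resets the counter.
            self.failed_builds = 0
        return result


def bad_injected_file():
    # A user-supplied payload that is not valid base64 raises here.
    base64.b64decode("not base64!", validate=True)
```

In this model, three requests carrying a bad injected file are enough to
mark the service disabled, even though nothing is wrong with the host.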

Some exceptions unrelated to nova-compute that, from reading the code,
can trigger this:

- Unable to base64 decode injected files.
- Failure of notify_about_instance_create to actually send (some inner exception perhaps?)
- exception.NoMoreNetworks, exception.NoMoreFixedIps
- exception.FlavorDiskTooSmall, exception.FlavorMemoryTooSmall,
  exception.ImageNotActive, exception.ImageUnacceptable,
  exception.InvalidDiskInfo, exception.InvalidDiskFormat,
  cursive_exception.SignatureVerificationError,
  exception.VolumeEncryptionNotSupported, exception.InvalidInput,
  exception.RequestedVRamTooHigh --- these bubble up as BuildAbortException
- exception.InstanceNotFound, exception.UnexpectedDeletingTaskStateError
- Anything that pops out of _build_resources
   - Failed to allocate network

And many more?
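
One way to picture the distinction being asked for is a check that
decides whether a given exception should count against the host. This is
only a sketch: the exception classes below are local stand-ins named
after a few entries in the list above (nothing is imported from nova),
and both counts_against_host and the contents of NOT_HOST_FAULT are
hypothetical policy choices, not an existing nova API.

```python
# Local stand-ins for some exception names listed above; not nova imports.
class BuildAbortException(Exception):
    pass


class InstanceNotFound(Exception):
    pass


class NoMoreFixedIps(Exception):
    pass


# Assumption: which failures are "not the host's fault" is a policy choice.
NOT_HOST_FAULT = (BuildAbortException, InstanceNotFound, NoMoreFixedIps)


def counts_against_host(exc):
    """Return True if this failure should increment failed_builds."""
    return not isinstance(exc, NOT_HOST_FAULT)
```

With a check like this, only failures outside the exempt set would move
the counter toward the auto-disable threshold.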

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774527

Title:
  Too many errors can trigger compute failed_builds to get incremented

Status in OpenStack Compute (nova):
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774527/+subscriptions

