yahoo-eng-team team mailing list archive

[Bug 1620341] [NEW] Removing unused base images removes backing files of active instances

 

Public bug reported:

I've been experiencing a dangerous issue: my backing files located on shared storage in the _base folder are being removed by nova-compute. It happens on the Juno, Kilo and Liberty releases. The shared storage mount, /var/lib/nova/instances, is configured on NFSv3. The backing image IDs exist in the /var/lib/nova/instances/locks/ folder for the affected files. I don't know for sure how the mechanism that prevents _base files from being deleted works, whether it depends on the locks folder or on locking files on the shared storage, but from my point of view this is a bug by design, and the mechanism should be redesigned so that it does not rely on the client, which is actually the compute node. It seriously affects the stability and security of users' data!
I'd like to ask you to consider a new cleaning system, because the current cleaning worker is designed for independent compute nodes without shared storage and it looks like it was not well adapted for configurations with shared storage. Maybe developers should consider a central mechanism that fetches data about used and unused _base files from the database, rather than relying on what is or is not running locally on a compute node.
I can't reproduce this problem anymore because I had to disable the cleaning of unused base images and deploy my own, secure worker.
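
To make the proposal above concrete, here is a minimal sketch of a cleaner that decides from a global view instead of a per-node view. It is not nova's image cache code. The function name removable_base_files, its parameters, and the idea that in_use_names is built from a central source (for example a database query covering every compute node) are all assumptions made for illustration. The sketch only reports candidates and deletes nothing.

```python
import os
import time
from pathlib import Path


def removable_base_files(base_dir, in_use_names, min_age_seconds, now=None):
    """Return files in base_dir that no instance on ANY host references.

    in_use_names is assumed to hold the file names of all backing files
    used by instances across every compute node sharing this storage
    (hypothetically gathered from a central database, not from the
    instances running on the local node).
    """
    now = time.time() if now is None else now
    removable = []
    for entry in sorted(Path(base_dir).iterdir()):
        if not entry.is_file():
            continue
        if entry.name in in_use_names:
            # Still backs at least one instance somewhere in the cloud.
            continue
        if now - entry.stat().st_mtime < min_age_seconds:
            # Recently touched; it may belong to an instance being built.
            continue
        removable.append(entry)
    return removable
```

With a shared _base folder, the important property is that a file referenced only by instances on other compute nodes still appears in in_use_names and is therefore never reported as removable.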

** Affects: nova
     Importance: Undecided
         Status: New

** Description changed:

  I've been experiencing a dangerous issue: my backing files located on shared storage in the _base folder are being removed by nova-compute. It happens on the Juno, Kilo and Liberty releases. The shared storage mount, /var/lib/nova/instances, is configured on NFSv3. The backing image IDs exist in the /var/lib/nova/instances/locks/ folder for the affected files. I don't know for sure how the mechanism that prevents _base files from being deleted works, whether it depends on the locks folder or on locking files on the shared storage, but from my point of view this is a bug by design, and the mechanism should be redesigned so that it does not rely on the client, which is actually the compute node. It seriously affects the stability and security of users' data!
- I'd like to ask you to consider a new cleaning system, because the current cleaning worker is designed for independent compute nodes without shared storage and it looks like it was not well adapted to configurations with shared storage. Maybe developers should consider a central mechanism that fetches data about used and unused _base files from the database, rather than relying on what is or is not running locally on a compute node.
+ I'd like to ask you to consider a new cleaning system, because the current cleaning worker is designed for independent compute nodes without shared storage and it looks like it was not well adapted for configurations with shared storage. Maybe developers should consider a central mechanism that fetches data about used and unused _base files from the database, rather than relying on what is or is not running locally on a compute node.
  I can't reproduce this problem anymore because I had to disable the cleaning of unused base images and deploy my own, secure worker.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1620341

Title:
  Removing unused base images removes backing files of active instances

Status in OpenStack Compute (nova):
  New

Bug description:
  I've been experiencing a dangerous issue: my backing files located on shared storage in the _base folder are being removed by nova-compute. It happens on the Juno, Kilo and Liberty releases. The shared storage mount, /var/lib/nova/instances, is configured on NFSv3. The backing image IDs exist in the /var/lib/nova/instances/locks/ folder for the affected files. I don't know for sure how the mechanism that prevents _base files from being deleted works, whether it depends on the locks folder or on locking files on the shared storage, but from my point of view this is a bug by design, and the mechanism should be redesigned so that it does not rely on the client, which is actually the compute node. It seriously affects the stability and security of users' data!
  I'd like to ask you to consider a new cleaning system, because the current cleaning worker is designed for independent compute nodes without shared storage and it looks like it was not well adapted for configurations with shared storage. Maybe developers should consider a central mechanism that fetches data about used and unused _base files from the database, rather than relying on what is or is not running locally on a compute node.
  I can't reproduce this problem anymore because I had to disable the cleaning of unused base images and deploy my own, secure worker.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1620341/+subscriptions

