yahoo-eng-team team mailing list archive

[Bug 1872082] [NEW] available disk on compute may be slightly overestimated in some cases

 

Public bug reported:

Description
===========

The calculation of available disk space on a compute host can be off by anywhere from a few KB to a few GB,
which can lead to bad scheduler decisions.

The disk available for a new instance on a given host is calculated as follows:
available_disk_least = disk_free_fs - over_committed_disk_size (i.e. the sum of the instances' disk reservations not yet used)

But because over_committed_disk_size can be negative (see below),
available_disk_least can be mistakenly increased.
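
To illustrate the effect, here is a minimal Python sketch of the formula above. The function name and the 100 GiB free-space figure are hypothetical, chosen only for illustration; this is not nova's code.

```python
GiB = 1024 ** 3


def available_disk_least(disk_free_fs, over_committed_disk_size):
    # Formula from the description above; both arguments are in bytes.
    return disk_free_fs - over_committed_disk_size


free = 100 * GiB  # hypothetical free space on the filesystem

# Expected case: an instance has reserved 10 GiB that it has not used yet.
print(available_disk_least(free, 10 * GiB) // GiB)  # 90

# Problem case: a negative over-commit adds space that does not exist.
print(available_disk_least(free, -5 * GiB) // GiB)  # 105
```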

Steps to reproduce
==================

On a devstack running master:

raw instances
-----
If you spawn instances with a raw preallocated disk:
1) Set in nova-cpu.conf:
[DEFAULT]/preallocate_images=space
[libvirt]/images_type=raw
2) Restart nova compute:
sudo service devstack@n-cpu restart

3) Spawn 3 instances:
openstack server create  --flavor m1.large --image cirros-0.4.0-x86_64-disk --nic net-id=private  alex
devstack$ ls -lhs /opt/stack/data/nova/instances/*/disk
81G -rw-r--r-- 1 libvirt-qemu kvm 80G Apr 10 00:49 /opt/stack/data/nova/instances/6ce8d602-e3b4-433b-92dc-57508dd86163/disk
81G -rw-r--r-- 1 libvirt-qemu kvm 80G Apr 10 06:54 /opt/stack/data/nova/instances/71c4867d-6e13-4f16-a9f7-f5da28388346/disk
81G -rw-r--r-- 1 libvirt-qemu kvm 80G Apr  9 15:28 /opt/stack/data/nova/instances/aaad32cb-a437-46c1-9122-2d0197acf54a/disk
devstack$  ls -ls /opt/stack/data/nova/instances/*/disk | awk '{sum_alloc_size+=$1 ; sum_virtual_size+=$6}END{print sum_virtual_size-sum_alloc_size*1024" bytes"}'
-2330624 bytes
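
The same measurement can be sketched in Python with os.stat, which avoids parsing ls output. The helper names below are hypothetical; the sketch assumes Linux, where st_blocks is counted in 512-byte units.

```python
import os


def unused_bytes(path):
    """Apparent size minus allocated size, in bytes.

    The result is negative when the file occupies more space on disk
    than its apparent size, as in the ls output above.
    """
    st = os.stat(path)
    # On Linux, st_blocks counts 512-byte units regardless of the
    # filesystem block size.
    return st.st_size - st.st_blocks * 512


def total_unused_bytes(paths):
    # Sum over all disks, like the awk one-liner above.
    return sum(unused_bytes(p) for p in paths)
```

For example, total_unused_bytes(glob.glob("/opt/stack/data/nova/instances/*/disk")) (after import glob) should give a figure comparable to the awk result.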


qcow2 instances
-----
If you do the same with qcow2 (in nova-cpu.conf: [DEFAULT]/preallocate_images=space and [libvirt]/images_type=qcow2):
qemu-img info /opt/stack/data/nova/instances/d99ea46d-95f6-4078-8d42-984506aa9d10/disk --output=json --force-share | grep -e actual-size -e virtual-size
	"virtual-size": 85899345920,
	"actual-size": 85899362304,
-> virtual-size - actual-size = -16384 bytes
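
As a quick check of that arithmetic, the following Python sketch computes the difference from the JSON that qemu-img info --output=json prints. The function name is hypothetical, and the sample string contains only the two fields shown above.

```python
import json


def virtual_minus_actual(qemu_img_info_json):
    """Return virtual-size minus actual-size, in bytes."""
    info = json.loads(qemu_img_info_json)
    return info["virtual-size"] - info["actual-size"]


# The two fields from the qemu-img output above.
sample = '{"virtual-size": 85899345920, "actual-size": 85899362304}'
print(virtual_minus_actual(sample))  # -16384
```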

However, the main issue is that the actual size can exceed the virtual size by a few percent due to qcow2 metadata overhead.
Real case:
qemu-img info /instances/29e86867-ec72-41fb-8dde-0df663c13ff8/disk
image: /instances/29e86867-ec72-41fb-8dde-0df663c13ff8/disk
file format: qcow2
virtual size: 80G (85899345920 bytes)
disk size: 85G
cluster_size: 65536
backing file: /instances/_base/68429e45667b8f2b91a94a76170cdb94f4adc9f2
Format specific information:
	compat: 1.1
	lazy refcounts: false
	refcount bits: 16
	corrupt: false

-> roughly -5 GB of over-commit for a single instance


Expected result
===============
The over_committed_disk_size calculation should clamp negative values to 0.
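
One way to get this behavior is sketched below in Python. It is an illustration under assumptions, not nova's implementation: the function signature is hypothetical, and clamping each disk separately (rather than clamping the total) is a choice made here so that one over-allocated disk cannot cancel out another disk's unused reservation.

```python
GiB = 1024 ** 3


def over_committed_disk_size(disks):
    """Sum of unused reservations, never negative.

    disks: iterable of (virtual_size, actual_size) pairs in bytes.
    """
    # Clamp per disk: a disk whose actual size exceeds its virtual
    # size contributes 0 instead of a negative amount.
    return sum(max(0, virtual - actual) for virtual, actual in disks)


disks = [
    (85899345920, 85899362304),  # preallocated qcow2 disk from this report
    (80 * GiB, 1 * GiB),         # hypothetical thin disk, mostly unused
]
print(over_committed_disk_size(disks) // GiB)  # 79
```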

Actual result
=============
A negative over_committed_disk_size artificially increases available_disk_least.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1872082

Title:
  available disk on compute may be slightly overestimated in some cases

Status in OpenStack Compute (nova):
  New
