← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1809123] Re: OSError failure to read when creating multiple instances with NFS

 

Reviewed:  https://review.openstack.org/625741
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=525631d8dc058910728e55def616358b0e7f2f69
Submitter: Zuul
Branch:    master

commit 525631d8dc058910728e55def616358b0e7f2f69
Author: Tim Rozet <trozet@xxxxxxxxxx>
Date:   Mon Dec 17 19:44:54 2018 -0500

    Fixes race condition with privsep utime
    
    There is a race condition that occurs over NFS when multiple instances
    are being created where utime fails, due to some other process
    modifying the file path. This patch ensures the path is created and
    is readable before attempting to modify with utime.
    
    Closes-Bug: 1809123
    
    Change-Id: Id68aa27a8ab08d9c00655e5ed6b48d194aa8e6f6
    Signed-off-by: Tim Rozet <trozet@xxxxxxxxxx>


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1809123

Title:
  OSError failure to read when creating multiple instances with NFS

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  There is a race condition in when launching multiple instances over
  NFS simultaneously that can end up causing the os.utime function to
  fail when updating the mtime for the image base:

  2018-12-15 14:22:38.740 7 INFO nova.virt.libvirt.driver [req-d33edf35-733b-4591-831c-666cd159cee1 8965b22a11c44875a90fe88f50769a5a b9644067db0d44789e19d9d032287ada - default default] [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552] Creating image
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [req-d33edf35-733b-4591-831c-666cd159cee1 8965b22a11c44875a90fe88f50769a5a b9644067db0d44789e19d9d032287ada - default default] [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552] Instance failed to spawn: OSError: [Errno 13] Permission denied
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552] Traceback (most recent call last):
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2252, in _build_resources
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     yield resources
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2032, in _build_and_run_instance
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     block_device_info=block_device_info)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3091, in spawn
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     block_device_info=block_device_info)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3469, in _create_image
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     fallback_from_host)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3560, in _create_and_inject_local_root
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     instance, size, fallback_from_host)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7634, in _try_fetch_image_cache
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     size=size)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py", line 243, in cache
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     *args, **kwargs)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py", line 601, in create_image
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     nova.privsep.path.utime(base)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, in _wrap
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     return self.channel.remote_call(name, args, kwargs)
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]   File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in remote_call
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552]     raise exc_type(*result[2])
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552] OSError: [Errno 13] Permission denied
  2018-12-15 14:22:38.747 7 ERROR nova.compute.manager [instance: 6fec5d88-09ab-4ecc-815d-c08c298fe552] 

  
  The NFS settings are correct:
  [root@overcloud-novacompute-0 ~]# mount | grep nova
  overcloud-controller-0.opnfvlf.org:/nova on /var/lib/nova/instances type nfs4 (rw,relatime,sync,context=system_u:object_r:nfs_t:s0,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,nosharecache,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.0.2.44,local_lock=none,addr=192.0.2.40,_netdev)

  
  [root@overcloud-novacompute-0 ~]# getenforce
  Permissive

  [root@overcloud-novacompute-0 ~]# ls -lrt /var/lib/nova/
  total 0
  drwxrwxrwx. 4 42436 42436 53 Dec 14 19:52 instances

  [root@overcloud-novacompute-0 ~]# ls -lrt /var/lib/nova/instances/
  total 4
  drwxr-xr-x. 2 42436 42436 54 Dec 14 19:16 _base
  drwxr-xr-x. 2 42436 42436 93 Dec 14 19:52 locks
  -rw-r--r--. 1 42436 42436 58 Dec 14 20:33 compute_nodes

  [root@overcloud-novacompute-0 ~]# ls -lrt /var/lib/nova/instances/_base/
  total 13384
  -rw-r--r--. 1 42427 42427 46137344 Dec 14 19:18 78c6ae575fd026484b1f46588b5c337b3fd53558

  [root@overcloud-novacompute-0 ~]# docker exec -it bf4658c561e1 /bin/bash
  ()[nova@overcloud-novacompute-0 /]$ id -u nova
  42436
  ()[nova@overcloud-novacompute-0 /]$ id -u qemu
  42427

  ()[nova@overcloud-novacompute-0 /]$ ls -lrt /var/lib/nova/instances/
  total 4
  drwxr-xr-x. 2 nova nova 54 Dec 14 19:16 _base
  drwxr-xr-x. 2 nova nova 93 Dec 14 19:52 locks
  -rw-r--r--. 1 nova nova 58 Dec 14 20:33 compute_nodes
  ()[nova@overcloud-novacompute-0 /]$ ls -lrt /var/lib/nova/instances/_base/
  total 13384
  -rw-r--r--. 1 qemu qemu 46137344 Dec 14 19:18 78c6ae575fd026484b1f46588b5c337b3fd53558

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1809123/+subscriptions


References