yahoo-eng-team team mailing list archive

[Bug 1272076] Re: VolumeNotCreated - Instance failed, cinder too slow with

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Sean Dague <sean@xxxxxxxxx>
Date: Fri, 09 Dec 2016 12:56:02 -0000
Reply-to: Bug 1272076 <1272076@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Just making Nova wait longer doesn't seem to be a good strategy, because
then we just end up in a different deadlock scenario of waiting with no
idea when things are finished.

I think that for long provisioning situations like this we just need to
expect people to preprovision the volumes first from cinder. Nova will
never be good enough at orchestration to solve this issue generically.
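As a rough illustration of the preprovisioning approach, the caller can
create the volume first, wait for it on its own schedule, and only then
boot from it. The sketch below is not Nova or Cinder code: get_status is
a hypothetical callable standing in for whatever client call returns the
volume status, and the "available" and "error" status strings, the
timeout, and the interval are assumptions chosen for the example.

```python
import time


class VolumeNotReady(Exception):
    """Raised when the volume is not usable before the deadline."""


def wait_for_volume(get_status, timeout=600.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the volume is available or we give up.

    get_status is a hypothetical callable returning the current volume
    status as a string; plug in your own client call here.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "available":
            return status
        if status == "error":
            raise VolumeNotReady("volume entered error state")
        if clock() >= deadline:
            raise VolumeNotReady(
                "volume still %r after %s seconds" % (status, timeout))
        sleep(interval)
```

Because the wait happens before the boot request, the caller picks a
timeout that matches its storage backend rather than relying on a fixed
retry budget inside the compute service.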

Marking as Won't Fix. This isn't a simple bug; it is going to require
at least one spec to change the way flows work.

** Changed in: nova
Status: New => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1272076

Title:
VolumeNotCreated - Instance failed, cinder too slow with

Status in Cinder:
Won't Fix
Status in OpenStack Compute (nova):
Won't Fix

Bug description:
Hi,

I've found that under certain circumstances cinder does not create
volumes fast enough.

I launched an instance from a new 4GB volume created from an image. I
use LVM to allocate space. After a while I found that the instance
didn't work.

Looking at logs I can find:

2014-01-23 21:44:15.337 2398 TRACE nova.compute.manager [instance: a0e35767-424e-434d-99b4-35e19422054f] attempts=attempts)
2014-01-23 21:44:15.337 2398 TRACE nova.compute.manager [instance: a0e35767-424e-434d-99b4-35e19422054f] VolumeNotCreated: Volume 137bc77b-c9e6-47ba-b2f5-c83f440a988b did not finish being created even after we waited 66 seconds or 60 attempts.

I looked around and the cinder volume was in the "downloading" state. I
think it was fetching the image from the image server and building the
volume. I don't know why it took so long, since the installation uses
gigabit Ethernet and, moreover, the image is in an instance launched on
the cinder hardware machine. So it does not even need any networking;
everything resolves internally.

The image is saucy (about 300MB).

The problem is that after a while volume creation finished and the
instance failed. So I recreated the instance and made it work from the
volume with no problems.

How should I track where the processing slows down?
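One way to narrow this down is to record how long the volume spends in
each status while it is being created. The following is only a sketch
under assumptions: get_status is a hypothetical callable that returns
the current volume status string, and the terminal statuses "available"
and "error" are assumed rather than taken from this report.

```python
import time


def record_status_timeline(get_status, terminal=("available", "error"),
                           interval=1.0, max_polls=600,
                           sleep=time.sleep, clock=time.monotonic):
    """Return ([(status, seconds_spent), ...], final_status).

    Only statuses that were left are listed; the final status is
    returned separately. get_status is a hypothetical callable.
    """
    timeline = []
    current = get_status()
    entered = clock()
    polls = 1
    while current not in terminal and polls < max_polls:
        sleep(interval)
        status = get_status()
        polls += 1
        if status != current:
            now = clock()
            # Close out the status we just left.
            timeline.append((current, now - entered))
            current, entered = status, now
    return timeline, current
```

If most of the time shows up under "downloading", the image transfer and
conversion are the place to look; if it shows up before that, scheduling
or volume allocation is more likely. The resolution is limited by the
polling interval.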

I know that iSCSI attachment is slow. One possible point of failure is
when you have iSCSI targets on a machine that's not reachable. This
slows down the rest of the processing, but I'm not sure if that is the
issue here.

Anyway, the hardware is not the best but it is pretty decent: a RAID1
array with WD Black drives, a good SATA controller and Intel gigabit
network cards. So disk should not be the problem. I suspect a
networking or configuration related problem.

But I'm lost on this.

Any help would be appreciated.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1272076/+subscriptions

References

[Bug 1272076] [NEW] VolumeNotCreated - Instance failed, cinder too slow
From: gadLinux, 2014-01-23