
openstack team mailing list archive

Re: [NOVA] Snapshotting may require significant disk space (in /tmp). How to properly solve disk space issues?

 

On Mar 16, 2012, at 7:51 PM, Pádraig Brady wrote:

> On 03/16/2012 11:57 PM, Justin Shepherd wrote:
>> 
>> 
>> On Mar 16, 2012, at 12:26, "Pádraig Brady" <P@xxxxxxxxxxxxxx> wrote:
>> 
>>> On 03/16/2012 04:11 PM, Jay Pipes wrote:
>>>> Hi Stackers,
>>>> 
>>>> So, in diagnosing a few things on TryStack yesterday, I ran into an interesting problem with snapshotting that I'm hoping to get some advice on.
>>>> 
>>>> == The Problem ==
>>>> 
>>> 
>>>> QEMU was unhelpfully returning a vague error message of "error while writing".
>>> 
>>> That could be improved.
>>> As an aside, since qemu-img is mainly dealing with large files,
>>> it would be a prime candidate to call fallocate() from
>>> to get good layout for the files and immediate feedback
>>> if there isn't enough space.
>>> 
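
To illustrate the preallocation idea above: a minimal Python sketch, not qemu-img code (qemu-img is written in C). It assumes a POSIX system where os.posix_fallocate() is available, and the helper name preallocate() is made up for this example. Reserving the full size up front means a full disk is reported before any image data is written.

```python
import errno
import os
import tempfile


def preallocate(path, size_bytes):
    """Reserve size_bytes for path up front.

    Returns True on success and False if the filesystem has no room.
    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        # Allocates the blocks now, so ENOSPC shows up here rather than
        # halfway through a long write.
        os.posix_fallocate(fd, 0, size_bytes)
        return True
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            os.unlink(path)  # do not leave a partial file behind
            return False
        raise
    finally:
        os.close(fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "snapshot.img")
    print(preallocate(target, 1024 * 1024))
```

Removing the file on failure mirrors the cleanup of partially written files mentioned in the next paragraph.
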
>>> On a related note, I've a patch pending for after RC1
>>> that should auto clean any of these partially written files:
>>> https://review.openstack.org/#change,5442
>>> 
>>>> As it turns out, the base operating system we install on our compute nodes in TryStack has a (very) small root partition
>>> 
>>>> == Possible Solutions ==
>>>> 
>>>> So, there are a number of solutions that we can work on here, and I'm wondering what the preference would be. Here are the solutions I have come up with, along with a no-brainer improvement to Nova that would help in diagnosing this problem:
>>>> 
>>>> The no-brainer: Detect before attempting a snapshot that there is enough space on a device to perform the operation, and if not, throw a useful error message up the stack
>>> 
>>> The space can change while writing, so you could still get the same error above.
>>> 
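
A rough sketch of how the pre-check and the race noted above could be handled together, assuming Python's shutil.disk_usage(). The names InsufficientSpace, check_free_space() and run_snapshot() are invented for this example and are not Nova APIs. The up-front check gives a readable error in the common case, and ENOSPC raised during the write is translated into the same error because free space can shrink after the check.

```python
import errno
import shutil


class InsufficientSpace(Exception):
    """Hypothetical error type carrying a readable disk-space message."""


def check_free_space(directory, required_bytes):
    """Fail early if directory's filesystem has too little free space."""
    free = shutil.disk_usage(directory).free
    if free < required_bytes:
        raise InsufficientSpace(
            "need %d bytes in %s, only %d free"
            % (required_bytes, directory, free))


def run_snapshot(directory, required_bytes, write_snapshot):
    """Pre-check space, then run write_snapshot(directory).

    The pre-check is only advisory: the disk can still fill up while
    writing, so ENOSPC is mapped to the same readable error.
    """
    check_free_space(directory, required_bytes)
    try:
        return write_snapshot(directory)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            raise InsufficientSpace(
                "%s filled up while writing the snapshot" % directory
            ) from exc
        raise
```

Here write_snapshot is any callable that does the actual work; in Nova that would be whatever invokes qemu-img, which this sketch does not model.
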
>>>> 
>>>> Solutions to the disk space problem:
>>>> 
>>>> (1) Silly Jay, change the damn size of the root partition in your PXE base OS install!
>>>> 
>>>> Now, I'm no expert in creating customized base disk images, but from looking at the build_pxe_env.sh script in devstack [1], it seems pretty trivial to change the ramdisk_size parameter in the startup options to something larger than 2109600. We could do this and reimage the compute nodes one by one.
>>>> 
>>>> (2) Make the location in which the snapshot is made configurable.
>>>> 
>>>> Right now, as mentioned above, tempfile.mkdtemp() is used, which creates a directory in the user's TMPDIR (typically /tmp, which is usually on the root partition).
>>>> 
>>>> We could add an option (--libvirt-snapshot-dir?) that would allow nova-compute to override where that snapshot is built.
>>>> 
>>>> (3) Change the TMPDIR setting of the user running nova-compute to something other than /tmp on the root partition.
>>> 
>>> I'd lean towards (3).
>>> That's something that depends on the environment (as you've nicely demonstrated),
>>> and also for security reasons the admin should be able to set TMPDIR.
>>> That's the standard way to do it, and it works already (hopefully).
>> 
>> Actually, I would argue that the best way to accomplish this would be option #2. That way an admin/operator has control over the location, rather than having to manipulate it by messing around with a user's environment variables.
> 
> Well one can set the TMPDIR in the init script for the service.
> That's a fairly standard mechanism.

While it is fairly standard practice, it makes me cry a little inside every time I have to start adding ENV vars to an init script because of a hard-coded value that was not exposed as a configuration option.

My $0.02 as an ops guy.

> 
> (2) is good, though, if you would ever want to separate
> --libvirt-snapshot-dir from $TMPDIR.
> 
> Now I can definitely see the need for changing TMPDIR from /tmp,
> both for Jay's reasons and because /tmp is tmpfs by default on Debian, for example:
> http://lists.debian.org/debian-devel/2011/11/msg00281.html
> I'm not sure if you'd need to separate them?
> Though I'm always biased towards avoiding new config variables.
> I suppose one could argue you might want /tmp for small fast accesses,
> and something large and separate for manipulating large files.
> 
> Now that I look at the existing nova uses of tmp dirs
> to store/stage large images, I see existing config vars:
> 
> FLAGS.xenapi_sr_base_path  # Xen's default Storage Repository
> FLAGS.image_decryption_dir # nova/image/s3.py
> 
> So if you were following that you would implement (2) with:
> 
> FLAGS.libvirt_snapshot_dir
> 
> There might be opportunity to merge all three to:
> 
> FLAGS.nova_image_staging_dir
> 
> cheers,
> Pádraig.
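
Options (2) and (3) can coexist, as this minimal sketch suggests. It assumes only Python's tempfile module; the snapshot_dir parameter stands in for the proposed (hypothetical) libvirt_snapshot_dir setting and is not an existing Nova flag. When the setting is absent, tempfile.mkdtemp() falls back to TMPDIR, so the init-script approach keeps working.

```python
import os
import tempfile


def make_snapshot_dir(snapshot_dir=None):
    """Create a scratch directory for building a snapshot.

    snapshot_dir stands in for a hypothetical libvirt_snapshot_dir
    option. None means "not configured", in which case tempfile uses
    TMPDIR (or the platform default such as /tmp).
    """
    if snapshot_dir is not None:
        os.makedirs(snapshot_dir, exist_ok=True)
    return tempfile.mkdtemp(prefix="snapshot-", dir=snapshot_dir)


if __name__ == "__main__":
    # Explicit setting wins; with no setting, TMPDIR decides.
    base = tempfile.mkdtemp()
    print(make_snapshot_dir(os.path.join(base, "staging")))
    print(make_snapshot_dir())
```

Keeping None as the default means operators who already set TMPDIR see no change in behavior, while those who want large files kept apart from /tmp get an explicit option.
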


