
fuel-dev team mailing list archive

Re: Ephemeral RBD with Havana and Dumpling

 

Yes, we were able to live-migrate an instance today. After migrating it
back to the original node, the instance began reporting odd I/O errors
on some commands; Ryan is re-testing to check whether the same problem
recurs or whether it was a CirrOS-specific fluke.

Here's our task list based on the research so far:
1) patch Nova to add copy-on-write cloning from images to instance boot
drives as per OSCI-773.
2) patch Nova to disable the shared-filesystem check for live migration
of non-volume-backed instances (we have a hack in place; I'm working on
a proper patch).
3) patch Nova to remove 'rbd ls' from the RBD driver as per Ceph #6693,
found by Andrey K.
4) patch the Ceph manifests to create a new 'compute' Ceph user,
keyring, and pool for Nova (we have only tested with the images user so
far), and to use the 'compute' user instead of 'volumes' when defining
the libvirt secret. See the sketch after this list.
5) figure out the TLS and TCP auth configuration for libvirt: we had to
disable it to make live migrations work, so we have to investigate how
to make them work in a more secure configuration and patch the Ceph
manifests accordingly.
6) patch the Ceph manifests to modify nova.conf (enable the RBD
backend, configure the Ceph pool and user credentials, etc.); the
sketch below covers this too.
7) patch the OpenStack manifests to open the libvirt qemu/kvm live
migration ports between compute nodes, and report a Nova bug about live
migration being silently cancelled without reporting the libvirt
connection failure.
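
To make items (4) and (6) concrete, here is a minimal sketch of what
the manifests would need to do. The pool name, pg_num, and caps are my
assumptions, modeled on our existing 'volumes' user and on
http://ceph.com/docs/master/rbd/libvirt/, not a tested recipe:

  # create a dedicated pool and cephx user for Nova ephemeral disks
  # (pool name 'compute' and pg_num 128 are placeholders)
  ceph osd pool create compute 128
  ceph auth get-or-create client.compute \
      mon 'allow r' \
      osd 'allow class-read object_prefix rbd_children, allow rwx pool=compute, allow rx pool=images' \
      > /etc/ceph/ceph.client.compute.keyring

For item (6), nova.conf on each compute node would then point at the
new user, along the lines of what I tested with the images user:

  libvirt_images_type=rbd
  libvirt_images_rbd_pool=compute
  rbd_user=compute
  rbd_secret_uuid=<uuid of the libvirt secret defined for client.compute>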

Can anyone help with item (5) above?
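
For context on (5), the insecure setup we're running now is roughly the
following; take it as a sketch from memory rather than the exact diff,
since I still need to re-verify the option values:

  # /etc/libvirt/libvirtd.conf on the compute nodes
  # (libvirtd must also be started with the --listen flag)
  listen_tls = 0
  listen_tcp = 1
  auth_tcp = "none"    # this is what we need to replace with TLS or SASL

  # /etc/nova/nova.conf
  live_migration_uri = qemu+tcp://%s/system

The ports in item (7) would be libvirtd's TCP port (16509 by default)
plus the qemu migration port range (49152-49215 by default).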


On Tue, Nov 19, 2013 at 2:53 AM, Mike Scherbakov
<mscherbakov@xxxxxxxxxxxx> wrote:
> I'd like to keep all the issues on this subject in a single email thread,
> so here is what I copy-pasted from A. Korolev:
>> http://tracker.ceph.com/issues/6693
>
> Also, I don't see any reason for keeping this conversation private, so I'm
> adding fuel-dev.
>
> Dmitry - any success so far in your research?
>
>
> On Tue, Nov 19, 2013 at 1:53 AM, Dmitry Borodaenko
> <dborodaenko@xxxxxxxxxxxx> wrote:
>>
>> The reason it's not a limitation for a volume-backed instance is this
>> misguided conditional:
>>
>> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3922
>>
>> It assumes that only a volume-backed instance without ephemeral disks
>> can be live-migrated without shared storage. I also found many other
>> places in Nova's live migration code making the same assumption. What
>> I have not found so far is any real reason for shared storage to be
>> required for anything other than backing the instance's boot drive,
>> which is no longer a concern with the Ephemeral RBD patch. I'll try to
>> disable this and other similar checks and see if that makes live
>> migration work for an instance backed by RBD.
>>
>> If that's the case and there are no other blockers in Nova, libvirt, or
>> qemu, fixing this in Nova will indeed be relatively straightforward.
>>
>> -Dmitry
>>
>> On Mon, Nov 18, 2013 at 9:37 AM, Mike Scherbakov
>> <mscherbakov@xxxxxxxxxxxx> wrote:
>> > If an instance boots from a volume, Nova should not have such a
>> > limitation. So if it does, it might be easier to fix Nova instead.
>> >
>> >
>> > On Mon, Nov 18, 2013 at 8:56 PM, Dmitry Borodaenko
>> > <dborodaenko@xxxxxxxxxxxx> wrote:
>> >>
>> >> I used patched packages built by the OSCI team per Jira OSCI-773.
>> >> There are two more patches on the branch mentioned in the ceph-users
>> >> thread; I still need to review and test those.
>> >>
>> >> We have seen the same error about shared storage that was reported in
>> >> that thread: Nova requires /var/lib/nova to be shared between all
>> >> compute nodes for live migrations. I am still waiting for Haomai to
>> >> confirm whether he was able to overcome this limitation. If not, we
>> >> will have to add GlusterFS or CephFS, which is too much work for the
>> >> 4.0 timeframe.
>> >>
>> >> On Nov 18, 2013 1:32 AM, "Mike Scherbakov" <mscherbakov@xxxxxxxxxxxx>
>> >> wrote:
>> >>>
>> >>> Dmitry - sorry for the late response.
>> >>> This is good news. I remember the time when we were experimenting
>> >>> with DRBD; now we will have Ceph, which should be way better for the
>> >>> purposes we need it for.
>> >>>
>> >>> > works with the patched Nova packages
>> >>> Which patches did you apply? Is the OSCI team already aware?
>> >>>
>> >>> Now that we have merged havana into master, what are your estimates
>> >>> for enabling all of this? We had a meeting with Roman and David, and
>> >>> we really want to have live migration enabled in 4.0 (see #6 here:
>> >>>
>> >>> https://mirantis.jira.com/wiki/display/PRD/4.0+-+Mirantis+OpenStack+release+home+page)
>> >>>
>> >>> Thanks,
>> >>>
>> >>>
>> >>> On Wed, Nov 13, 2013 at 12:39 AM, Dmitry Borodaenko
>> >>> <dborodaenko@xxxxxxxxxxxx> wrote:
>> >>>>
>> >>>> Ephemeral storage in Ceph works with the patched Nova packages; we
>> >>>> can start updating our Ceph manifests as soon as we have the havana
>> >>>> branch merged into fuel master!
>> >>>>
>> >>>> ---------- Forwarded message ----------
>> >>>> From: Dmitry Borodaenko <dborodaenko@xxxxxxxxxxxx>
>> >>>> Date: Tue, Nov 12, 2013 at 12:38 PM
>> >>>> Subject: Re: Ephemeral RBD with Havana and Dumpling
>> >>>> To: ceph-users@xxxxxxxxxxxxxx
>> >>>>
>> >>>>
>> >>>> And to answer my own question, I was missing a meaningful error
>> >>>> message: what the ObjectNotFound exception I got from librados didn't
>> >>>> tell me was that I didn't have the images keyring file in /etc/ceph/
>> >>>> on my compute node. After running 'ceph auth get-or-create
>> >>>> client.images > /etc/ceph/ceph.client.images.keyring' and reverting
>> >>>> the images caps back to their original state, it all works!
>> >>>>
>> >>>> On Tue, Nov 12, 2013 at 12:19 PM, Dmitry Borodaenko
>> >>>> <dborodaenko@xxxxxxxxxxxx> wrote:
>> >>>> > I can get ephemeral storage for Nova to work with the RBD backend,
>> >>>> > but I don't understand why it only works with the admin cephx user.
>> >>>> > With a different user, starting a VM fails, even if I set its caps
>> >>>> > to 'allow *'.
>> >>>> >
>> >>>> > Here's what I have in nova.conf:
>> >>>> > libvirt_images_type=rbd
>> >>>> > libvirt_images_rbd_pool=images
>> >>>> > rbd_secret_uuid=fd9a11cc-6995-10d7-feb4-d338d73a4399
>> >>>> > rbd_user=images
>> >>>> >
>> >>>> > The secret UUID is defined following the same steps as for Cinder
>> >>>> > and
>> >>>> > Glance:
>> >>>> > http://ceph.com/docs/master/rbd/libvirt/
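>> >>>> >
>> >>>> > For reference, those steps boil down to roughly this (a sketch of
>> >>>> > the linked doc, not a verbatim copy; the UUID comes from the
>> >>>> > secret-define output):
>> >>>> >
>> >>>> > cat > secret.xml <<EOF
>> >>>> > <secret ephemeral='no' private='no'>
>> >>>> >   <usage type='ceph'><name>client.images secret</name></usage>
>> >>>> > </secret>
>> >>>> > EOF
>> >>>> > virsh secret-define --file secret.xml
>> >>>> > virsh secret-set-value --secret <uuid> \
>> >>>> >   --base64 $(ceph auth get-key client.images)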
>> >>>> >
>> >>>> > BTW, the rbd_user option doesn't seem to be documented anywhere; is
>> >>>> > that a documentation bug?
>> >>>> >
>> >>>> > And here's what 'ceph auth list' tells me about my cephx users:
>> >>>> >
>> >>>> > client.admin
>> >>>> >         key: AQCoSX1SmIo0AxAAnz3NffHCMZxyvpz65vgRDg==
>> >>>> >         caps: [mds] allow
>> >>>> >         caps: [mon] allow *
>> >>>> >         caps: [osd] allow *
>> >>>> > client.images
>> >>>> >         key: AQC1hYJS0LQhDhAAn51jxI2XhMaLDSmssKjK+g==
>> >>>> >         caps: [mds] allow
>> >>>> >         caps: [mon] allow *
>> >>>> >         caps: [osd] allow *
>> >>>> > client.volumes
>> >>>> >         key: AQALSn1ScKruMhAAeSETeatPLxTOVdMIt10uRg==
>> >>>> >         caps: [mon] allow r
>> >>>> >         caps: [osd] allow class-read object_prefix rbd_children,
>> >>>> >               allow rwx pool=volumes, allow rx pool=images
>> >>>> >
>> >>>> > Setting rbd_user to images or volumes doesn't work.
>> >>>> >
>> >>>> > What am I missing?
>> >>>> >
>> >>>> > Thanks,
>> >>>> >
>> >>>> > --
>> >>>> > Dmitry Borodaenko
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Dmitry Borodaenko
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Dmitry Borodaenko
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Mike Scherbakov
>> >
>> >
>> >
>> >
>> > --
>> > Mike Scherbakov
>>
>>
>>
>> --
>> Dmitry Borodaenko
>
>
>
>
> --
> Mike Scherbakov



-- 
Dmitry Borodaenko

