yahoo-eng-team team mailing list archive
Message #76911
[Bug 1580880] Re: [RFE] Distributed Portbinding for all port types
** Changed in: neutron
Assignee: (unassigned) => Miguel Lavalle (minsel)
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1580880
Title:
[RFE] Distributed Portbinding for all port types
Status in neutron:
Fix Released
Bug description:
Summary
=======
Today only DVR ports can be bound to multiple hosts. But binding a port to
multiple hosts also makes sense for a compute port during live migration.
For a certain period of time the port could be bound to the source and the
target at the same time (although only one binding is in use). The
information for both bindings needs to be settable and accessible from Nova
via the REST API.
Use Cases
=========
* Instance in error state when port binding fails
In the live migration process, port binding is triggered by Nova after
the migration has already succeeded. If port binding fails, the instance
is stuck in error state. If port binding for the target node were
done in pre_live_migration, the migration could be aborted on a binding
failure and the instance would still be active on the migration source
host. But we cannot simply do that, as some TOR mech drivers would shut
down the source port after the binding has been updated, although the
instance is still active on the source. If we could bind a compute
port to both hosts, such drivers could keep the source port open and
already process the target port in parallel.
* Live migration between hosts running different L2 agents
Another use case is live migration between hosts that run different L2
agents. This requires that Nova updates the instance definition before
the migration is executed (in the case of libvirt, update the domain.xml
with the target interface definition).
A specialized variant of this use case is the migration from an agent
with one firewall driver to another (e.g. from the OVS hybrid firewall
driver to the new OVS conntrack-based firewall driver).
* Live migration with the MacVTap agent when different physnet mappings
are used
The third use case is live migration with the MacVTap agent. Today it has
some restrictions with live migration in some special scenarios [1].
It requires an update of the instance definition (libvirt domain.xml)
before the migration starts.
To update the definition in time, a port binding for the migration
target node is required even before the migration starts. Following the
argument above, we need a compute port bound to multiple hosts.
Proposed Change
===============
* A refactoring of the database is required to make a normal port a
special case of a distributed port. This has been planned for a long time
but was never finished. The efforts are tracked via this bug [1]. The
patches still need to be rebased to get that going again.
* REST API changes are required to externalize the bindings. To avoid
overloading the port API, a new subresource "bindings" could be created
(like /ports/{port-id}/bindings) that holds the list of all bindings.
CREATE/DELETE/UPDATE must be supported. No UUID would be required
for this resource, as its identifier would be the host_id.
Nova Changes
============
* In pre_live_migration, Nova would add a new binding for the migration target host to the port, which triggers port binding in Neutron.
* Before migration starts, Nova would access the binding information for the target host. It would abort on a "binding_failed" vif type. Otherwise it would modify the instance definition (e.g. domain.xml) for the migration target with this binding information.
* After live migration has succeeded, Nova would remove the original port binding. On rollback, it would just remove the target port binding.
Those changes are tracked via the following Nova blueprint [4].
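The three steps above can be sketched as a small control flow. This is an illustration under stated assumptions, not Nova code: the function live_migrate, the exception MigrationAborted, and the bind and migrate callables are hypothetical, and bindings are reduced to a dict mapping host to vif type. Removing the failed target binding on abort is also an assumption; the description only says that Nova would abort.

```python
from typing import Callable, Dict

BINDING_FAILED = "binding_failed"  # vif type named in the description


class MigrationAborted(Exception):
    """Raised when the target binding fails before migration starts."""


def live_migrate(
    bindings: Dict[str, str],
    source: str,
    target: str,
    bind: Callable[[str], str],
    migrate: Callable[[str], bool],
) -> str:
    """Return the host the instance runs on after the attempt.

    bindings maps host -> vif type; bind(host) stands in for Neutron's
    port binding, migrate(vif_type) for the actual live migration.
    """
    # Step 1 (pre_live_migration): add a binding for the target host.
    bindings[target] = bind(target)

    # Step 2: abort early if the binding failed. The source binding is
    # untouched, so the instance stays active on the source host.
    if bindings[target] == BINDING_FAILED:
        del bindings[target]  # assumption: clean up the failed binding
        raise MigrationAborted(f"port binding failed on {target}")

    # Here Nova would update the instance definition (e.g. domain.xml)
    # with the target binding information before migrating.

    # Step 3: on success remove the original binding, on rollback
    # remove only the target binding.
    if migrate(bindings[target]):
        del bindings[source]
        return target
    del bindings[target]
    return source
```

In all three outcomes the port ends up with exactly one binding, and a failure never leaves the instance without a usable binding on the host where it is still running.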
Open Questions
==============
* This RFE is based on bug [1]. How should those dependencies be tracked? Or should the content of that bug become part of this effort?
* The same question applies to the macvtap live migration bug [3].
* How does this effort relate to the RFE for externalizing multi-segment networks [2]?
[1] https://bugs.launchpad.net/neutron/+bug/1367391
[2] https://bugs.launchpad.net/neutron/+bug/1573197
[3] https://bugs.launchpad.net/neutron/+bug/1550400
[4] https://blueprints.launchpad.net/nova/+spec/migration-use-target-vif
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1580880/+subscriptions