yahoo-eng-team team mailing list archive
Message #50763
[Bug 1580880] [NEW] [RFE] Distributed Portbinding for all port types
Public bug reported:
Summary
=======
Today only DVR ports can be bound to multiple hosts. Binding a port to
multiple hosts also makes sense for a compute port during live migration:
for a certain period of time the port could be bound to the source and the
target at the same time (although only one binding is in use). The
information of both bindings needs to be accessible from Nova via the REST
API.
Use Cases
=========
* Instance in error state when port binding fails
In the live migration process, port binding is triggered by Nova after
the migration has already succeeded. If port binding fails, the instance is
stuck in error state. If port binding for the target node were done in
pre_live_migration, the migration could be aborted on a binding failure and
the instance would still be active on the migration source host. But we
cannot simply do that, because some ToR mech drivers would shut down the
source port once the binding has been updated, although the instance is
still active on the source. If we could bind a compute port to both hosts,
such drivers could keep the source port open and process the target port in
parallel.
* Live migration between hosts running different L2 agents
Another use case is live migration between hosts that run different L2
agents. This requires Nova to update the instance definition before the
migration is executed (in the case of libvirt, to update the domain.xml with
the target interface definition).
A specialized variant of this use case is the migration from an agent with
one firewall driver to another (e.g. from the ovs hybrid-fw driver to the
new ovs conntrack-based firewall driver).
* Live migration with MacVTap agent when different physnet mappings are
used
The third use case is live migration with the MacVTap agent. Today it has
some restrictions with live migration in some special scenarios [3]. It
requires an update of the instance definition (libvirt domain.xml) before
the migration starts.
To update the definition in time, a port binding for the migration target
node is required even before the migration starts. Following the reasoning
above, we need a compute port bound to multiple hosts.
Proposed Change
===============
* A refactoring of the database is required to make a normal port a
special case of a distributed port. This has been planned for a long time
but was never finished. The efforts are tracked via this bug [1]. The
patches still need to be rebased to get that going again.
* REST API changes are required to externalize the bindings. To avoid
overloading the port API, a new subresource "bindings" could be created
(like /ports/{port-id}/bindings) that holds the list of all bindings.
CREATE/DELETE/UPDATE must be supported. No UUID would be required for this
resource, as its identifier would be the host_id.
Nova Changes
============
* In pre_live_migration, Nova would add a new binding for the migration
target host to the port, which triggers port binding in Neutron.
* Before the migration starts, Nova would access the binding information for
the target host. It would abort on the "binding_failed" vif type. Otherwise
it would modify the instance definition (e.g. domain.xml) for the migration
target with this binding information.
* After the live migration has succeeded, Nova would remove the original
port binding. On rollback, it would just remove the target port binding.
Those changes are tracked via the following Nova blueprint [4].
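The sequence above can be sketched as follows. This is a minimal
illustration under stated assumptions, not Nova or Neutron code: FakeNeutron,
live_migrate, BindingFailed and the "ovs" vif type are hypothetical names
chosen for the example, and removing the target binding again when the
binding fails is an assumption (the proposal only says that Nova aborts).

```python
class BindingFailed(Exception):
    """Raised when the target binding ends up as 'binding_failed'."""


class FakeNeutron:
    """Hypothetical in-memory Neutron: port_id -> {host_id: vif_type}."""

    def __init__(self, failing_hosts=()):
        self.bindings = {}
        self.failing_hosts = set(failing_hosts)

    def add_binding(self, port_id, host_id):
        # Adding a binding triggers port binding; simulate the result.
        vif_type = "binding_failed" if host_id in self.failing_hosts else "ovs"
        self.bindings.setdefault(port_id, {})[host_id] = vif_type
        return vif_type

    def remove_binding(self, port_id, host_id):
        self.bindings[port_id].pop(host_id, None)


def live_migrate(neutron, port_id, source, target, migrate):
    # pre_live_migration: bind the port to the target host as well.
    vif_type = neutron.add_binding(port_id, target)
    if vif_type == "binding_failed":
        # Abort before migrating; the source binding is left untouched.
        neutron.remove_binding(port_id, target)  # assumption: clean up
        raise BindingFailed(f"port {port_id} could not be bound to {target}")
    try:
        # The caller updates the instance definition (e.g. domain.xml)
        # from the target binding and runs the actual migration.
        migrate(vif_type)
    except Exception:
        # Rollback: only the target binding is removed.
        neutron.remove_binding(port_id, target)
        raise
    # Success: the original (source) binding is no longer needed.
    neutron.remove_binding(port_id, source)
```

In this sketch a binding failure or a failed migration leaves the port bound
to the source host only, so the instance can keep running there.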
Open Questions
==============
* This RFE is based on bug [1]. How should those dependencies be tracked? Or
should the content of that bug become part of this effort?
* The same question applies to the MacVTap live migration bug [3].
* How does this effort relate to the RFE for externalizing multi-segment
networks [2]?
[1] https://bugs.launchpad.net/neutron/+bug/1367391
[2] https://bugs.launchpad.net/neutron/+bug/1573197
[3] https://bugs.launchpad.net/neutron/+bug/1550400
[4] https://blueprints.launchpad.net/nova/+spec/migration-use-target-vif
** Affects: neutron
Importance: Undecided
Status: New
** Tags: rfe
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1580880
Title:
[RFE] Distributed Portbinding for all port types
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1580880/+subscriptions