yahoo-eng-team team mailing list archive

[Bug 1493341] [NEW] l2 pop failed if live-migrate a VM with multiple neutron-server workers

 

Public bug reported:

If neutron-server runs with two or more workers, or two neutron-server nodes sit behind a
load balancer, live-migrating a VM can cause l2 pop to fail (not always). The reason is:
1. When nova finishes live-migrating a VM, it updates the port's host ID to the destination host.
2. One neutron-server worker receives this request and runs l2 pop. It sees that the port's host ID
   has changed while the status is still ACTIVE, so it records the port in its own memory.
3. When the l2 agent scans this port, it updates the port's status from ACTIVE -> BUILD -> ACTIVE.
   If a different neutron-server worker receives this RPC request, it has no record of the
   migrated port, so l2 pop fails for this port.


    def update_port_postcommit(self, context):
        ...
        if port['device_owner'] == const.DEVICE_OWNER_DVR_INTERFACE:
            if context.status == const.PORT_STATUS_ACTIVE:
                self._update_port_up(context)
            if context.status == const.PORT_STATUS_DOWN:
                agent_host = context.host
                fdb_entries = self._get_agent_fdb(
                        context, port, agent_host)
                self.L2populationAgentNotify.remove_fdb_entries(
                    self.rpc_ctx, fdb_entries)
        elif (context.host != context.original_host
            and context.status == const.PORT_STATUS_ACTIVE
            and not self.migrated_ports.get(orig['id'])):
            # The port has been migrated. We have to store the original
            # binding to send appropriate fdb once the port will be set
            # on the destination host
            self.migrated_ports[orig['id']] = (
                (orig, context.original_host))
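
The race described in steps 2 and 3 can be reproduced with a toy model. This is only a sketch: the Worker class and its method names are hypothetical and are not Neutron code. It assumes, following the comment in the excerpt above, that the stored original binding is looked up again when the port becomes ACTIVE on the destination host.

```python
# Toy model of the race. Worker, on_host_changed and on_port_active are
# hypothetical names for illustration; they are not part of Neutron.


class Worker:
    """Stands in for one neutron-server worker process."""

    def __init__(self, name):
        self.name = name
        # Per-process state, like migrated_ports in the excerpt above.
        # Nothing here is shared with other workers.
        self.migrated_ports = {}

    def on_host_changed(self, port_id, original_host):
        # Step 2: the host changed while the port is still ACTIVE, so
        # remember the original binding in this process only.
        self.migrated_ports.setdefault(port_id, original_host)

    def on_port_active(self, port_id):
        # Step 3: the agent reports the port ACTIVE again. Return the
        # original host recorded for the port, or None if this worker
        # never saw the host change.
        return self.migrated_ports.pop(port_id, None)


worker_a = Worker("a")
worker_b = Worker("b")

# nova's port update is handled by worker A.
worker_a.on_host_changed("port-1", "compute-1")

# Failing case: the agent's status update is handled by worker B.
lost = worker_b.on_port_active("port-1")

# Working case: the same worker handles both events.
found = worker_a.on_port_active("port-1")

print(lost, found)
```

In this model worker B has nothing recorded for the port, which matches the intermittent failure: whether l2 pop works depends on which worker happens to receive the second request.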

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1493341

Title:
  l2 pop failed if live-migrate a VM with multiple neutron-server
  workers

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1493341/+subscriptions