[Bug 1887405] Re: Race condition while processing security_groups_member_updated events (ipset)

The fix is available in the Ubuntu neutron packages in Hirsute, Groovy,
and Focal, and in the Ubuntu Cloud Archive for Victoria and Ussuri.
Marking them as Fix Released.

** Changed in: neutron (Ubuntu Hirsute)
       Status: New => Fix Released

** Changed in: neutron (Ubuntu Groovy)
       Status: New => Fix Released

** Changed in: neutron (Ubuntu Focal)
       Status: New => Fix Released

** Changed in: cloud-archive/victoria
       Status: New => Fix Released

** Changed in: cloud-archive/ussuri
       Status: New => Fix Released

** Changed in: cloud-archive
       Status: New => Fix Released

** Changed in: neutron (Ubuntu)
       Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1887405

Title:
  Race condition while processing security_groups_member_updated events
  (ipset)

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive ussuri series:
  Fix Released
Status in Ubuntu Cloud Archive victoria series:
  Fix Released
Status in neutron:
  Fix Released
Status in neutron package in Ubuntu:
  Fix Released
Status in neutron source package in Focal:
  Fix Released
Status in neutron source package in Groovy:
  Fix Released
Status in neutron source package in Hirsute:
  Fix Released

Bug description:
  # Summary

  Race condition while processing security_groups_member_updated events
  (ipset)

  # Overview

  We have a customer that uses heat templates to deploy large
  environments (e.g. 21 instances) with a significant number of security
  groups (e.g. 60) that use bi-directional remote group references for
  both ingress and egress filtering.  These heat stacks are deployed
  using a CI pipeline and intermittently suffer from application layer
  failures due to broken network connectivity.  We found that the ipsets
  used to implement remote_group memberships were missing IPs from their
  member lists.  Troubleshooting suggests a race condition, which I've
  attempted to describe in detail below.

  Version: `54e1a6b1bc378c0745afc03987d0fea241b826ae` (HEAD of
  stable/rocky as of Jan 26, 2020), though I suspect this issue persists
  through master.

  I'm working on getting some multi-node environments deployed (I don't
  think it's possible to reproduce this with a single hypervisor) and
  hope to provide reproduction steps on Rocky and master soon.  In the
  meantime, I wanted to submit this report as-is in the hope that an
  experienced Neutron dev can spot possible solutions or provide
  diagnostic insight that I cannot yet produce.

  I suspect this report may be easier to read with some markdown, so
  please feel free to read it in a gist:
  https://gist.github.com/cfarquhar/20fddf2000a83216021bd15b512f772b

  Also, this diagram is probably critical to following along:
  https://user-images.githubusercontent.com/1253665/87317744-0a75b180-c4ed-11ea-9bad-085019c0f954.png

  # Race condition symptoms

  Given the following security groups/rules:

  ```
  | secgroup name | secgroup id                          | direction | remote group                         | dest port |
  |---------------|--------------------------------------|-----------|--------------------------------------|-----------|
  | server        | fcd6cf12-2ac9-4704-9208-7c6cb83d1a71 | ingress   | b52c8c54-b97a-477d-8b68-f4075e7595d9 | 9092      |
  | client        | b52c8c54-b97a-477d-8b68-f4075e7595d9 | egress    | fcd6cf12-2ac9-4704-9208-7c6cb83d1a71 | 9092      |
  ```

  And the following instances:

  ```
  | instance name | hypervisor | ip          | secgroup assignment |
  |---------------|------------|-------------|---------------------|
  | server01      | compute01  | 192.168.0.1 | server              |
  | server02      | compute02  | 192.168.0.2 | server              |
  | server03      | compute03  | 192.168.0.3 | server              |
  | client01      | compute04  | 192.168.0.4 | client              |
  ```
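
  For completeness, the two security groups and their cross-referencing
  rules can be created with openstacksdk roughly as follows.  This is a
  sketch, not taken from the actual heat templates: the cloud name is a
  placeholder, and TCP is assumed since the table above only gives a
  destination port.

  ```python
  import openstack

  # 'mycloud' is a placeholder clouds.yaml entry.
  conn = openstack.connect(cloud='mycloud')

  server_sg = conn.network.create_security_group(name='server')
  client_sg = conn.network.create_security_group(name='client')

  # server: allow ingress on 9092 from members of the client group.
  conn.network.create_security_group_rule(
      security_group_id=server_sg.id, direction='ingress',
      protocol='tcp', port_range_min=9092, port_range_max=9092,
      remote_group_id=client_sg.id)

  # client: allow egress on 9092 to members of the server group.
  conn.network.create_security_group_rule(
      security_group_id=client_sg.id, direction='egress',
      protocol='tcp', port_range_min=9092, port_range_max=9092,
      remote_group_id=server_sg.id)
  ```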

  We would expect to find the following ipset representing the `server`
  security group members on `compute04`:

  ```
  # ipset list NIPv4fcd6cf12-2ac9-4704-9208-
  Name: NIPv4fcd6cf12-2ac9-4704-9208-
  Type: hash:net
  Revision: 6
  Header: family inet hashsize 1024 maxelem 65536
  Size in memory: 536
  References: 4
  Number of entries: 3
  Members:
  192.168.0.1
  192.168.0.2
  192.168.0.3
  ```

  What we actually get when the race condition is triggered is an
  incomplete list of members in the ipset.  The member list could
  contain anywhere between zero and two of the expected IPs.
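
  When checking a hypervisor for this state, it helps to diff the members
  Neutron should have programmed against what `ipset list` actually
  reports.  A hypothetical helper (the set name and expected IPs are
  taken from the tables above):

  ```python
  import subprocess

  def ipset_members(set_name):
      """Return the member entries of an ipset as a set of strings."""
      out = subprocess.run(['ipset', 'list', set_name],
                           capture_output=True, text=True,
                           check=True).stdout
      lines = out.splitlines()
      # Everything after the 'Members:' header is a member entry.
      return set(lines[lines.index('Members:') + 1:])

  expected = {'192.168.0.1', '192.168.0.2', '192.168.0.3'}
  actual = ipset_members('NIPv4fcd6cf12-2ac9-4704-9208-')
  print('missing members:', expected - actual or 'none')
  ```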

  # Triggering the race condition

  The problem occurs when `security_group_member_updated` events arrive between `port_update` steps 12 and 22 (see diagram and process details below).  
    - `port_update` step 12 retrieves the remote security groups' member lists, which are not necessarily complete yet.
    - `port_update` step 22 adds the port to `IptablesFirewallDriver.ports()`.

  This results in `security_group_member_updated` step 3 looking for the
  port to apply the updated member list to (in
  `IptablesFirewallDriver.ports()`) BEFORE it has been added by
  `port_update`'s step 22. This causes the membership update event to
  effectively be discarded.  We are then left with whatever the remote
  security group's member list was when the `port_update` process
  retrieved it at step 12.  This state persists until something triggers
  the port being re-added to the `updated_ports` list (e.g. agent
  restart, another remote group membership change, local security group
  addition/removal, etc).
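
  To make the window concrete, here is a minimal, self-contained sketch
  (hypothetical classes with the timing simulated explicitly, not Neutron
  code) of the lost update: the membership change arrives after the
  member snapshot is taken but before the port is registered, so it
  matches no known port and the stale snapshot survives.

  ```python
  class FakeFirewallDriver:
      """Stand-in for IptablesFirewallDriver with just the racing state."""

      def __init__(self):
          self.filtered_ports = {}  # populated by port_update step 22
          self.sg_members = {}      # remote group id -> member IPs ("ipset")

      def ports(self):
          return self.filtered_ports

      def update_security_group_members(self, sg_id, member_ips):
          self.sg_members[sg_id] = list(member_ips)

  def security_groups_member_updated(driver, sg_id, devices_to_refilter):
      # Step 3: only ports already in driver.ports() are considered
      # affected; a port not yet registered is silently skipped.
      for port in driver.ports().values():
          if sg_id in port['security_group_source_groups']:
              devices_to_refilter.add(port['device'])

  driver = FakeFirewallDriver()
  refilter = set()

  # port_update step 12: member list fetched while still incomplete.
  driver.update_security_group_members('server-sg', ['192.168.0.1'])

  # The member update arrives BEFORE step 22: the port is not visible
  # yet, so nothing is queued for refiltering and the event is lost.
  security_groups_member_updated(driver, 'server-sg', refilter)
  assert not refilter

  # port_update step 22: the port is finally registered, too late.
  driver.filtered_ports['port1'] = {
      'device': 'tap1', 'security_group_source_groups': ['server-sg']}

  print(driver.sg_members['server-sg'])  # stale: missing .2 and .3
  ```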

  
  # Race condition details

  The race condition occurs in the linuxbridge agent between the following two operations:
    1) Processing a `port_update` event when an instance is first created.
    2) Processing `security_group_member_updated` events for the instance's remote security groups.

  Either of these operations can result in creating or mutating an ipset
  via `IpsetManager.set_members()`.  The relevant control flow sequence
  for each operation is listed below, followed by a condensed sketch of
  the path.  I've left out any branches that did not seem to be relevant
  to the race condition.

  ## Processing a `port_update` event:
    1) We receive an RPC `port_update` event via `LinuxBridgeRpcCallbacks.port_update()`, which adds the tap device to the `LinuxBridgeRpcCallbacks.updated_devices` list.
    2) Sleep until the next `CommonAgentLoop.daemon_loop()` iteration.
    3) `CommonAgentLoop.daemon_loop()` calls `CommonAgentLoop.scan_devices()`
    4) `CommonAgentLoop.scan_devices()` calls `LinuxBridgeRpcCallbacks.get_and_clear_updated_devices()` to retrieve and clear the `updated_devices` list from step 1.
    5) `CommonAgentLoop.scan_devices()` performs some calculations and returns control to `CommonAgentLoop.daemon_loop()` along with a list of added or updated devices.
    6) `CommonAgentLoop.daemon_loop()` calls `CommonAgentLoop.process_network_devices()` when there are added or updated devices from step 5.
    7) `CommonAgentLoop.process_network_devices()` calls `SecurityGroupAgentRpc.setup_port_filters()` with the added or updated devices from step 5.
    8) `SecurityGroupAgentRpc.setup_port_filters()` checks for devices that were added to the `SecurityGroupAgentRpc.devices_to_refilter` list (which happens in step 4 of the `security_group_member_updated` process) and appends them to the list of updated devices.  This is where devices from the `security_group_member_updated` process would have been processed if they weren't lost due to the race condition.
    9) `SecurityGroupAgentRpc.setup_port_filters()` calls `SecurityGroupAgentRpc.prepare_devices_filter()` with the updated port IDs.
    10) `SecurityGroupAgentRpc.prepare_devices_filter()` calls `SecurityGroupAgentRpc._apply_port_filter()` with the updated port IDs.
    11) `SecurityGroupAgentRpc._apply_port_filter()` calls `SecurityGroupServerRpcApi.security_group_info_for_devices()` with the tap device ID.
    12) `SecurityGroupServerRpcApi.security_group_info_for_devices()` retrieves detailed information about the port via RPC.  The important detail here is a `sg_member_ids` dict which contains a list of member IPs for each remote security group applicable to the port.
    13) `SecurityGroupServerRpcApi.security_group_info_for_devices()` returns control to `SecurityGroupAgentRpc._apply_port_filter()` along with the port and security group details.
    14) `SecurityGroupAgentRpc._apply_port_filter()` calls `SecurityGroupAgentRpc._update_security_group_info()` with the remote security groups and their member IP list.
    15) `SecurityGroupAgentRpc._update_security_group_info()` iterates over the remote security groups and calls `IptablesFirewallDriver.update_security_group_members()` for each one.
    16) `IptablesFirewallDriver.update_security_group_members()` calls `IptablesFirewallDriver._update_ipset_members()`.
    17) `IptablesFirewallDriver._update_ipset_members()` calls `IpsetManager.set_members()`.
    18) `IpsetManager.set_members()` calls a number of methods to create or mutate the ipset using Linux's `ipset` commands.
    19) The stack unwinds back up to `SecurityGroupAgentRpc._apply_port_filter()` (last seen in step 11)
    20) `SecurityGroupAgentRpc._apply_port_filter()` calls `IptablesFirewallDriver.prepare_port_filter()` with the port
    21) `IptablesFirewallDriver.prepare_port_filter()` calls `IptablesFirewallDriver._set_ports()` with the port details retrieved in step 12
    22) `IptablesFirewallDriver._set_ports()` adds the port to `IptablesFirewallDriver.filtered_ports`, which indicates the port is known to exist on the hypervisor.  It will now be returned by `IptablesFirewallDriver.ports()`.
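
  As a condensed sketch of steps 8-22 (paraphrased from the walkthrough
  above, not the real method bodies), the window lies between taking the
  `sg_member_ids` snapshot and registering the port:

  ```python
  def setup_port_filters(agent, new_devices):
      # Step 8: merge in devices queued by security_group_member_updated.
      devices_to_update = agent.devices_to_refilter | new_devices
      agent.devices_to_refilter = set()
      prepare_devices_filter(agent, devices_to_update)

  def prepare_devices_filter(agent, device_ids):
      # Steps 11-13: a single RPC returns the port details plus
      # sg_member_ids, a point-in-time snapshot of each remote group's
      # member IPs.
      info = agent.plugin_rpc.security_group_info_for_devices(
          agent.context, list(device_ids))
      # Steps 14-18: ipsets are written from that (possibly stale) snapshot.
      for sg_id, member_ips in info['sg_member_ids'].items():
          agent.firewall.update_security_group_members(sg_id, member_ips)
      # Steps 20-22: only now does the port land in filtered_ports and
      # become visible to security_group_member_updated's step 3.
      for device in info['devices'].values():
          agent.firewall.prepare_port_filter(device)
  ```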

  ## Processing a `security_group_member_updated` event:
    1) We receive an RPC `security_group_member_updated` event via `SecurityGroupAgentRpc.security_groups_member_updated()`.
    2) `SecurityGroupAgentRpc.security_groups_member_updated()` calls `SecurityGroupAgentRpc._security_group_updated()` with the list of security groups whose membership was updated.
    3) `SecurityGroupAgentRpc._security_group_updated()` consults `IptablesFirewallDriver.ports()` to find ports affected by the membership updates.  In the race condition, `port_update` has not yet reached step 22 to add the port we need to find, so we stop here and the following step does not occur.
    4) `SecurityGroupAgentRpc._security_group_updated()` adds each affected port from the previous step to the `SecurityGroupAgentRpc.devices_to_refilter` list.  This list will be processed the next time we reach `port_update` step 8, but in the race condition we never get there.
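
  A condensed sketch of this path (again paraphrased, not verbatim):

  ```python
  def security_groups_member_updated(agent, security_groups):
      _security_group_updated(agent, security_groups,
                              'security_group_source_groups')

  def _security_group_updated(agent, security_groups, attribute):
      affected = []
      sec_grp_set = set(security_groups)
      # Step 3: scan only ports already registered in the firewall driver.
      # A port whose port_update has not reached step 22 is invisible here.
      for device in agent.firewall.ports().values():
          if sec_grp_set & set(device.get(attribute, [])):
              affected.append(device['device'])
      # Step 4: never reached for the racing port, so the membership
      # update is dropped until a later event re-queues the device.
      if affected:
          agent.devices_to_refilter |= set(affected)
  ```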

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1887405/+subscriptions

