yahoo-eng-team team mailing list archive

[Bug 1734320] Re: Eavesdropping private traffic

 

** Changed in: neutron
       Status: Fix Committed => Confirmed

** Changed in: neutron
       Status: Confirmed => New

** Changed in: nova
       Status: In Progress => Fix Released

** Changed in: neutron
       Status: New => Incomplete

** Changed in: neutron
       Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1734320

Title:
  Eavesdropping private traffic

Status in neutron:
  Confirmed
Status in OpenStack Compute (nova):
  Fix Released
Status in os-vif:
  Fix Released
Status in OpenStack Security Advisory:
  Won't Fix

Bug description:
  Eavesdropping private traffic
  =============================

  Abstract
  --------

  We've discovered a security issue that allows end users within their
  own private network to receive from, and send traffic to, other
  private networks on the same compute node.

  Description
  -----------

  During live migration there is a small time window in which the ports
  of instances are untagged. During that window an instance has a port
  trunked to the integration bridge and receives 802.1Q-tagged private
  traffic from other tenants.

  If the port is administratively down during live migration, the port
  will remain in trunk mode indefinitely.

  Traffic is also possible between ports that are administratively
  down, even across different tenants' self-service networks.

  Conditions
  ----------

  The following conditions are necessary.

  * Open vSwitch self-service networks
  * An OpenStack administrator or an automated process needs to schedule a live migration

  We tested this on Newton.

  Issues
  ------

  This outcome is the result of multiple independent issues. We will
  list the most important first, and follow with bugs that create a
  fragile situation.

  Issue #1 Initially creating a trunk port

  When the port is initially created, it is in trunk mode. This creates a fail-open situation.
  See: https://github.com/openstack/os-vif/blob/newton-eol/vif_plug_ovs/linux_net.py#L52
  Recommendation: create ports in the port_dead state, don't leave it dangling in trunk-mode. Add a drop-flow initially.

  Issue #2 Order of creation.

  The instance is actually migrated before the (networking)
  configuration is completed.

  Recommendation: do not finish the live migration until the
  underlying configuration has been applied completely.

  Issue #3 Not closing the port when it is down.

  Neutron calls the port_dead function to ensure the port is down. It
  sets the tag to 4095 and adds a "drop" flow if (and only if) there is
  already another tag on the port. The port_dead function will keep
  untagged ports untagged.

  https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995

  Recommendation: Make port_dead also shut the port if no tag is found.
  Log a warning if this happens.
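
  The difference between the current check and the recommended one can
  be sketched in isolation. This is a minimal illustration, not the
  neutron code: the two function names are hypothetical, and the tag is
  modelled as a plain value where None or an empty list stands for "no
  tag set on the port".

```python
DEAD_VLAN_TAG = 4095  # the "dead vlan" tag named in this report


def needs_dead_vlan_original(cur_tag):
    """Check as described above: a port without a tag is left alone."""
    return bool(cur_tag) and cur_tag != DEAD_VLAN_TAG


def needs_dead_vlan_recommended(cur_tag):
    """Recommended check: anything not already on the dead VLAN is closed."""
    return cur_tag != DEAD_VLAN_TAG


# None and [] both model an untagged port (an assumption of this sketch).
for tag in (None, [], 10, DEAD_VLAN_TAG):
    print(repr(tag),
          needs_dead_vlan_original(tag),
          needs_dead_vlan_recommended(tag))
```

  For an untagged port the first function returns False (the port stays
  open) while the second returns True (the port is moved to the dead
  VLAN).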

  Issue #4 Putting the port administratively down actually puts the port
  on a compute node shared vlan

  Instances from different projects on different private networks can
  talk to each other if they put their ports down. The code does install
  an openflow "drop" rule but it has a lower priority (2) than the allow
  rules.

  Recommendation: Increase the port_dead openflow drop rule priority to
  MAX
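
  Why the priority matters can be shown with a toy model of flow
  lookup, in which the matching flow with the highest priority decides
  the action. This is only a sketch: the Flow type, the lookup helper,
  the allow priority of 3 and the port number 7 are assumptions made
  for illustration. The report only states that the drop rule has
  priority 2 and that the allow rules rank higher.

```python
from typing import NamedTuple, Optional


class Flow(NamedTuple):
    priority: int
    in_port: Optional[int]  # None matches traffic from any port
    action: str


def lookup(flows, in_port):
    """Return the action of the highest-priority flow matching in_port."""
    matching = [f for f in flows if f.in_port is None or f.in_port == in_port]
    if not matching:
        return "drop"  # no match; this toy model simply drops
    return max(matching, key=lambda f: f.priority).action


ALLOW_PRIORITY = 3  # hypothetical value, only known to be higher than 2
DEAD_PORT = 7       # hypothetical OpenFlow port number

allow = Flow(ALLOW_PRIORITY, None, "normal")
weak_drop = Flow(2, DEAD_PORT, "drop")
strong_drop = Flow(65535, DEAD_PORT, "drop")

print(lookup([allow, weak_drop], DEAD_PORT))    # normal
print(lookup([allow, strong_drop], DEAD_PORT))  # drop
```

  With the drop rule at priority 2 the allow rule wins and traffic is
  forwarded; at 65535 the drop rule wins.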

  Timeline
  --------

   2017-09-14 Discovery eavesdropping issue
   2017-09-15 Verify workaround.
   2017-10-04 Discovery port-down-traffic issue
   2017-11-24 Vendor Disclosure to Openstack

  Steps to reproduce
  ------------------

  1. Attach an instance to two networks:

  admin$ openstack server create --nic net-id=<net-uuid1> \
      --nic net-id=<net-uuid2> --image <image_id> --flavor <flavor_id> instance_temp

  2. Attach a FIP to the instance to be able to log in to this instance

  3. Verify:

  admin$ openstack server show -c name -c addresses fe28a2ee-098f-4425-9d3c-8e2cd383547d

  +-----------+-----------------------------------------------------------------------------+
  | Field     | Value                                                                       |
  +-----------+-----------------------------------------------------------------------------+
  | addresses | network1=192.168.99.8, <FIP>; network2=192.168.80.14                        |
  | name      | instance_temp                                                               |
  +-----------+-----------------------------------------------------------------------------+

  4. SSH to the instance via network1 and run tcpdump on the other
  port (network2)

  [root@instance_temp]$ tcpdump -eeenni eth1

  5. Get port-id of network2

  admin$ nova interface-list fe28a2ee-098f-4425-9d3c-8e2cd383547d
  +------------+--------------------------------------+--------------------------------------+---------------+-------------------+
  | Port State | Port ID                              | Net ID                               | IP addresses  | MAC Addr          |
  +------------+--------------------------------------+--------------------------------------+---------------+-------------------+
  | ACTIVE     | a848520b-0814-4030-bb48-49e4b5cf8160 | d69028f7-9558-4f14-8ce6-29cb8f1c19cd | 192.168.80.14 | fa:16:3e:2d:8b:7b |
  | ACTIVE     | fad148ca-cf7a-4839-aac3-a2cd8d1d2260 | d22c22ae-0a42-4e3b-8144-f28534c3439a | 192.168.99.8  | fa:16:3e:60:2c:fa |
  +------------+--------------------------------------+--------------------------------------+---------------+-------------------+

  6. Force port down on network 2

  admin$ neutron port-update a848520b-0814-4030-bb48-49e4b5cf8160 --admin-state-up False

  7. Port gets tagged with vlan 4095, the dead vlan tag, which is
  normal:

  compute1# grep a848520b-0814-4030-bb48-49e4b5cf8160 /var/log/neutron/neutron-openvswitch-agent.log | tail -1
  INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-e008feb3-8a35-4c97-adac-b48ff88165b2 - - - - -] VIF port: a848520b-0814-4030-bb48-49e4b5cf8160 admin state up disabled, putting on the dead VLAN

  8. Verify the port is tagged with vlan 4095

  compute1# ovs-vsctl show | grep -A3 qvoa848520b-08
        Port "qvoa848520b-08"
            tag: 4095
            Interface "qvoa848520b-08"

  9. Now live-migrate the instance:

  admin$ nova live-migration fe28a2ee-098f-4425-9d3c-8e2cd383547d

  10. Verify the tag is gone on compute2, and take a deep breath

  compute2# ovs-vsctl show | grep -A3 qvoa848520b-08
        Port "qvoa848520b-08"
            Interface "qvoa848520b-08"
        Port...
  compute2# echo "Wut!"
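
  The manual check in steps 8 and 10 could be automated by scanning the
  text printed by "ovs-vsctl show" for instance ports that carry no
  tag. The following is a rough sketch: the function name and the
  "qvo" prefix filter are assumptions based on the port names shown
  above, and it only understands the Port and tag lines seen in this
  report.

```python
import re

PORT_RE = re.compile(r'Port "?([^"\s]+)"?$')
TAG_RE = re.compile(r"tag: (\d+)$")


def untagged_ports(show_output, prefix="qvo"):
    """Return names of ports starting with prefix that have no tag line."""
    tags = {}
    current = None
    for line in show_output.splitlines():
        stripped = line.strip()
        port = PORT_RE.match(stripped)
        if port:
            current = port.group(1)
            tags[current] = None  # no tag seen yet for this port
            continue
        tag = TAG_RE.match(stripped)
        if tag and current is not None:
            tags[current] = int(tag.group(1))
    return sorted(n for n, t in tags.items()
                  if t is None and n.startswith(prefix))


before = '''
        Port "qvoa848520b-08"
            tag: 4095
            Interface "qvoa848520b-08"
'''
after = '''
        Port "qvoa848520b-08"
            Interface "qvoa848520b-08"
'''
print(untagged_ports(before))  # []
print(untagged_ports(after))   # ['qvoa848520b-08']
```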

  11. Now traffic of all other self-service networks present on compute2
  can be sniffed from instance_temp

  [root@instance_temp]$ tcpdump -eenni eth1
  13:14:31.748266 fa:16:3e:6a:17:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.152, length 28
  13:14:31.804573 fa:16:3e:e8:a2:d2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.70, length 28
  13:14:31.810482 fa:16:3e:95:ca:3a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.154, length 28
  13:14:31.977820 fa:16:3e:6f:f4:9b > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.150, length 28
  13:14:31.979590 fa:16:3e:0f:3d:cc > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 9, p 0, ethertype ARP, Request who-has 10.103.9.163 tell 10.103.9.1, length 28
  13:14:32.048082 fa:16:3e:65:64:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.101, length 28
  13:14:32.127400 fa:16:3e:30:cb:b5 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.165, length 28
  13:14:32.141982 fa:16:3e:96:cd:b0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.100, length 28
  13:14:32.205327 fa:16:3e:a2:0b:76 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.153, length 28
  13:14:32.444142 fa:16:3e:1f:db:ed > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 72, p 0, ethertype IPv4, 192.168.99.212 > 224.0.0.18: VRRPv2, Advertisement, vrid 50, prio 103, authtype none, intvl 1s, length 20
  13:14:32.449497 fa:16:3e:1c:24:c0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.20, length 28
  13:14:32.476015 fa:16:3e:f2:3b:97 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.22, length 28
  13:14:32.575034 fa:16:3e:44:fe:35 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.163, length 28
  13:14:32.676185 fa:16:3e:1e:92:d7 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.150, length 28
  13:14:32.711755 fa:16:3e:99:6c:c8 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 62: vlan 10, p 0, ethertype IPv4, 10.103.12.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 49, authtype simple, intvl 1s, length 24
  13:14:32.711773 fa:16:3e:f5:23:d5 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 12, p 0, ethertype IPv4, 10.103.15.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 49, authtype simple, intvl 1s, length 20
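
  To summarise which VLANs leak into the instance, the capture above
  can be reduced to a count of frames per VLAN ID. This is a small
  sketch that assumes the tcpdump line format shown in this report; the
  function name is hypothetical and the sample lines are copied from
  the capture.

```python
import re
from collections import Counter

# Matches the 802.1Q part of a tcpdump -e line as printed above.
VLAN_RE = re.compile(r"ethertype 802\.1Q \(0x8100\), length \d+: vlan (\d+),")


def vlan_counts(lines):
    """Count captured frames per VLAN ID; lines without a tag are ignored."""
    counts = Counter()
    for line in lines:
        match = VLAN_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "13:14:31.748266 fa:16:3e:6a:17:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q "
    "(0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has "
    "10.103.12.160 tell 10.103.12.152, length 28",
    "13:14:31.804573 fa:16:3e:e8:a2:d2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q "
    "(0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has "
    "10.0.1.9 tell 10.0.1.70, length 28",
    "13:14:31.810482 fa:16:3e:95:ca:3a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q "
    "(0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has "
    "10.0.1.9 tell 10.0.1.154, length 28",
]

for vlan, frames in sorted(vlan_counts(sample).items()):
    print("vlan %d: %d frame(s)" % (vlan, frames))
```

  Any VLAN ID showing up here other than the instance's own network
  indicates traffic from another self-service network.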

  Workaround
  ----------

  We temporarily fixed this issue by forcing the dead VLAN tag on port
  creation on compute nodes:

  /usr/lib/python2.7/site-packages/vif_plug_ovs/linux_net.py:

  def _create_ovs_vif_cmd(bridge, dev, iface_id, mac,
                          instance_id, interface_type=None,
                          vhost_server_path=None):
  +   # ODCN: initialize port as dead
  +   # ODCN: TODO: set drop flow
      cmd = ['--', '--if-exists', 'del-port', dev, '--',
              'add-port', bridge, dev,
  +           'tag=4095',
              '--', 'set', 'Interface', dev,
              'external-ids:iface-id=%s' % iface_id,
              'external-ids:iface-status=active',
              'external-ids:attached-mac=%s' % mac,
              'external-ids:vm-uuid=%s' % instance_id]
      if interface_type:
          cmd += ['type=%s' % interface_type]
      if vhost_server_path:
          cmd += ['options:vhost-server-path=%s' % vhost_server_path]
      return cmd

  https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995

      def port_dead(self, port, log_errors=True):
          '''Once a port has no binding, put it on the "dead vlan".

          :param port: an ovs_lib.VifPort object.
          '''
          # Don't kill a port if it's already dead
          cur_tag = self.int_br.db_get_val("Port", port.port_name, "tag",
                                           log_errors=log_errors)
  +       # ODCN GM 20170915
  +       if not cur_tag:
  +           LOG.error('port_dead(): port %s has no tag', port.port_name)
  +       # ODCN AJS 20170915
  +       if not cur_tag or cur_tag != constants.DEAD_VLAN_TAG:
  -       if cur_tag and cur_tag != constants.DEAD_VLAN_TAG:
              LOG.info('port_dead(): put port %s on dead vlan', port.port_name)
              self.int_br.set_db_attribute("Port", port.port_name, "tag",
                                           constants.DEAD_VLAN_TAG,
                                           log_errors=log_errors)
              self.int_br.drop_port(in_port=port.ofport)

  plugins/ml2/drivers/openvswitch/agent/openflow/ovs_ofctl/ovs_bridge.py
      def drop_port(self, in_port):
  +       # ODCN AJS 20171004:
  -       self.install_drop(priority=2, in_port=in_port)
  +       self.install_drop(priority=65535, in_port=in_port)

  Regards,

  ODC Noord.
  Gerhard Muntingh
  Albert Siersema
  Paul Peereboom

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1734320/+subscriptions