yahoo-eng-team team mailing list archive
Message #76520
[Bug 1734320] Re: Eavesdropping private traffic
** Changed in: os-vif
Status: Fix Committed => Fix Released
** Changed in: neutron
Status: In Progress => Fix Committed
** Changed in: nova
Status: In Progress => Won't Fix
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1734320
Title:
Eavesdropping private traffic
Status in neutron:
Fix Committed
Status in OpenStack Compute (nova):
Won't Fix
Status in os-vif:
Fix Released
Status in OpenStack Security Advisory:
Won't Fix
Bug description:
Eavesdropping private traffic
=============================
Abstract
--------
We've discovered a security issue that allows end users within their
own private network to receive from, and send traffic to, other
private networks on the same compute node.
Description
-----------
During live migration there is a short time window in which the ports
of the migrating instance are untagged. In that window the instance
has a port trunked to the integration bridge and receives 802.1Q-tagged
private traffic from other tenants.
If the port is administratively down during live migration, the port
will remain in trunk mode indefinitely.
Traffic is possible between ports that are administratively down,
even between different tenants' self-service networks.
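The fail-open behaviour follows from how Open vSwitch treats a port whose `tag` column is empty: with default settings such a port acts as a trunk carrying every VLAN. The sketch below is a simplified model of that rule, not OVS code. It ignores OpenFlow rules entirely, and the function name `port_receives` is made up for illustration.

```python
# Simplified model of Open vSwitch VLAN port behaviour (not OVS code).
# Assumption: default vlan_mode and no explicit "trunks" list on the port.

DEAD_VLAN_TAG = 4095  # the tag Neutron uses for the "dead VLAN"


def port_receives(port_tag, frame_vlan):
    """Return True if a flooded frame on frame_vlan reaches the port.

    port_tag is None for an untagged (trunk) port, otherwise the
    access VLAN configured on the port.
    """
    if port_tag is None:
        # No tag: the port trunks all VLANs and sees every tenant's frames.
        return True
    # Access port: only frames on its own VLAN are delivered.
    return port_tag == frame_vlan


print(port_receives(None, 33))           # True
print(port_receives(10, 33))             # False
print(port_receives(DEAD_VLAN_TAG, 33))  # False
```

In this model two ports that both carry tag 4095 can still reach each other, which matches the shared dead VLAN described under Issue #4.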
Conditions
----------
The following conditions are necessary.
* Open vSwitch self-service networks
* An OpenStack administrator or an automated process needs to schedule a live migration
We tested this on Newton.
Issues
------
This outcome is the result of multiple independent issues. We will
list the most important first, and follow with bugs that create a
fragile situation.
Issue #1 Initially creating a trunk port
When the port is initially created, it is in trunk mode. This creates a fail-open situation.
See: https://github.com/openstack/os-vif/blob/newton-eol/vif_plug_ovs/linux_net.py#L52
Recommendation: create ports in the port_dead state instead of leaving them dangling in trunk mode, and add a drop flow initially.
Issue #2 Order of creation.
The instance is actually migrated before the (networking)
configuration is completed.
Recommendation: do not finish the live migration until the
underlying configuration has been applied completely.
Issue #3 Not closing the port when it is down.
Neutron calls the port_dead function to ensure the port is down. It
sets the tag to 4095 and adds a "drop" flow if (and only if) there is
already another tag on the port. The port_dead function will keep
untagged ports untagged.
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995
Recommendation: Make port_dead also shut the port if no tag is found.
Log a warning if this happens.
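The difference between the current and the recommended condition can be shown with a small standalone sketch. It only models the decision, not the agent code. The helper names `should_kill_current` and `should_kill_recommended` are hypothetical, and representing an untagged port as `None` or an empty list is an assumption about what the tag lookup returns.

```python
DEAD_VLAN_TAG = 4095


def should_kill_current(cur_tag):
    """Condition as described for Newton: act only if some other tag is set."""
    return bool(cur_tag) and cur_tag != DEAD_VLAN_TAG


def should_kill_recommended(cur_tag):
    """Recommended condition: also act when the port has no tag at all."""
    return not cur_tag or cur_tag != DEAD_VLAN_TAG


# Assumption: an untagged port shows up as None or an empty list.
for tag in (None, [], 10, DEAD_VLAN_TAG):
    print(tag, should_kill_current(tag), should_kill_recommended(tag))
```

Only the untagged cases differ: the current condition leaves the port open, the recommended one moves it to the dead VLAN.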
Issue #4 Putting the port administratively down actually puts the port
on a compute node shared vlan
Instances from different projects on different private networks can
talk to each other if they put their ports down. The code does install
an openflow "drop" rule but it has a lower priority (2) than the allow
rules.
Recommendation: Increase the port_dead openflow drop rule priority to
MAX
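Why the priority matters can be illustrated with a toy flow table in which the highest-priority matching rule wins, as in OpenFlow. This is a sketch under stated assumptions: the dictionary layout, the `select_flow` helper, the port number 5 and the allow-rule priority of 10 are all hypothetical. Only the drop priorities 2 and 65535 come from this report.

```python
def select_flow(flows, in_port):
    """Return the highest-priority flow that matches in_port, or None.

    A flow with in_port=None is treated as matching any port.
    """
    matching = [f for f in flows if f["in_port"] in (None, in_port)]
    if not matching:
        return None
    return max(matching, key=lambda f: f["priority"])


# Hypothetical allow rule; its real priority is not stated in this report.
allow = {"priority": 10, "in_port": None, "action": "normal"}

low_drop = {"priority": 2, "in_port": 5, "action": "drop"}
high_drop = {"priority": 65535, "in_port": 5, "action": "drop"}

print(select_flow([allow, low_drop], in_port=5)["action"])   # normal
print(select_flow([allow, high_drop], in_port=5)["action"])  # drop
```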
Timeline
--------
2017-09-14 Discovery eavesdropping issue
2017-09-15 Verify workaround.
2017-10-04 Discovery port-down-traffic issue
2017-11-24 Vendor Disclosure to OpenStack
Steps to reproduce
------------------
1. Attach an instance to two networks:
admin$ openstack server create --nic net-id=<net-uuid1> --nic net-id=<net-uuid2> --image <image_id> --flavor <flavor_id> instance_temp
2. Attach a FIP to the instance to be able to log in to this instance
3. Verify:
admin$ openstack server show -c name -c addresses fe28a2ee-098f-4425-9d3c-8e2cd383547d
+-----------+-----------------------------------------------------------------------------+
| Field | Value |
+-----------+-----------------------------------------------------------------------------+
| addresses | network1=192.168.99.8, <FIP>; network2=192.168.80.14 |
| name | instance_temp |
+-----------+-----------------------------------------------------------------------------+
4. Ssh to the instance using network1 and run a tcpdump on the other
port network2
[root@instance_temp]$ tcpdump -eeenni eth1
5. Get port-id of network2
admin$ nova interface-list fe28a2ee-098f-4425-9d3c-8e2cd383547d
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+
| Port State | Port ID | Net ID | IP addresses | MAC Addr |
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+
| ACTIVE | a848520b-0814-4030-bb48-49e4b5cf8160 | d69028f7-9558-4f14-8ce6-29cb8f1c19cd | 192.168.80.14 | fa:16:3e:2d:8b:7b |
| ACTIVE | fad148ca-cf7a-4839-aac3-a2cd8d1d2260 | d22c22ae-0a42-4e3b-8144-f28534c3439a | 192.168.99.8 | fa:16:3e:60:2c:fa |
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+
6. Force port down on network 2
admin$ neutron port-update a848520b-0814-4030-bb48-49e4b5cf8160 --admin-state-up False
7. Port gets tagged with vlan 4095, the dead vlan tag, which is
normal:
compute1# grep a848520b-0814-4030-bb48-49e4b5cf8160 /var/log/neutron/neutron-openvswitch-agent.log | tail -1
INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-e008feb3-8a35-4c97-adac-b48ff88165b2 - - - - -] VIF port: a848520b-0814-4030-bb48-49e4b5cf8160 admin state up disabled, putting on the dead VLAN
8. Verify the port is tagged with vlan 4095
compute1# ovs-vsctl show | grep -A3 qvoa848520b-08
Port "qvoa848520b-08"
tag: 4095
Interface "qvoa848520b-08"
9. Now live-migrate the instance:
admin# nova live-migration fe28a2ee-098f-4425-9d3c-8e2cd383547d
10. Verify the tag is gone on compute2, and take a deep breath
compute2# ovs-vsctl show | grep -A3 qvoa848520b-08
Port "qvoa848520b-08"
Interface "qvoa848520b-08"
Port...
compute2# echo "Wut!"
11. Now traffic of all other self-service networks present on compute2
can be sniffed from instance_temp
[root@instance_temp]$ tcpdump -eenni eth1
13:14:31.748266 fa:16:3e:6a:17:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.152, length 28
13:14:31.804573 fa:16:3e:e8:a2:d2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.70, length 28
13:14:31.810482 fa:16:3e:95:ca:3a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.154, length 28
13:14:31.977820 fa:16:3e:6f:f4:9b > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.150, length 28
13:14:31.979590 fa:16:3e:0f:3d:cc > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 9, p 0, ethertype ARP, Request who-has 10.103.9.163 tell 10.103.9.1, length 28
13:14:32.048082 fa:16:3e:65:64:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.101, length 28
13:14:32.127400 fa:16:3e:30:cb:b5 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.165, length 28
13:14:32.141982 fa:16:3e:96:cd:b0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.100, length 28
13:14:32.205327 fa:16:3e:a2:0b:76 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.153, length 28
13:14:32.444142 fa:16:3e:1f:db:ed > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 72, p 0, ethertype IPv4, 192.168.99.212 > 224.0.0.18: VRRPv2, Advertisement, vrid 50, prio 103, authtype none, intvl 1s, length 20
13:14:32.449497 fa:16:3e:1c:24:c0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.20, length 28
13:14:32.476015 fa:16:3e:f2:3b:97 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.22, length 28
13:14:32.575034 fa:16:3e:44:fe:35 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.163, length 28
13:14:32.676185 fa:16:3e:1e:92:d7 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.150, length 28
13:14:32.711755 fa:16:3e:99:6c:c8 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 62: vlan 10, p 0, ethertype IPv4, 10.103.12.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 49, authtype simple, intvl 1s, length 24
13:14:32.711773 fa:16:3e:f5:23:d5 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 12, p 0, ethertype IPv4, 10.103.15.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 49, authtype simple, intvl 1s, length 20
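The manual check in steps 8 and 10 can be automated by scanning `ovs-vsctl show` output for instance ports that have no `tag:` line. The following is a minimal sketch, not a supported tool. The function name `untagged_ports` is made up, and filtering on the `qvo` and `tap` name prefixes is an assumption so that legitimately untagged ports such as patch ports are not reported. The second port and its tag in the sample are invented for the example.

```python
import re

PORT_RE = re.compile(r'^\s*Port\s+"?([^"\s]+)"?\s*$')
TAG_RE = re.compile(r'^\s*tag:\s*(\d+)\s*$')


def untagged_ports(show_output, prefixes=("qvo", "tap")):
    """Return names of instance-facing ports that have no VLAN tag.

    show_output is the text printed by `ovs-vsctl show`.
    prefixes is an assumed naming convention for instance ports.
    """
    tags = {}       # port name -> tag, None when no "tag:" line was seen
    current = None
    for line in show_output.splitlines():
        port = PORT_RE.match(line)
        if port:
            current = port.group(1)
            tags[current] = None
            continue
        tag = TAG_RE.match(line)
        if tag and current is not None:
            tags[current] = int(tag.group(1))
    return [name for name, tag in tags.items()
            if tag is None and name.startswith(prefixes)]


sample = '''
    Bridge br-int
        Port "qvoa848520b-08"
            Interface "qvoa848520b-08"
        Port "qvofad148ca-cf"
            tag: 5
            Interface "qvofad148ca-cf"
        Port patch-tun
            Interface patch-tun
'''
print(untagged_ports(sample))  # ['qvoa848520b-08']
```

On a compute node the input could be obtained with `subprocess.check_output(["ovs-vsctl", "show"])` decoded to text. Any port the function returns deserves a closer look.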
Workaround
----------
We temporarily fixed this issue by forcing the dead VLAN tag on port
creation on compute nodes:
/usr/lib/python2.7/site-packages/vif_plug_ovs/linux_net.py:
 def _create_ovs_vif_cmd(bridge, dev, iface_id, mac,
                         instance_id, interface_type=None,
                         vhost_server_path=None):
+    # ODCN: initialize port as dead
+    # ODCN: TODO: set drop flow
     cmd = ['--', '--if-exists', 'del-port', dev, '--',
            'add-port', bridge, dev,
+           'tag=4095',
            '--', 'set', 'Interface', dev,
            'external-ids:iface-id=%s' % iface_id,
            'external-ids:iface-status=active',
            'external-ids:attached-mac=%s' % mac,
            'external-ids:vm-uuid=%s' % instance_id]
     if interface_type:
         cmd += ['type=%s' % interface_type]
     if vhost_server_path:
         cmd += ['options:vhost-server-path=%s' % vhost_server_path]
     return cmd
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995
     def port_dead(self, port, log_errors=True):
         '''Once a port has no binding, put it on the "dead vlan".
         :param port: an ovs_lib.VifPort object.
         '''
         # Don't kill a port if it's already dead
         cur_tag = self.int_br.db_get_val("Port", port.port_name, "tag",
                                          log_errors=log_errors)
+        # ODCN GM 20170915
+        if not cur_tag:
+            LOG.error('port_dead(): port %s has no tag', port.port_name)
+        # ODCN AJS 20170915
+        if not cur_tag or cur_tag != constants.DEAD_VLAN_TAG:
-        if cur_tag and cur_tag != constants.DEAD_VLAN_TAG:
             LOG.info('port_dead(): put port %s on dead vlan', port.port_name)
             self.int_br.set_db_attribute("Port", port.port_name, "tag",
                                          constants.DEAD_VLAN_TAG,
                                          log_errors=log_errors)
             self.int_br.drop_port(in_port=port.ofport)
plugins/ml2/drivers/openvswitch/agent/openflow/ovs_ofctl/ovs_bridge.py
     def drop_port(self, in_port):
+        # ODCN AJS 20171004:
-        self.install_drop(priority=2, in_port=in_port)
+        self.install_drop(priority=65535, in_port=in_port)
Regards,
ODC Noord.
Gerhard Muntingh
Albert Siersema
Paul Peereboom
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1734320/+subscriptions