Message #61817
[Bug 1667755] [NEW] Default scope rules added to router may drop traffic unexpectedly
Public bug reported:
Release: OpenStack-Ansible 13.3.4 (Mitaka)
Scenario:
Each Neutron router is connected to a single provider network and a
single tenant network. Floating IPs are *not* used, and SNAT is disabled
on the router:
+-------------------------+----------------------------------------------------------+
| Field                   | Value                                                    |
+-------------------------+----------------------------------------------------------+
| admin_state_up          | True                                                     |
| availability_zone_hints |                                                          |
| availability_zones      | nova                                                     |
| description             |                                                          |
| distributed             | False                                                    |
| external_gateway_info   | {"network_id": "ce830329-4133-41fe-868f-698cc761e247",   |
|                         |  "enable_snat": false,                                   |
|                         |  "external_fixed_ips": [                                 |
|                         |    {"subnet_id": "cf34a5c3-5d26-449f-b22e-2e3fdd69f262", |
|                         |     "ip_address": "10.152.114.39"}]}                     |
| ha                      | False                                                    |
| id                      | c965e7a1-98c0-4d5e-8dcb-cfafc2667ee1                     |
| name                    | RTR                                                      |
| routes                  |                                                          |
| status                  | ACTIVE                                                   |
| tenant_id               | 2ed1712187674c64acae83948e5b1928                         |
+-------------------------+----------------------------------------------------------+
Upstream static routes (not BGP yet) direct tenant network traffic to
the qg interface of the routers.
In some cases, inbound/outbound traffic is dropped inside the Neutron
qrouter namespace. Comparing the iptables rules with those of a working
router shows the following differences:
Working router:
*mangle
-A neutron-l3-agent-scope -i qr-3dd65e85-f2 -j MARK --set-xmark 0x4010000/0xffff0000
-A neutron-l3-agent-scope -i qg-2f55db22-5b -j MARK --set-xmark 0x4010000/0xffff0000
*filter
-A neutron-l3-agent-scope -o qr-3dd65e85-f2 -m mark ! --mark 0x4010000/0xffff0000 -j DROP
-A neutron-l3-agent-scope -o qg-2f55db22-5b -m mark ! --mark 0x4010000/0xffff0000 -j DROP
Non-working router:
*mangle
-A neutron-l3-agent-scope -i qg-e3f65cf1-29 -j MARK --set-xmark 0x4010000/0xffff0000
-A neutron-l3-agent-scope -i qr-125a3dc5-e3 -j MARK --set-xmark 0x4000000/0xffff0000
*filter
-A neutron-l3-agent-scope -o qg-e3f65cf1-29 -m mark ! --mark 0x4010000/0xffff0000 -j DROP
-A neutron-l3-agent-scope -o qr-125a3dc5-e3 -m mark ! --mark 0x4000000/0xffff0000 -j DROP
Our working theory is that the marks on the non-working router are
inconsistent: traffic entering on the qg interface is marked 0x0401
(0x4010000 under mask 0xffff0000), while the egress filter on the qr
interface only accepts 0x0400 (0x4000000), so the packet is dropped. We
tested this theory by swapping the marks on those two filter rules,
after which inbound/outbound traffic worked properly.
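The theory above can be checked with a small model of the four rules
per router. This is only a sketch, not Neutron code: it assumes these
rules are the only ones that touch the packet mark, and the names
`set_xmark`, `is_forwarded`, `WORKING` and `NON_WORKING` are made up for
illustration. The rule values are copied from the dumps above.

```python
# Illustrative model of the quoted rule sets; not taken from Neutron.
MASK = 0xFFFF0000

def set_xmark(mark, value, mask):
    """iptables --set-xmark value/mask: clear the masked bits, then XOR in value."""
    return (mark & ~mask & 0xFFFFFFFF) ^ value

def is_forwarded(router, in_if, out_if):
    """True if a packet entering in_if survives the egress filter on out_if."""
    mark = set_xmark(0, router["mangle"][in_if], MASK)
    # Filter rule is "! --mark value/mask -j DROP":
    # the packet is dropped unless its masked mark equals value.
    return (mark & MASK) == router["filter"][out_if]

WORKING = {
    "mangle": {"qr-3dd65e85-f2": 0x4010000, "qg-2f55db22-5b": 0x4010000},
    "filter": {"qr-3dd65e85-f2": 0x4010000, "qg-2f55db22-5b": 0x4010000},
}
NON_WORKING = {
    "mangle": {"qg-e3f65cf1-29": 0x4010000, "qr-125a3dc5-e3": 0x4000000},
    "filter": {"qg-e3f65cf1-29": 0x4010000, "qr-125a3dc5-e3": 0x4000000},
}

# Inbound on the working router: qg -> qr
print(is_forwarded(WORKING, "qg-2f55db22-5b", "qr-3dd65e85-f2"))
# Inbound and outbound on the non-working router
print(is_forwarded(NON_WORKING, "qg-e3f65cf1-29", "qr-125a3dc5-e3"))
print(is_forwarded(NON_WORKING, "qr-125a3dc5-e3", "qg-e3f65cf1-29"))
```

Under these assumptions the model forwards traffic on the working
router and drops it in both directions on the non-working one, which
matches what we observe.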
On the working router, both mangle rules set the same mark (0x0401), so
the filter rules pass the traffic.
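To spot affected routers quickly, the ingress marks can be pulled out
of the mangle rules and compared. The following is a rough sketch that
parses rule lines in the format shown above; `ingress_marks` and
`has_mismatch` are hypothetical helper names, and the check only makes
sense for routers like ours, where all interfaces are expected to carry
the same mark.

```python
import re

# Matches the mangle rules quoted in this report, e.g.
# -A neutron-l3-agent-scope -i qg-... -j MARK --set-xmark 0x4010000/0xffff0000
RULE = re.compile(
    r"-A neutron-l3-agent-scope -i (\S+) -j MARK "
    r"--set-xmark (0x[0-9a-fA-F]+)/(0x[0-9a-fA-F]+)"
)

def ingress_marks(lines):
    """Map each ingress interface to its mark, shifted down to the masked field."""
    marks = {}
    for line in lines:
        match = RULE.search(line)
        if not match:
            continue
        iface = match.group(1)
        value = int(match.group(2), 16)
        mask = int(match.group(3), 16)
        # Position of the lowest set bit in the mask (16 for 0xffff0000).
        shift = (mask & -mask).bit_length() - 1
        marks[iface] = (value & mask) >> shift
    return marks

def has_mismatch(lines):
    """True if the interfaces do not all receive the same ingress mark."""
    return len(set(ingress_marks(lines).values())) > 1

non_working = [
    "-A neutron-l3-agent-scope -i qg-e3f65cf1-29 -j MARK --set-xmark 0x4010000/0xffff0000",
    "-A neutron-l3-agent-scope -i qr-125a3dc5-e3 -j MARK --set-xmark 0x4000000/0xffff0000",
]
for iface, mark in ingress_marks(non_working).items():
    print(iface, hex(mark))
print("mismatch:", has_mismatch(non_working))
```

The rule lines would come from the mangle table of the qrouter
namespace; how they are collected is left out here.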
We do not yet know how the mark is determined. We can reproduce the
issue on new routers in this environment, but not in other
environments.
Please let us know if you need any additional info.
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1667755
Title:
Default scope rules added to router may drop traffic unexpectedly
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1667755/+subscriptions