yahoo-eng-team team mailing list archive

[Bug 1907175] [NEW] intermittently ALL VM's floating IP connection is disconnected, and can be reconnected after 5-6 minutes

 

Public bug reported:

Current configuration: 49 nodes, CentOS 7.8 (kernel version = 3.10.0-1127.el7.x86_64)
                       kolla-ansible 9.2.1 (openvswitch 2.12.0 / neutron-server 15.1.0)

Phenomenon: Floating IP connectivity is lost intermittently and recovers after 5-6 minutes. It affects all VMs on the nodes.
Internal IP connectivity is not lost, and restarting openvswitch_vswitchd while the failure is occurring resolves the problem.
The public network, physnet1 (172.29.75.0~172.29.84.0), is a VLAN network carried over an LACP bond (bond mode 4), and the tenant network uses VXLAN. DVR is in use.
In a ping test captured with tcpdump, the ping reaches the node hosting the VM, but the VM does not respond.
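
To pin down the 5-6 minute window more precisely, the outage periods could be computed from timestamped ping results. The following is only a minimal sketch in Python, assuming probes are recorded as (timestamp in seconds, reply received) pairs; the function name `outage_windows` and the sample data are hypothetical and are not taken from the affected environment.

```python
from typing import Iterable, List, Tuple


def outage_windows(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (start, end) pairs for each run of failed probes.

    start is the timestamp of the first failed probe in a run;
    end is the timestamp of the first successful probe after it.
    A run still failing at the end of the data is closed at the last timestamp.
    """
    windows = []
    start = None
    last_ts = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts  # a new outage begins
        elif ok and start is not None:
            windows.append((start, ts))  # the outage ends at the first reply
            start = None
        last_ts = ts
    if start is not None:
        windows.append((start, last_ts))
    return windows


# Hypothetical data: one probe every 60 s, replies lost from t=120 until t=480.
samples = [(t, not (120 <= t < 480)) for t in range(0, 601, 60)]
for start, end in outage_windows(samples):
    print(f"outage from {start}s to {end}s ({(end - start) / 60:.0f} min)")
```

Running the same calculation against probes to a floating IP and to the matching internal IP would show whether the two paths really fail independently, as described above.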

=== ml2_conf.ini ============================================================
[root@2020c5lut006 neutron-server]# cat ml2_conf.ini
[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch,baremetal,l2population
extension_drivers = qos,port_security
path_mtu = 9000

[ml2_type_vlan]
network_vlan_ranges = physnet1

[ml2_type_flat]
flat_networks = *

[ml2_type_vxlan]
vni_ranges = 1:1000

[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

[agent]
tunnel_types = vxlan
l2_population = true
arp_responder = true
enable_distributed_routing = True
extensions = qos

[ovs]
bridge_mappings = physnet1:br-ex,physnet2:br-cephfs,physnet3:br-api
datapath_type = system
ovsdb_connection = tcp:127.0.0.1:6640
local_ip = 20.21.2.101
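
Because the floating IP path relies on physnet1 being mapped to a bridge, one basic sanity check is that every physical network named in network_vlan_ranges also appears in bridge_mappings. The sketch below uses Python's standard configparser on an excerpt of the file above; the helper name `unmapped_physnets` is hypothetical, and the check is offered only as a way to rule out a mapping mismatch, not as a diagnosis of this bug.

```python
import configparser

# Excerpt of the ml2_conf.ini shown above.
ML2_EXCERPT = """
[ml2_type_vlan]
network_vlan_ranges = physnet1

[ovs]
bridge_mappings = physnet1:br-ex,physnet2:br-cephfs,physnet3:br-api
"""


def unmapped_physnets(ini_text: str) -> set:
    """Return VLAN physical networks that have no entry in bridge_mappings."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    ranges = parser.get("ml2_type_vlan", "network_vlan_ranges", fallback="")
    mappings = parser.get("ovs", "bridge_mappings", fallback="")
    # Entries look like "physnet" or "physnet:min:max"; keep only the name.
    vlan_physnets = {e.split(":")[0].strip() for e in ranges.split(",") if e.strip()}
    # Entries look like "physnet:bridge"; keep only the name.
    mapped = {e.split(":")[0].strip() for e in mappings.split(",") if e.strip()}
    return vlan_physnets - mapped


print(unmapped_physnets(ML2_EXCERPT) or "all VLAN physnets are mapped to a bridge")
```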

=== neutron.conf ============================================================
[root@2020c5lut006 neutron-server]# cat neutron.conf
[DEFAULT]
debug = False
log_dir = /var/log/kolla/neutron
use_stderr = False
bind_host = 20.21.1.101
bind_port = 9696
api_paste_config = /usr/share/neutron/api-paste.ini
endpoint_type = internalURL
api_workers = 5
metadata_workers = 5
rpc_workers = 3
rpc_state_report_workers = 3
metadata_proxy_socket = /var/lib/neutron/kolla/metadata_proxy
interface_driver = openvswitch
allow_overlapping_ips = true
core_plugin = ml2
service_plugins = firewall_v2,qos,router
dhcp_agents_per_network = 2
l3_ha = true
max_l3_agents_per_router = 3
transport_url = rabbit://openstack:mMZl0hvZ5KSGQgfqtAbbBRkpMfEbzIKjDUHu8NSd@20.21.1.101:5672,openstack:mMZl0hvZ5KSGQgfqtAbbBRkpMfEbzIKjDUHu8NSd@20.21.1.102:5672,openstack:mMZl0hvZ5KSGQgfqtAbbBRkpMfEbzIKjDUHu8NSd@20.21.1.103:5672//
router_distributed = True
ipam_driver = internal
global_physnet_mtu = 9000

[nova]
auth_url = http://20.21.1.100:35357
auth_type = password
project_domain_id = default
user_domain_id = default
region_name = RegionOne
project_name = service
username = nova
password = rChwHtVHMqLK3AHRkKfZ7rxiQ74Am8EJHWvbEyQt
endpoint_type = internal

[oslo_middleware]
enable_proxy_headers_parsing = True

[oslo_concurrency]
lock_path = /var/lib/neutron/tmp

[agent]
root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf

[database]
connection = mysql+pymysql://neutron:PZl2BQm7LesapA6Ks9lqOuUc6DU4kRHeSWwPNvH1@20.21.1.100:3306/neutron
max_retries = -1

[keystone_authtoken]
www_authenticate_uri = http://20.21.1.100:5000
auth_url = http://20.21.1.100:35357
auth_type = password
project_domain_id = default
user_domain_id = default
project_name = service
username = neutron
password = XjxBaFwek0aaKj0rLaqeUXqfp7lrNk5sdkIFGAeE
memcache_security_strategy = ENCRYPT
memcache_secret_key = w6eOcER3TlZzidSL7wjea2rnbMWGUlV7BiO3ls3J
memcached_servers = 20.21.1.101:11211,20.21.1.102:11211,20.21.1.103:11211

[oslo_messaging_notifications]
transport_url = rabbit://openstack:mMZl0hvZ5KSGQgfqtAbbBRkpMfEbzIKjDUHu8NSd@20.21.1.101:5672,openstack:mMZl0hvZ5KSGQgfqtAbbBRkpMfEbzIKjDUHu8NSd@20.21.1.102:5672,openstack:mMZl0hvZ5KSGQgfqtAbbBRkpMfEbzIKjDUHu8NSd@20.21.1.103:5672//
driver = noop

[octavia]
base_url = http://20.21.1.100:9876

[placement]
auth_type = password
auth_url = http://20.21.1.100:35357
username = placement
password = s1VxNvJeh8CDOjeqa6hi8eF0QhQdDBp12SJdyfll
user_domain_name = Default
project_name = service
project_domain_name = Default
os_region_name = RegionOne
os_interface = internal

[privsep]
helper_command = sudo neutron-rootwrap /etc/neutron/rootwrap.conf privsep-helper

====================================================================

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1907175

Title:
   intermittently ALL VM's floating IP connection is disconnected, and
  can be reconnected after 5-6 minutes

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1907175/+subscriptions