yahoo-eng-team team mailing list archive

[Bug 1823818] Re: Memory leak in some neutron agents

 

It looks like this is really the same issue as in https://bugzilla.redhat.com/show_bug.cgi?id=1667007, so it's not directly an issue in neutron but in openvswitch.
I will then mark it as invalid for neutron, but feel free to change that if it turns out to be a different issue.

** Tags added: ovs

** Bug watch added: Red Hat Bugzilla #1667007
   https://bugzilla.redhat.com/show_bug.cgi?id=1667007

** Changed in: neutron
       Status: New => Invalid

** Changed in: neutron
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1823818

Title:
  Memory leak in some neutron agents

Status in kolla:
  Confirmed
Status in neutron:
  Invalid

Bug description:
  We have an OpenStack deployment using the Rocky release. We have seen
  a memory leak issue in some neutron agents twice in our environment
  since it was first deployed this January.

  Below are some of the commands we ran to identify the issue and their
  corresponding output:

  This was on one of the compute nodes:
  -----------------------------------------------
  [root@c1s4 ~]# ps aux --sort -rss|head -n1
  USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
  42435    48229  3.5 73.1 98841060 96323252 pts/13 S+ 2018 1881:25 /usr/bin/python2 /usr/bin/neutron-openvswitch-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini
  -----------------------------------------------

  And this was on one of the controller nodes:
  -----------------------------------------------
  [root@r1 neutron]# ps aux --sort -rss|head
  USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
  42435    30940  3.1 48.6 68596320 64144784 pts/37 S+ Jan08 588:26 /usr/bin/python2 /usr/bin/neutron-lbaasv2-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/lbaas_agent.ini --config-file /etc/neutron/neutron_lbaas.conf
  42435    20902  2.8 26.1 36055484 34408952 pts/35 S+ Jan08 525:12 /usr/bin/python2 /usr/bin/neutron-dhcp-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/dhcp_agent.ini
  42434    34199  7.1  6.0 39420516 8033480 pts/11 Sl+ 2018 3620:08 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql/ --plugin-dir=/usr/lib64/mysql/plugin --wsrep_provider=/usr/lib64/galera/libgalera_smm.so --wsrep_on=ON --log-error=/var/log/kolla/mariadb/mariadb.log --pid-file=/var/lib/mysql/mariadb.pid --port=3306 --wsrep_start_position=0809f452-0251-11e9-8e60-6ad108d9be7b:0
  42435     8327  2.6  2.2 3546004 3001772 pts/10 S+  Jan17 152:04 /usr/bin/python2 /usr/bin/neutron-l3-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/neutron_vpnaas.conf --config-file /etc/neutron/l3_agent.ini --config-file /etc/neutron/fwaas_driver.ini
  42435    40171  2.6  2.1 3893480 2840852 pts/19 S+  Jan16 190:54 /usr/bin/python2 /usr/bin/neutron-openvswitch-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini
  root     42430  3.1  0.3 4412216 495492 pts/29 SLl+ Jan16 231:20 /usr/sbin/ovs-vswitchd unix:/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --log-file=/var/log/kolla/openvswitch/ovs-vswitchd.log
  ---------------------------------------------

  When it happened, we saw a lot of 'OSError: [Errno 12] Cannot allocate
  memory' ERROR entries in different neutron-* logs, because there was
  no free memory left. However, we don't know yet what triggered the
  memory leak.
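
  As a rough way to see when and where these failures started, something
  like the command below can count the allocation errors per agent log
  (the log path is an assumption based on the default kolla log layout;
  adjust it to your deployment):
  -----------------------------------------------
  # count 'Cannot allocate memory' errors in each neutron agent log
  grep -c "Cannot allocate memory" /var/log/kolla/neutron/*.log
  -----------------------------------------------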

  Here is our globals.yml:
  ---------------------------------------------
  [root@r1 kolla]# cat globals.yml |grep -v "^#"|tr -s "\n"
  ---
  openstack_release: "rocky"
  kolla_internal_vip_address: "172.21.69.22"
  enable_barbican: "yes"
  enable_ceph: "yes"
  enable_ceph_mds: "yes"
  enable_ceph_rgw: "yes"
  enable_cinder: "yes"
  enable_neutron_lbaas: "yes"
  enable_neutron_fwaas: "yes"
  enable_neutron_agent_ha: "yes"
  enable_ceph_rgw_keystone: "yes"
  ceph_pool_pg_num: 16
  ceph_pool_pgp_num: 16
  ceph_osd_store_type: "xfs"
  glance_backend_ceph: "yes"
  glance_backend_file: "no"
  glance_enable_rolling_upgrade: "no"
  ironic_dnsmasq_dhcp_range:
  tempest_image_id:
  tempest_flavor_ref_id:
  tempest_public_network_id:
  tempest_floating_network_name:
  -----------------------------------------------

  
  I did some searching on Google and found that this OVS bug looks
  highly related: https://bugzilla.redhat.com/show_bug.cgi?id=1667007

  I am not sure whether the fix has been included in the latest Rocky
  kolla images.
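
  One quick way to check (the container name and package query below are
  assumptions based on a CentOS-based kolla image; they may differ for
  other distros) is to list the openvswitch packages shipped in the
  agent container and compare the version with the one fixed in the
  Bugzilla above:
  -----------------------------------------------
  # list openvswitch packages inside the neutron OVS agent container
  docker exec neutron_openvswitch_agent rpm -qa | grep -i openvswitch
  -----------------------------------------------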

  
  Best regards,

  Lei

To manage notifications about this bug go to:
https://bugs.launchpad.net/kolla/+bug/1823818/+subscriptions