← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1664294] [NEW] Netlink solution not enough mature for Ocata

 

Public bug reported:

neutron-fwaas replaced conntrack-tools by netlink + pyroute2[1] in order
to delete faster conntrack entries when the rules associated to a
firewall (through a policy) are updated.


This change highlights many warnings:
1) the code readability should be improved:
** it embeds dead code/unused constants[2]
** it uses non-meaningful variable names[3] ("h", "ct")
** it uses hardcoded values instead of existing constants[3]
** it miss docstrings/comments for tricky parts[2][3]

2) the change tricky part[3] is untested

3) the code seems unhealthy, 
** nothing ensures that file descriptors/objects are correctly closed/destroyed if an exception is raised[3]
** pyroute2.netns.setns use[3] seems not (green)thread-safe. It seems a (green)thread can move the process in $netns2 when a concurrent greenthread is "working" in $netns1. It can be solved by a lock.

4) the change should highlight if and why nfct and libc C libraries used
in the change are eventlet-friendly.

5) the change uses pyroute2.netns.setns in [3] which moves the current process to a specific netns
** it seems that when we enter the netns, the process seems to never go back in the root netns[9].  Is it an acceptable limitation?
** When the process is in a netns, every network action is done in the netns including rpc calls, remote sylog which won't work because they should be done in the root namespace.


1) => we can live on the short term with it (easy to address)
2) => we can address it (costly)
3) => we can address it (easy to solve)
4) => i don't know how to get an answer to this question
5) => it seems to imply that we have to dedicate specific(s) process(es) moving between netns to list/kill/flush_entries like as it's done in oslo.rootwrap daemon mode (very costly by required)


These warnings are based on my understanding of the netlink feature, perhaps i miss something.


[1] https://review.openstack.org/389654
[2] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/common/netlink_constants.py
[3] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/netlink_lib.py
[9] Code used to test pyroute2:
import os, pyroute2, pyroute2.netns

root_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()}
fd = pyroute2.netns.setns('taz')
ns_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()}
os.close(fd)
last_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()}

if root_ifs == ns_ifs:
    print 'Add a new interface in root netns with ip tuntap a mode tap'
elif last_ifs == root_ifs:
    print 'OK: we ended in root netns'
elif root_ifs != last_ifs == ns_ifs:
    print 'KO: we ended in taz netns'
else:
    print 'Something is wrong'

** Affects: neutron
     Importance: Undecided
     Assignee: Cedric Brandily (cbrandily)
         Status: New


** Tags: fwaas ocata-rc-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1664294

Title:
  Netlink solution not enough mature for Ocata

Status in neutron:
  New

Bug description:
  neutron-fwaas replaced conntrack-tools by netlink + pyroute2[1] in
  order to delete faster conntrack entries when the rules associated to
  a firewall (through a policy) are updated.

  
  This change highlights many warnings:
  1) the code readability should be improved:
  ** it embeds dead code/unused constants[2]
  ** it uses non-meaningful variable names[3] ("h", "ct")
  ** it uses hardcoded values instead of existing constants[3]
  ** it miss docstrings/comments for tricky parts[2][3]

  2) the change tricky part[3] is untested

  3) the code seems unhealthy, 
  ** nothing ensures that file descriptors/objects are correctly closed/destroyed if an exception is raised[3]
  ** pyroute2.netns.setns use[3] seems not (green)thread-safe. It seems a (green)thread can move the process in $netns2 when a concurrent greenthread is "working" in $netns1. It can be solved by a lock.

  4) the change should highlight if and why nfct and libc C libraries
  used in the change are eventlet-friendly.

  5) the change uses pyroute2.netns.setns in [3] which moves the current process to a specific netns
  ** it seems that when we enter the netns, the process seems to never go back in the root netns[9].  Is it an acceptable limitation?
  ** When the process is in a netns, every network action is done in the netns including rpc calls, remote sylog which won't work because they should be done in the root namespace.

  
  1) => we can live on the short term with it (easy to address)
  2) => we can address it (costly)
  3) => we can address it (easy to solve)
  4) => i don't know how to get an answer to this question
  5) => it seems to imply that we have to dedicate specific(s) process(es) moving between netns to list/kill/flush_entries like as it's done in oslo.rootwrap daemon mode (very costly by required)

  
  These warnings are based on my understanding of the netlink feature, perhaps i miss something.


  
  [1] https://review.openstack.org/389654
  [2] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/common/netlink_constants.py
  [3] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/netlink_lib.py
  [9] Code used to test pyroute2:
  import os, pyroute2, pyroute2.netns

  root_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()}
  fd = pyroute2.netns.setns('taz')
  ns_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()}
  os.close(fd)
  last_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()}

  if root_ifs == ns_ifs:
      print 'Add a new interface in root netns with ip tuntap a mode tap'
  elif last_ifs == root_ifs:
      print 'OK: we ended in root netns'
  elif root_ifs != last_ifs == ns_ifs:
      print 'KO: we ended in taz netns'
  else:
      print 'Something is wrong'

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1664294/+subscriptions


Follow ups