
kernel-packages team mailing list archive

[Bug 1486670] Re: using ipsec, many connections result in no buffer space error

 

Short summary:

ipsec uses a struct dst_ops object per net-namespace (e.g. per
container), but does not correctly initialize each dst_ops object's
percpu counter.  This results in incorrect values for each net
namespace's dst_ops counter.

Full details:

ipsec uses xfrm objects, which contain dst objects.  The number of dst
objects is tracked by a percpu counter in the xfrm dst_ops struct.
ipsec creates a dst_ops object for every net namespace, not just the
main net namespace: a dst_ops template is created, and its contents are
copied to the dst_ops object of each new net namespace.  However, ipsec
initializes the percpu counter only once, for the template.

A percpu counter object has a main counter variable and a pointer to
the percpu counter variables.  Each percpu variable only accumulates up
to a small "batch" size (32 or so), at which point its count is moved
to the main counter variable.  Because the copied dst_ops objects each
have their own main counter variable but all share the same percpu
counter variables, the count accumulated in a percpu variable can be
moved to the main counter of a different net namespace than the one
that generated it.

The result is that one net namespace (i.e. container) may have its xfrm
count decrease below 0, while another net namespace may have its xfrm
count increase forever and eventually cause complete ipsec failure once
the xfrm4_gc_thresh limit is exceeded.
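
The drift described above can be reproduced with a small toy model.
This is only a sketch in Python, not the kernel code: the class
PercpuCounter, the helpers copy_from_template and run_cycles, the
single simulated CPU, and the exact batch size of 32 are all
assumptions made for illustration.

```python
BATCH = 32  # assumed batch size; the report only says "32 or so"


class PercpuCounter:
    """Toy model: one main count plus a list of per-CPU deltas."""

    def __init__(self, ncpus):
        self.count = 0
        self.percpu = [0] * ncpus

    def add(self, cpu, amount):
        # Accumulate locally; fold into the main count once the local
        # delta reaches +/- BATCH.
        self.percpu[cpu] += amount
        if abs(self.percpu[cpu]) >= BATCH:
            self.count += self.percpu[cpu]
            self.percpu[cpu] = 0


def copy_from_template(template, reinit):
    """Model copying the template for a new net namespace (hypothetical helper)."""
    ns = PercpuCounter(len(template.percpu))
    if not reinit:
        # Buggy case: the "pointer" is copied, so the per-CPU list is shared.
        ns.percpu = template.percpu
    return ns


def run_cycles(n, reinit):
    """Run n allocate/free cycles in two namespaces on one simulated CPU."""
    template = PercpuCounter(ncpus=1)
    ns_a = copy_from_template(template, reinit)
    ns_b = copy_from_template(template, reinit)
    for _ in range(n):
        for _ in range(BATCH - 1):
            ns_a.add(0, 1)      # A allocates 31 entries
        ns_b.add(0, 1)          # B allocates 1 entry
        ns_b.add(0, -1)         # B frees its entry
        for _ in range(BATCH - 1):
            ns_a.add(0, -1)     # A frees all of its entries
    return ns_a.count, ns_b.count


print(run_cycles(3, reinit=False))  # shared per-CPU deltas: (-96, 96)
print(run_cycles(3, reinit=True))   # per-namespace deltas:  (0, 0)
```

Every entry is freed in each cycle, yet with shared per-CPU deltas
namespace B's main count grows by 32 per cycle while namespace A's
falls by 32.  With a separately initialized counter per namespace both
counts stay at 0.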

https://bugs.launchpad.net/bugs/1486670

Title:
  using ipsec, many connections result in no buffer space error

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Precise:
  In Progress
Status in linux source package in Trusty:
  In Progress
Status in linux source package in Wily:
  In Progress

Bug description:
  Reproduction info:

  Set up two LXC containers (although this probably isn't specific to
  LXC containers), and inside each set up ipsec with a configuration
  similar to:

  conn nodeN
      aggressive=yes
      authby=secret
      auto=start
      closeaction=restart
      dpdaction=restart
      esp=aes256-aes256gmac-modp1024
      ike=aes256-sha512-modp1024
      keyexchange=ikev2
      left=10.0.3.145
      leftid=10.0.3.145
      lifetime=12h
      reauth=no
      right=10.0.3.199
      type=transport

  
  then repeatedly open connections to the peer, e.g.:

  while true; do ping -c1 10.0.3.199 ; sleep 0.1 ; done

  eventually, the connections will fail with:

  connect: No buffer space available

  the reproduction can be sped up by reducing the xfrm4_gc_thresh, e.g.:

  echo 5 > /proc/sys/net/ipv4/xfrm4_gc_thresh

  
  Once the error occurs, no more connections can be made to the peer
  (all fail with "No buffer space available").  However, after a long
  period (e.g. overnight) the buffers are cleaned up and connections
  can be made again.

  This happens even on the latest net-next kernel.
