yahoo-eng-team team mailing list archive

[Bug 1588731] [NEW] net_helpers.async_ping() is unreliable

 

Public bug reported:

The current implementation of net_helpers.async_ping() is broken because
it relies on ping's -c parameter and expects a RuntimeException to be
thrown if some of the ICMP replies do not arrive. Linux ping, however,
returns result code 0 (success) as long as at least one reply is
received, no matter how many ICMP requests were sent, so no
RuntimeException is thrown.

Shell reproducer of current net_helpers.async_ping() behaviour:

ip a add 10.20.30.5/24 dev lo ; \
( sleep 0.5 ; ip a del 10.20.30.5/24 dev lo ; sleep 1 ; ip a add 10.20.30.5/24 dev lo ; sleep 2 ; ip a del 10.20.30.5/24 dev lo ) & \
ping 10.20.30.5 -W 1 -c 3 ; \
echo "ping return code = $?"

The return code is always 0 although one of the ICMP replies is lost.

The man page suggests using the -w parameter, but this does not help.
With -w it is still possible for an ICMP reply to be missed (even when
-c is also given) while ping exits with 0. For example, "ping -c 3 -w 3"
sends _four_ ICMP requests and receives three responses if, say, the
second response is lost and the others arrive quickly enough. Because
three responses were received, ping returns status code 0 despite the
lost packet, which leads to the false conclusion that the ping test
passed.
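
The exit-status rules described above can be summarised in a small
Python model. This is only a sketch based on the behaviour documented in
the ping(8) man page, not code taken from iputils or neutron; the
function name ping_exit_status and its parameters are made up for
illustration, and error exits (status 2) are ignored.

```python
def ping_exit_status(replies_received, count, deadline=None):
    """Simplified model of the iputils ping exit status.

    Hypothetical helper for illustration only.
    replies_received: number of ICMP echo replies that arrived
    count:            value passed to -c
    deadline:         value passed to -w, or None if -w is not used
    """
    if deadline is not None:
        # With both -c and -w, ping fails only if fewer than `count`
        # replies arrived before the deadline. It may send more than
        # `count` requests to collect them.
        return 0 if replies_received >= count else 1
    # With -c alone, a single reply is enough for success.
    return 0 if replies_received > 0 else 1


# One of three replies lost, -c 3 only: still reported as success.
print(ping_exit_status(replies_received=2, count=3))
# A fourth request made up for the lost reply within the deadline.
print(ping_exit_status(replies_received=3, count=3, deadline=3))
# The deadline expired before the third reply arrived.
print(ping_exit_status(replies_received=2, count=3, deadline=3))
```

In neither mode does a zero exit status show that every request was
answered.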

Shell reproducer of net_helpers.async_ping() behaviour with -w:

ip a add 10.20.30.5/24 dev lo ; \
( sleep 0.5 ; ip a del 10.20.30.5/24 dev lo ; sleep 1 ; ip a add 10.20.30.5/24 dev lo ; sleep 2 ; ip a del 10.20.30.5/24 dev lo ) & \
ping 10.20.30.5 -W 1 -c 3 -w 3 ; \
echo "ping return code = $?"

The return code is 0 or 1 at roughly equal rates, so -w is not an option
for a reliable net_helpers.async_ping().

Hence net_helpers.async_ping() should run ping for only a single ICMP
request/reply at a time; only in that case is the result code reliable.
Multiple ICMP requests/replies need to be handled in the code.
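
A minimal sketch of that approach, assuming plain subprocess calls
rather than neutron's namespace-aware helpers: each ping invocation
sends exactly one request ("-c 1"), and the loop over requests lives in
Python. The names ping_once and assert_all_pings_reply, and the run
parameter used to inject a fake command runner, are hypothetical and do
not come from net_helpers.

```python
import subprocess
import time


def _run_quiet(cmd):
    # Run the command, discard its output, return the exit status.
    return subprocess.call(cmd, stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL)


def ping_once(address, timeout=1, run=_run_quiet):
    """Send one ICMP echo request; True if a reply arrived in time."""
    cmd = ['ping', '-c', '1', '-W', str(timeout), address]
    return run(cmd) == 0


def assert_all_pings_reply(address, count=3, timeout=1, interval=1.0,
                           run=_run_quiet):
    """Raise RuntimeError if any of `count` single pings gets no reply."""
    lost = []
    for seq in range(count):
        if not ping_once(address, timeout=timeout, run=run):
            lost.append(seq)
        if interval and seq < count - 1:
            # Space the requests out, as a multi-packet ping would.
            time.sleep(interval)
    if lost:
        raise RuntimeError('no reply from %s for request(s) %s'
                           % (address, lost))
```

Because every exit status now covers exactly one request, a single lost
reply is enough to fail the check, which is the behaviour the tests
expect from async_ping().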

This happens at least with ping from iputils-s20121221 and
iputils-s20140519. It looks like a ping issue, since one would expect -c
to limit the number of ICMP requests sent. Nevertheless, the neutron
tests should account for this behaviour.

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: functional-tests

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1588731
