touch-packages team mailing list archive

[Bug 1337873] Re: Precise, Trusty, Utopic - ifupdown initialization problems caused by race condition

 

This bug is being fixed upstream in cooperation with the upstream
developer. I'll provide the fix here as soon as it is accepted upstream:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=753755

Thank you.

** Patch removed: "ifupdown_0.7.47.2ubuntu4.2~lp1337873.diff"
   https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1337873/+attachment/4145680/+files/ifupdown_0.7.47.2ubuntu4.2%7Elp1337873.diff

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1337873

Title:
  Precise, Trusty, Utopic - ifupdown initialization problems caused by
  race condition

Status in “ifupdown” package in Ubuntu:
  In Progress
Status in “ifupdown” package in Debian:
  New

Bug description:
  It was brought to my attention (by others) that ifupdown runs into
  race conditions in some specific cases.

  [Impact]

  Interfaces are intermittently not brought up as they should be. Like
  any race condition, the failure shows up only from time to time, and
  it is more likely when many servers are deployed at once. The impact
  is large for servers that cannot rely on network start scripts.

  The problem is caused by a race condition when init (upstart) starts
  network interfaces in parallel.

  [Test Case]

  Use the attached script to reproduce the error (on a single virtual
  machine it might take some hours for the error to occur).

  (example 1)

  *** sequence to trigger race-condition ***

  (a) ifup eth0                     (b) ifup -a for eth0
  -----------------------------------------------------------------
  1-1. Lock ifstate.lock file.
                                    1-1. Wait for locking ifstate.lock
                                        file.
  1-2. Read ifstate file to check
       the target NIC.
  1-3. close(=release) ifstate.lock
       file.
  1-4. Judge that the target NIC
       isn't processed.
                                    1-2. Read ifstate file to check
                                         the target NIC.
                                    1-3. close(=release) ifstate.lock
                                         file.
                                    1-4. Judge that the target NIC
                                         isn't processed.
  2. Lock and update ifstate file.
     Release the lock.
                                    2. Lock and update ifstate file.
                                       Release the lock.
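
  The window in example 1 exists because the lock is released between
  the check (1-2 to 1-4) and the update (2). A minimal sketch of the
  alternative, holding one lock across both steps, is shown here. It is
  not ifupdown's code or the proposed patch: the temporary directory,
  the "eth0=eth0" line format and the claim_iface helper are assumptions
  made for illustration, and flock(1) from util-linux is assumed to be
  available.

```shell
#!/bin/sh
# Illustration only: STATE_DIR, STATE_FILE, LOCK_FILE and claim_iface are
# hypothetical names, not ifupdown's. A temporary directory stands in for
# the real state directory.
STATE_DIR=$(mktemp -d)
STATE_FILE="$STATE_DIR/ifstate"
LOCK_FILE="$STATE_DIR/ifstate.lock"
: > "$STATE_FILE"

# Check and update the state file while holding one exclusive lock.
# Exit status: 0 = this caller claimed the interface, 1 = already claimed,
# 2 = the lock could not be taken.
claim_iface() {
    (
        flock -x 9 || exit 2
        if grep -qxF "$1=$1" "$STATE_FILE"; then
            exit 1
        fi
        echo "$1=$1" >> "$STATE_FILE"
    ) 9>"$LOCK_FILE"
}

# Eight concurrent callers race for eth0; each winner records itself.
for n in 1 2 3 4 5 6 7 8; do
    ( claim_iface eth0 && echo "caller $n" >> "$STATE_DIR/winners" ) &
done
wait
winners=$(grep -c "caller" "$STATE_DIR/winners")
echo "callers that claimed eth0: $winners"
```

  Because the check and the append happen under the same lock, only one
  caller can reach the append for a given interface, however the callers
  are scheduled.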

  (example 2)

  A bonding device uses eth0. When ifenslave for eth0 is also executed
  in parallel, eth0 remains down.

  *** sequence to trigger race-condition ***

  (a) ifenslave of eth0             (b) ifenslave of eth0
  ------------------------------------------------------------------
  3. Execute ifenslave of eth0.      3. Execute ifenslave of eth0.
  4. Link down the target NIC.
  5. Write NIC id to
     /sys/class/net/bond0/bonding/
     slaves; the NIC then comes up.
                                    4. Link down the target NIC.
                                    5. Fails to write NIC id to
                                       /sys/class/net/bond0/bonding/
                                       slaves because it is already
                                       written.
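
  The end state of example 2 is a NIC that is listed in the bond's
  slaves file but left link-down. A small sketch for spotting that
  state from sysfs follows. The check_slave helper and the SYSFS_NET
  variable are hypothetical names introduced here; the files read are
  the standard bonding/slaves and operstate entries.

```shell
#!/bin/sh
# Illustration only: check_slave and SYSFS_NET are hypothetical names.
# SYSFS_NET defaults to the real sysfs tree; point it at another
# directory to try the function without real interfaces.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Print "ok" if NIC $2 is listed as a slave of bond $1 and is not down,
# "missing" if it is not listed, "down" if it is listed but its
# operstate is "down". Returns non-zero unless the result is "ok".
check_slave() {
    bond=$1
    nic=$2
    # bonding/slaves holds a space-separated list of enslaved NICs.
    slaves=$(cat "$SYSFS_NET/$bond/bonding/slaves" 2>/dev/null)
    case " $slaves " in
        *" $nic "*) ;;
        *) echo missing; return 1 ;;
    esac
    state=$(cat "$SYSFS_NET/$nic/operstate" 2>/dev/null)
    if [ "$state" = "down" ]; then
        echo down
        return 1
    fi
    echo ok
}

# Usage: check_slave bond0 eth0
```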

  (example 3)

  Bonding is not set to active-backup as defined in the config file.
  When init (upstart) executes the "if-pre-up.d/ifenslave" script and
  the "if-pre-up.d/vlan" script for the bond0 device in parallel, the
  "if-pre-up.d/ifenslave" script fails to change the bonding mode with
  the error message "bonding: unable to update mode of bond0 because
  interface is up.".

  *** sequence to trigger race-condition ***

  (a) ifup bond0                    (b) ifup -a
  -----------------------------------------------------------------------
  1. Update the state file for
     bond0.
                                    1. Does nothing for bond0
                                       because the state file already
                                       lists it.
  2. ifenslave::setup_master()
     sysfs_change_down mode 1
     and link down bond0.
                                    2. The vlan script links up
                                       bond0 while bringing up
                                       bond0.201 (*1).
  3. "echo 1 > .../mode" fails.

  [ /etc/network/if-pre-up.d/vlan ]

  if [ -n "$IF_VLAN_RAW_DEVICE" ] && [ ! -d /sys/class/net/$IFACE ]; then
      if [ ! -x /sbin/vconfig ]; then
          exit 0
      fi
      if ! ip link show dev "$IF_VLAN_RAW_DEVICE" > /dev/null; then
          echo "$IF_VLAN_RAW_DEVICE does not exist, unable to create $IFACE"
          exit 1
      fi
      ip link set up dev $IF_VLAN_RAW_DEVICE     # <-- (*1)
      vconfig add $IF_VLAN_RAW_DEVICE $VLANID
  fi
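
  When the race in example 3 hits, bond0 ends up in a mode other than
  the configured one. The following sketch checks the running mode
  against the expected one by reading bonding/mode from sysfs, which the
  kernel reports as "<name> <number>" (for example "active-backup 1").
  The bond_mode_is helper and the SYSFS_NET variable are hypothetical
  names used only for this illustration.

```shell
#!/bin/sh
# Illustration only: bond_mode_is and SYSFS_NET are hypothetical names.
# SYSFS_NET defaults to the real sysfs tree; point it at another
# directory to try the function without a real bond.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Succeed if bond $1 currently runs in mode $2, given either by name
# (e.g. active-backup) or by number (e.g. 1). Returns 2 if the mode
# file cannot be read.
bond_mode_is() {
    mode_file="$SYSFS_NET/$1/bonding/mode"
    [ -r "$mode_file" ] || return 2
    # The file holds one line of the form "<name> <number>".
    read -r mode_name mode_number < "$mode_file"
    [ "$2" = "$mode_name" ] || [ "$2" = "$mode_number" ]
}

# Usage: bond_mode_is bond0 1 || echo "bond0 is not in mode 1"
```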

  
  [Regression Potential]

   * I will attach the proposed patch (also intended for upstream) and
  describe the regression potential later today.

  [Other Info]

  Example: [ /etc/network/interfaces ]

  auto lo
  iface lo inet loopback

  auto eth0
  iface eth0 inet manual
   bond-master bond0

  auto eth1
  iface eth1 inet manual
   bond-master bond0

  auto bond0
  iface bond0 inet dhcp
   bond-slaves eth0 eth1
   hwaddress 11:22:33:44:55:66
   bond-primary eth0
   bond-mode 1
   bond-miimon 100
   bond-updelay 200
   bond-downdelay 200

  auto bond0.201
  iface bond0.201 inet dhcp
   hwaddress 11:22:33:44:55:66
   vlan-raw-device bond0
  ...

  auto bond0.205
  iface bond0.205 inet dhcp
   hwaddress 11:22:33:44:55:66
   vlan-raw-device bond0
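
  The interfaces that "ifup -a" acts on are the ones marked "auto", so
  those are the candidates for being processed twice in parallel. A
  short sketch for listing them from an interfaces file follows. The
  auto_ifaces helper is a hypothetical name; it only reads "auto" lines
  in the given file and does not follow "source" directives or handle
  the "allow-auto" spelling.

```shell
#!/bin/sh
# Illustration only: auto_ifaces is a hypothetical helper.
# Print every interface named on an "auto" line of the interfaces
# file $1, one per line, in file order.
auto_ifaces() {
    awk '$1 == "auto" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Usage: auto_ifaces /etc/network/interfaces
```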

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1337873/+subscriptions