
kernel-packages team mailing list archive

[Bug 1118447] Re: Race condition with network and NFS mounts causes boottime hang

 

Unfortunately I'm not going to be able to test as requested in the short
term as the systems involved are in production. I will see if I can
replicate it on some virtual machines but again it could be a while
until this is possible.

While I understand the preference for an up-to-date BIOS, it should be
used with care as a reason for removing a confirmed bug status just
because one reporter's BIOS is out of date. Multiple people are reporting
the problem, and it should not be assumed that none of them has an
up-to-date BIOS.

Please also don't take the following as me trying to be rude; it is
merely meant as an observation on the history of the bug:

In the past we have also been asked to update our kernels, and at no
stage has this improved the situation. Is there anything to indicate
that it is likely to help on this occasion, or that this is a kernel bug
and not a problem somewhere in the system startup scripts? It seems
quite possible that it is a bug where upstart believes that the network
is up when it is not.

I'm not an upstart expert, so my observations here may be way off:
/etc/init/mountall-net.conf is set to start on net-device-up.
/etc/init/network-interface.conf emits net-device-up but doesn't necessarily configure anything other than the loopback interface.
/etc/init/networking.conf also emits net-device-up.

Running "grep network /var/log/boot.log" on my system (which uses the noauto option and mounts via /etc/rc.local) shows the following:
 * Starting configure network device used by iSCSI root                                            [ OK ]
 * Starting configure network device security                                                      [ OK ]
 * Starting configure network device security                                                      [ OK ]
 * Starting Mount network filesystems                                                              [ OK ]
 * Stopping Mount network filesystems                                                              [ OK ]
 * Starting configure network device                                                               [ OK ]
 * Starting Mount network filesystems                                                              [ OK ]
 * Stopping Mount network filesystems                                                              [ OK ]
 * Starting configure network device                                                               [ OK ]
 * Starting configure network device security                                                      [ OK ]
 * Starting configure virtual network devices                                                      [ OK ]
 * Stopping configure virtual network devices                                                      [ OK ]
 * Starting Serial port to network proxy ser2net                                                   [ OK ]

Why does mountall-net (Mount network filesystems) run before either network-interface (configure network device) or networking (configure virtual network devices)?
And what would be the implications of removing "emits net-device-up" from /etc/init/network-interface.conf, so that net-device-up is only emitted once ifup -a has run rather than as soon as any single network interface is up?
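
For anyone who wants to check the same ordering on their own machine, here is a rough sketch that scans boot.log text and reports whether "Mount network filesystems" first starts before "configure network device". The function names and the SAMPLE text are made up for illustration; only the job descriptions come from the boot.log output above. It assumes the " * Starting ... [ OK ]" line format shown there, and it only looks at the order in which lines were logged, which may not exactly match the order in which upstart ran the jobs.

```python
import re

# Matches upstart boot.log lines such as
# " * Starting Mount network filesystems      [ OK ]"
LINE_RE = re.compile(
    r"^\s*\*\s+(Starting|Stopping)\s+(.*?)\s*(?:\[\s*\w+\s*\])?\s*$"
)

# Hypothetical excerpt in the same format as the log quoted above.
SAMPLE = """\
 * Starting configure network device security      [ OK ]
 * Starting Mount network filesystems              [ OK ]
 * Stopping Mount network filesystems              [ OK ]
 * Starting configure network device               [ OK ]
"""


def parse_boot_log(text):
    """Return (action, job description) pairs in the order they were logged."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((match.group(1), match.group(2)))
    return events


def first_start(events, description):
    """Index of the first 'Starting <description>' entry, or None if absent."""
    for index, (action, job) in enumerate(events):
        if action == "Starting" and job == description:
            return index
    return None


def mounts_before_network(events):
    """True if network filesystems are first mounted before the first
    'configure network device' start is logged (or if that never appears)."""
    mount = first_start(events, "Mount network filesystems")
    net = first_start(events, "configure network device")
    return mount is not None and (net is None or mount < net)
```

To run it against a real log, pass the contents of /var/log/boot.log to parse_boot_log() and then call mounts_before_network() on the result.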

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1118447

Title:
  Race condition with network and NFS mounts causes boottime hang

Status in “linux” package in Ubuntu:
  Confirmed
Status in “nfs-utils” package in Ubuntu:
  Incomplete

Bug description:
  I seem to experience a race condition during boot of my Ubuntu 12.04 server: in approx. one of seven boots, the server hangs during bootup.
  This is what I see on the screen:

  After the line

   * Starting configure network device

  there is a short delay of about 1 second, then messages continue. I
  see

   * Starting Mount network filesystems [ OK ]
   * Starting set sysctls from /etc/sysctl.conf [ OK ]
   * Starting configure network device [ OK ]
   * Stopping Mount network filesystems [ OK ]
   * Stopping set sysctls from /etc/sysctl.conf [ OK ]
   * Starting Block the mounting event for NFS filesytems until statd is running [ OK ]
   * Stopping Block the mounting event for NFS filesytems until statd is running [ OK ]
   * Starting Block the mounting event for NFS filesytems until statd is running [ OK ]
   * Stopping Block the mounting event for NFS filesytems until statd is running [ OK ]

  The last messages repeat several times, and then the boot process hangs.
  In 6/7 of cases, I wait for a minute, and after that bootup continues.

  But in approx 1/7 cases, the system hangs at this point forever. The
  machine does not respond to CTRL-ALT-DEL, I have to reboot it using
  SysRq-Keys.

  WORKAROUND: Setting the NFS entries in fstab to "noauto" completely removes the problem:
  There is no timeout during boot, and no lockup any more. The machine boots smoothly with the NFS shares unmounted. After the machine is up, we can manually mount the NFS shares without a problem.
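
  The manual mount step could be scripted along the following lines. This is only a sketch under assumptions, not something taken from the affected systems: the function names are hypothetical, it assumes the standard whitespace-separated fstab layout, and it relies on "mount <mount point>" looking the entry up in /etc/fstab.

```python
import subprocess


def noauto_nfs_mount_points(fstab_text):
    """Return mount points of NFS entries in fstab that carry the noauto option."""
    mount_points = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) < 4:
            continue
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type in ("nfs", "nfs4") and "noauto" in options.split(","):
            mount_points.append(mount_point)
    return mount_points


def mount_all(mount_points):
    """Run 'mount <mount point>' for each entry; return the ones that failed."""
    failed = []
    for mount_point in mount_points:
        if subprocess.call(["mount", mount_point]) != 0:
            failed.append(mount_point)
    return failed
```

  Once the network is known to be up, a script started from /etc/rc.local could read /etc/fstab, pass its text to noauto_nfs_mount_points(), and hand the result to mount_all().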

  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: linux-image-3.2.0-37-generic 3.2.0-37.58
  ProcVersionSignature: Ubuntu 3.2.0-37.58-generic 3.2.35
  Uname: Linux 3.2.0-37-generic x86_64
  AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
  AplayDevices: aplay: device_list:252: no soundcards found ...
  ApportVersion: 2.0.1-0ubuntu17.1
  Architecture: amd64
  ArecordDevices: arecord: device_list:252: no soundcards found ...
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC1', '/dev/snd/hwC1D0', '/dev/snd/pcmC1D3p', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D2c', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
  CurrentDmesg: [   85.200104] lxcbr0: no IPv6 routers present
  Date: Thu Feb  7 15:50:40 2013
  HibernationDevice: RESUME=UUID=6c172536-57cc-4deb-867a-0718d572f23e
  IwConfig:
   lo        no wireless extensions.

   eth0      no wireless extensions.

   lxcbr0    no wireless extensions.
  MachineType: To be filled by O.E.M. To be filled by O.E.M.
  MarkForUpload: True
  ProcEnviron:
   LANGUAGE=de:en
   TERM=xterm
   PATH=(custom, no user)
   LANG=de_DE.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-37-generic root=/dev/mapper/lvmvg-root ro debug splash vt.handoff=7
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-3.2.0-37-generic N/A
   linux-backports-modules-3.2.0-37-generic  N/A
   linux-firmware                            1.79.1
  RfKill:

  SourcePackage: linux
  UpgradeStatus: Upgraded to precise on 2012-04-28 (285 days ago)
  dmi.bios.date: 07/04/2012
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 0302
  dmi.board.asset.tag: To be filled by O.E.M.
  dmi.board.name: M5A97 EVO R2.0
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: Rev 1.xx
  dmi.chassis.asset.tag: To Be Filled By O.E.M.
  dmi.chassis.type: 3
  dmi.chassis.vendor: To Be Filled By O.E.M.
  dmi.chassis.version: To Be Filled By O.E.M.
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0302:bd07/04/2012:svnTobefilledbyO.E.M.:pnTobefilledbyO.E.M.:pvrTobefilledbyO.E.M.:rvnASUSTeKCOMPUTERINC.:rnM5A97EVOR2.0:rvrRev1.xx:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
  dmi.product.name: To be filled by O.E.M.
  dmi.product.version: To be filled by O.E.M.
  dmi.sys.vendor: To be filled by O.E.M.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1118447/+subscriptions

