touch-packages team mailing list archive

[Bug 1509414] Re: pre-installed lxc in cloud image produces broken lxc (and later lxd) containers

 

** Summary changed:

- lxc postinst script checks available interfaces, can choose 
+ pre-installed lxc in cloud image produces broken lxc (and later lxd) containers

** Description changed:

  [Problem]
  The released wily image preinstalls lxc, which breaks an assumption made by lxc's preinst packaging script:
  
  It inspects the network to try to pick a 10.0.N.0 network that isn't
  being used, with N starting at 3, so this appears to have picked
  10.0.3.0 when it was installed on whatever system was generating the
- image. This conflicts with the network that eth0 gets attached to when
- the image is brought up in a container, because it gets attached to the
- host's lxcbr0, which is using 10.0.3.x.
+ image.
  
- This affects LXC, and should affect LXD but doesn't currently, as the metadata used for lxd images is still pointing to a beta2 release.
- The easiest way to reproduce this is to use the ubuntu-cloud lxc template on a wily host:
+ When a container is started, it will dhcp on eth0 and get 10.0.3.X as
+ expected.  The problem comes when the lxc-net service that is already
+ installed in that container starts and configures *its* lxcbr0 with
+ 10.0.3.X.  The networking inside the container is broken at that point.
+ 
+ This affects LXC containers, and should affect LXD containers but
+ doesn't currently, as the metadata used for lxd images is still pointing
+ to a beta2 release (bug 1509390).
+ 
+ The easiest way to reproduce this is to use the ubuntu-cloud lxc
+ template on a wily host.
  
  [Test Case]
  
  1.) Verify expectations for each image:
     - the -disk1.img cloud image
     - the -root.tar.xz image (used by lxd)
     - the -root.tar.gz image (used by lxc)
  
     For each of those images, verify:
     a.) The image does not contain /etc/default/lxc-net
     b.) lxd is installed (dpkg-query --show | grep lxd)
  
  2.) Start an instance from the updated image and start an lxc container inside it
     launch instance on openstack or kvm or other
     verify lxcbr0 bridge exists
     lxc-create -t ubuntu-cloud -n bugcheck -- --release=wily --stream=daily
     lxc-start -n bugcheck -d
     # wait until lxc-ls --fancy shows 'running'
     lxc-attach -n bugcheck -- wget http://ubuntu.com
  
  3.) Start an instance from the updated image and start an lxd container inside it
     launch instance on openstack or kvm or other
     verify lxcbr0 bridge exists
     lxd-images import ubuntu wily --alias ubuntu
     lxc launch ubuntu bugcheck
     # wait some amount
     lxc exec bugcheck -- wget http://ubuntu.com
  
  [Regression Potential]
  The most likely fallout is that the newly chosen /24 network conflicts with some existing service.
  
  [Other Info]
  A default apt install of lxc has always picked some 10.0.X.0/24 network for its lxcbr0 bridge.  Any other network using that range (often 10.0.3.0/24) would then be unreachable from the host.  libvirt-bin and many other such services behave the same way.
  
  We are moving that logic to happen the first time that 'lxc-net' service
  starts.
  
  This means first boot for a cloud instance rather than cloud-image build
  time.
  
  [Work around]
  To patch an existing cloud image so that lxc and lxd guests start correctly, change the configuration in /etc/default/lxc-net to use a network other than 10.0.3.0, then restart lxc-net:
  
- sudo sed -i '/^LXC.*10[.]0[.][0-9][.]/s/10.0.[0-9]./10.0.4./g' /etc/default/lxc-net && 
-     sudo service lxc-net stop && 
-     sudo service lxc-net start
+ sudo sed -i '/^LXC.*10[.]0[.][0-9][.]/s/10.0.[0-9]./10.0.4./g' /etc/default/lxc-net &&
+     sudo service lxc-net stop &&
+     sudo service lxc-net start

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to lxc in Ubuntu.
https://bugs.launchpad.net/bugs/1509414

Title:
  pre-installed lxc in cloud image produces broken lxc (and later lxd)
  containers

Status in lxc package in Ubuntu:
  Confirmed

Bug description:
  [Problem]
  The released wily image preinstalls lxc, which breaks an assumption made by lxc's preinst packaging script:

  It inspects the network to try to pick a 10.0.N.0 network that isn't
  being used, with N starting at 3, so this appears to have picked
  10.0.3.0 when it was installed on whatever system was generating the
  image.
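
  The selection described above can be sketched roughly as follows.
  This is an illustration only, not the packaging script itself:
  pick_free_subnet is a hypothetical helper, and it assumes a /24
  prefix and `ip -4 route`-style text on stdin.

```shell
# Sketch: pick the first 10.0.N.0/24 that does not appear in a routing
# table dump read from stdin. pick_free_subnet is a hypothetical name.
pick_free_subnet() {
    routes=$(cat)
    n=3
    while [ "$n" -le 254 ]; do
        # Treat the network as used if any line mentions 10.0.N.
        if ! printf '%s\n' "$routes" | grep -qE "(^|[^0-9])10\.0\.${n}\."; then
            echo "10.0.${n}.0/24"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}
```

  On an image build host with no 10.0.3.x route, `ip -4 route |
  pick_free_subnet` would settle on 10.0.3.0/24, which is the choice
  that ends up baked into the image.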

  When a container is started, it will dhcp on eth0 and get 10.0.3.X as
  expected.  The problem comes when the lxc-net service that is already
  installed in that container starts and configures *its* lxcbr0 with
  10.0.3.X.  The networking inside the container is broken at that
  point.
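
  A quick way to spot this overlap from inside a container is to
  compare the eth0 address with the address configured for lxcbr0.
  The helper below is a hypothetical sketch and assumes both networks
  are /24.

```shell
# Sketch: succeed when two IPv4 addresses share the same /24.
# same_slash24 is a hypothetical helper; a /24 prefix is assumed.
same_slash24() {
    # ${1%.*} strips the last octet, e.g. 10.0.3.57 -> 10.0.3
    [ "${1%.*}" = "${2%.*}" ]
}
```

  For example, `same_slash24 10.0.3.57 10.0.3.1 && echo conflict`
  flags the situation described above, where eth0 and the container's
  own lxcbr0 land in the same range.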

  This affects LXC containers, and should affect LXD containers but
  doesn't currently, as the metadata used for lxd images is still
  pointing to a beta2 release (bug 1509390).

  The easiest way to reproduce this is to use the ubuntu-cloud lxc
  template on a wily host.

  [Test Case]

  1.) Verify expectations for each image:
     - the -disk1.img cloud image
     - the -root.tar.xz image (used by lxd)
     - the -root.tar.gz image (used by lxc)

     For each of those images, verify:
     a.) The image does not contain /etc/default/lxc-net
     b.) lxd is installed (dpkg-query --show | grep lxd)

  2.) Start an instance from the updated image and start an lxc container inside it
     launch instance on openstack or kvm or other
     verify lxcbr0 bridge exists
     lxc-create -t ubuntu-cloud -n bugcheck -- --release=wily --stream=daily
     lxc-start -n bugcheck -d
     # wait until lxc-ls --fancy shows 'running'
     lxc-attach -n bugcheck -- wget http://ubuntu.com

  3.) Start an instance from the updated image and start an lxd container inside it
     launch instance on openstack or kvm or other
     verify lxcbr0 bridge exists
     lxd-images import ubuntu wily --alias ubuntu
     lxc launch ubuntu bugcheck
     # wait some amount
     lxc exec bugcheck -- wget http://ubuntu.com
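
  The file check in step 1 could be scripted along these lines once
  an image has been unpacked (tarballs) or mounted (disk image) into a
  directory. check_rootfs is a hypothetical helper name, and the
  directory argument is whatever location was used for unpacking.

```shell
# Sketch: check an unpacked image root for a baked-in lxc-net config.
# check_rootfs is a hypothetical helper; $1 is the unpacked image root.
check_rootfs() {
    if [ -e "$1/etc/default/lxc-net" ]; then
        echo "FAIL: $1 contains /etc/default/lxc-net"
        return 1
    fi
    echo "OK: $1 has no /etc/default/lxc-net"
}
```

  Running it once per image type gives a pass/fail line for each of
  the three images listed in step 1.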

  [Regression Potential]
  The most likely fallout is that the newly chosen /24 network conflicts with some existing service.

  [Other Info]
  A default apt install of lxc has always picked some 10.0.X.0/24 network for its lxcbr0 bridge.  Any other network using that range (often 10.0.3.0/24) would then be unreachable from the host.  libvirt-bin and many other such services behave the same way.

  We are moving that logic to happen the first time that 'lxc-net'
  service starts.

  This means first boot for a cloud instance rather than cloud-image
  build time.
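
  The idea of deferring the choice to first service start might look
  roughly like the sketch below. It is not the actual lxc-net change:
  ensure_lxc_net_config is a hypothetical helper, the third octet is
  passed in rather than probed, and only a few of the lxc-net
  variables are written.

```shell
# Sketch: write the lxc-net config on first service start only.
# ensure_lxc_net_config is hypothetical; $1 is the config path and
# $2 is the third octet chosen at that moment.
ensure_lxc_net_config() {
    conf=$1
    n=$2
    # An existing file means the choice was already made; keep it.
    [ -e "$conf" ] && return 0
    cat > "$conf" <<EOF
LXC_BRIDGE="lxcbr0"
LXC_ADDR="10.0.$n.1"
LXC_NETWORK="10.0.$n.0/24"
EOF
}
```

  Because the file is only written when it is missing, an image built
  without it picks its network on the instance that boots it, and
  later starts leave that choice untouched.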

  [Work around]
  To patch an existing cloud image so that lxc and lxd guests start correctly, change the configuration in /etc/default/lxc-net to use a network other than 10.0.3.0, then restart lxc-net:

  sudo sed -i '/^LXC.*10[.]0[.][0-9][.]/s/10.0.[0-9]./10.0.4./g' /etc/default/lxc-net &&
      sudo service lxc-net stop &&
      sudo service lxc-net start
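
  To preview the effect before editing the file in place, the same
  substitution can be run without -i so the result goes to stdout.
  rewrite_subnet is a hypothetical helper, and the dots in the match
  pattern are bracketed here so they only match literal dots.

```shell
# Sketch: print a file with 10.0.N. rewritten on LXC_* lines, leaving
# the file itself untouched. rewrite_subnet is a hypothetical helper;
# $1 is the file and $2 is the new third octet.
rewrite_subnet() {
    sed "/^LXC.*10[.]0[.][0-9][.]/s/10[.]0[.][0-9][.]/10.0.$2./g" "$1"
}
```

  For instance, `rewrite_subnet /etc/default/lxc-net 4 | diff
  /etc/default/lxc-net -` shows exactly which lines the workaround
  would change.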

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1509414/+subscriptions

