group.of.nepali.translators team mailing list archive

[Bug 1802073] Re: No network in AWS (EC-Classic) after stopping and starting instance

 

** Changed in: cloud-init (Ubuntu)
       Status: Confirmed => Fix Committed

** Also affects: cloud-init (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Also affects: cloud-init (Ubuntu Cosmic)
   Importance: Undecided
       Status: New

** Also affects: cloud-init (Ubuntu Bionic)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1802073

Title:
  No network in AWS (EC-Classic) after stopping and starting instance

Status in cloud-init package in Ubuntu:
  Fix Committed
Status in cloud-init source package in Xenial:
  New
Status in cloud-init source package in Bionic:
  New
Status in cloud-init source package in Cosmic:
  New

Bug description:
  I don't know whether this is cloud-init, netplan or something else,
  but this is not good.

  Background:
  # lsb_release -rd
  Description:    Ubuntu 18.04.1 LTS
  Release:        18.04
  # apt-cache policy cloud-init
  cloud-init:
    Installed: 18.4-0ubuntu1~18.04.1
    Candidate: 18.4-0ubuntu1~18.04.1
    Version table:
   *** 18.4-0ubuntu1~18.04.1 500
          500 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
          100 /var/lib/dpkg/status
       18.2-14-g6d48d265-0ubuntu1 500
          500 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

  
  1. Find the newest image to use.

  $ aws --region eu-west-1 ec2 describe-images --owners 099720109477 \
      --filters Name=root-device-type,Values=ebs \
                Name=architecture,Values=x86_64 \
                Name=name,Values='*hvm-ssd/ubuntu-bionic-18.04*' \
      --query 'sort_by(Images, &Name)[-1].ImageId'

  "ami-08596fdd2d5b64915"

  2. Start an instance in EC2-Classic with that image.

  3. Try to SSH. Everything is OK.

  # cat /var/log/cloud-init-output.log
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init-local' at Wed, 07 Nov 2018 08:12:16 +0000. Up 10.51 seconds.
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init' at Wed, 07 Nov 2018 08:12:21 +0000. Up 15.50 seconds.
  ci-info: +++++++++++++++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++++++++++++++
  ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
  ci-info: | Device |  Up  |           Address           |       Mask      | Scope  |     Hw-Address    |
  ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
  ci-info: |  eth0  | True |         10.74.200.25        | 255.255.255.192 | global | 22:00:0a:4a:c8:19 |
  ci-info: |  eth0  | True | fe80::2000:aff:fe4a:c819/64 |        .        |  link  | 22:00:0a:4a:c8:19 |
  ci-info: |   lo   | True |          127.0.0.1          |    255.0.0.0    |  host  |         .         |
  ci-info: |   lo   | True |           ::1/128           |        .        |  host  |         .         |
  ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
  ...
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'modules:config' at Wed, 07 Nov 2018 08:12:41 +0000. Up 35.63 seconds.
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'modules:final' at Wed, 07 Nov 2018 08:12:44 +0000. Up 38.98 seconds.
  Cloud-init v. 18.4-0ubuntu1~18.04.1 finished at Wed, 07 Nov 2018 08:12:45 +0000. Datasource DataSourceEc2Local.  Up 39.38 seconds

  4. Stop the instance.

  5. Start the instance.

  6. Try to SSH.
  Expected: the instance has network and is working.
  Actual: the instance has no working network.

  The instance console log shows:
  [   11.342357] cloud-init[412]: Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init-local' at Wed, 07 Nov 2018 08:21:07 +0000. Up 10.77 seconds.
  [  OK  ] Started Initial cloud-init job (pre-networking).
  [  OK  ] Reached target Network (Pre).
           Starting Network Service...
  [  OK  ] Started Network Service.
           Starting Network Name Resolution...
           Starting Wait for Network to be Configured...
  [  OK  ] Started Wait for Network to be Configured.
           Starting Initial cloud-init job (metadata service crawler)...
  [  OK  ] Started Network Name Resolution.
  [  OK  ] Reached target Host and Network Name Lookups.
  [  OK  ] Reached target Network.
  [   13.036207] cloud-init[637]: Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init' at Wed, 07 Nov 2018 08:21:08 +0000. Up 12.55 seconds.
  [   13.052849] cloud-init[637]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
  [   13.100325] cloud-init[637]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
  [   13.121790] cloud-init[637]: ci-info: | Device |   Up  |  Address  |    Mask   | Scope |     Hw-Address    |
  [   13.129189] cloud-init[637]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
  [   13.144839] cloud-init[637]: ci-info: |  eth0  | False |     .     |     .     |   .   | 22:00:0b:0a:cb:2d |[  OK  ] Started Initial cloud-init job (metadata service crawler).
  [   13.158694] cloud-init
  [  OK  ] Reached target System Initialization.[637]: ci-info: |   lo   |  True | 127.0.0.1 | 255.0.0.0 |  host |         .         |
  [  OK  ] Started Daily apt download activities.
  [   13.179053] ] Started Message of the Day.
  cloud-init[637]: ci-info: |   lo   |  True |  ::1/128  |     .     |  host |         .         |[  OK  ] Started ACPI Events Check.
  [  OK  ] Reached target Paths.
  [   13.201012] cloud-init[637]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+] Listening on ACPID Listen Socket.
  [  OK  ] Listening on Open-iSCSI iscsid Socket.
  [   13.213993] cloud-init[  OK  ] Listening on D-Bus System Message Bus Socket.[637]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
           Starting Socket activation for snappy daemon.
  [   13.229707] cloud-init[637]: ci-info: +-------+-------------+---------+-----------+-------+
  [  OK  ] Started Daily apt upgrade and clean activities.
  [   13.244949] cloud-init[637]: ci-info: | Route | Destination | Gateway | Interface | Flags |] Listening on UUID daemon activation socket.
           Starting LXD - unix socket.
  [  OK  ] Started Daily Cleanup of Temporary Directories.[   13.256281] cloud-init[637]: ci-info: +-------+-------------+---------+-----------+-------+
  [  OK  ] Started Discard unused blocks once a week.
  [  OK  ] Reached target Timers.
  [  OK  ] Reached target Cloud-config availability.
  [   13.286424] cloud-init[637]: 
  [  OK  ] Reached target Network is Online.ci-info: +-------+-------------+---------+-----------+-------+

  It would be nice if instances kept working after a stop and start, as they used to.
  My guess is that the problem is that /etc/netplan/50-cloud-init.yaml has:
              match:
                  macaddress: 22:00:0a:66:16:17

  The MAC address changes on stop and start, and I suspect this is
  handled in the wrong order: the file is not regenerated before the
  network is brought up, so no device matches that macaddress. Without
  console access or internal knowledge of how netplan, cloud-init and
  systemd work together, it is hard to pinpoint the exact cause.
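
  The suspected mismatch can be illustrated with a small sketch. It is
  not cloud-init or netplan code: find_matching_device is a hypothetical
  helper, the two MAC addresses are the eth0 Hw-Address values from the
  two logs above, and it is an assumption that the stale file still
  carries the first-boot MAC. It only shows that a stanza pinned to the
  old MAC selects no device once the MAC has changed.

```python
def find_matching_device(match_mac, devices):
    """Return the name of the device whose MAC equals match_mac, or None.

    Hypothetical helper: mimics a MAC-based `match:` stanza, nothing more.
    """
    wanted = match_mac.lower()
    for name, mac in devices.items():
        if mac.lower() == wanted:
            return name
    return None


# Assumption: the stale netplan file still carries the first-boot MAC.
rendered_mac = "22:00:0a:4a:c8:19"

# eth0 as reported on first boot and after stop/start (from the two logs).
first_boot = {"eth0": "22:00:0a:4a:c8:19"}
after_restart = {"eth0": "22:00:0b:0a:cb:2d"}

print(find_matching_device(rendered_mac, first_boot))     # eth0
print(find_matching_device(rendered_mac, after_restart))  # None
```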

  But I did a small test. I edited
  /usr/lib/python3/dist-packages/cloudinit/net/netplan.py:

              if if_type == 'physical':
                  # required_keys = ['name', 'mac_address']
                  eth = {
                      'set-name': ifname,
                      'match': ifcfg.get('match', None),
                  }
                  if eth['match'] is None:
                      macaddr = ifcfg.get('mac_address', None)
                      if macaddr is not None:
                          eth['match'] = {'macaddress': macaddr.lower()}
                      else:
                          del eth['match']
                          del eth['set-name']
  +                del eth['match']
  +                del eth['set-name']
                  _extract_addresses(ifcfg, eth, ifname)
                  ethernets.update({ifname: eth})
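
  As quoted, the two added del lines would raise KeyError whenever the
  else branch has already removed the same keys. A standalone sketch of
  the same experiment that avoids this uses dict.pop with a default.
  build_eth_entry and its match_by_mac flag are hypothetical names for
  illustration; this is not the cloud-init renderer and not a proposed
  fix.

```python
def build_eth_entry(ifname, ifcfg, match_by_mac=True):
    """Build a netplan-style ethernet entry from an interface config dict.

    Hypothetical stand-in for the quoted snippet, for illustration only.
    """
    eth = {"set-name": ifname, "match": ifcfg.get("match")}
    if eth["match"] is None:
        macaddr = ifcfg.get("mac_address")
        if macaddr is not None:
            eth["match"] = {"macaddress": macaddr.lower()}
    if not match_by_mac or eth["match"] is None:
        # pop() with a default never raises, even if the key is already gone.
        eth.pop("match", None)
        eth.pop("set-name", None)
    return eth


cfg = {"mac_address": "22:00:0A:4A:C8:19"}
print(build_eth_entry("eth0", cfg))                      # entry pinned to the MAC
print(build_eth_entry("eth0", cfg, match_by_mac=False))  # entry with no match/set-name
```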

  Then I ran:
  cloud-init clean
  cloud-init init

  # cat /etc/netplan/50-cloud-init.yaml

  # This file is generated from information provided by
  # the datasource.  Changes to it will not persist across an instance.
  # To disable cloud-init's network configuration capabilities, write a file
  # /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
  # network: {config: disabled}
  network:
      version: 2
      ethernets:
          eth0:
              dhcp4: true

  Then I stopped the instance and started it. It gets the network and
  works. I stopped it again just to be sure it wasn't one-time magic,
  started it, and it still works.

  So the problem really does seem to be the match/macaddress stanza, but
  how to fix it properly I'll leave to the people who made it misbehave
  like this.

  But I think some people will be pretty scared and annoyed when an
  instance is unreachable after a stop and start. Recovery also depends
  on their skill in troubleshooting the problem, mounting the volume on
  another instance and fixing it there (if it is EBS-backed; if not,
  sorry, make a new instance).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1802073/+subscriptions