← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1960944] Re: cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes

 

While we do have sporadic messages like this in our nginx error.log,
they started piling up around the time this issue was reported to us,
starting with this message:

2022/02/15 01:49:24 [error] 3341359#3341359: *1929977 upstream timed out
(110: Connection timed out) while reading response header from upstream,
client: 10.229.95.139, server: , request: "POST
/MAAS/metadata/status/ww4mgk HTTP/1.1", upstream:
"http://10.155.212.2:5240/MAAS/metadata/status/ww4mgk";, host:
"10.229.32.21:5248"

Around this time we started seeing these pile up in rackd.log:
2022-02-15 01:40:07 provisioningserver.rpc.clusterservice: [critical] Failed to contact region. (While requesting RPC info at http://localhost:5240/MAAS).

Our regiond processes are running, and I don't see anything that seems
abnormal in the regiond log around this time. However, these symptoms
reminded me of a similar issue in bug 1908452, so I started debugging it
similarly. Like bug 1908452, I see one regiond process stuck in a recv
call:

root@maas:/var/snap/maas/common/log# strace -p 3340720
strace: Process 3340720 attached
recvfrom(23, 

All the other regiond processes are making progress, but not this one.

The server it is talking to appears to be this canonical server, which I
can't currently resolve:

root@maas:/var/snap/maas/common/log# lsof -i -a -p  3340720 | grep 23
python3 3340720 root   23u  IPv4 3487880288      0t0  TCP maas:42848->https-services.aerodent.canonical.com:http (ESTABLISHED)
root@maas:/var/snap/maas/common/log# host https-services.aerodent.canonical.com
Host https-services.aerodent.canonical.com not found: 3(NXDOMAIN)

However, I suspect it maybe related to image fetching again. In our
regiond logs, I see that the the last log entry related to images
appears to have been about an hour before things locked up:

root@maas:/var/snap/maas/common/log# grep image regiond.log | tail -1
2022-02-15 00:38:51 regiond: [info] 127.0.0.1 GET /MAAS/images-stream/streams/v1/maas:v2:download.json HTTP/1.1 --> 200 OK (referrer: -; agent: python-simplestreams/0.1)

Prior to that, we have log entries every hour, but none after. So maybe
simplestreams has other places that need a timeout?

** Changed in: cloud-init
       Status: New => Invalid

** Also affects: simplestreams
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1960944

Title:
  cloudinit.sources.DataSourceNotFoundException: Did not find any data
  source, searched classes

Status in cloud-init:
  Invalid
Status in MAAS:
  New
Status in simplestreams:
  New

Bug description:
  Not able to deploy baremetal (arm64 and amd64) on a 
  snap-based MAAS: 3.1.0 (maas      3.1.0-10901-g.f1f8f1505  18199  3.1/stable) 

  from MAAS event log: 
  ```
  Tue, 15 Feb. 2022 17:35:33	Node changed status - From 'Deploying' to 'Failed deployment'
   Tue, 15 Feb. 2022 17:35:33	Marking node failed - Node operation 'Deploying' timed out after 30 minutes.
   Tue, 15 Feb. 2022 17:07:44	Node installation - 'cloudinit' searching for network data from DataSourceMAAS
   Tue, 15 Feb. 2022 17:06:44	Node installation - 'cloudinit' attempting to read from cache [trust]
   Tue, 15 Feb. 2022 17:06:42	Node installation - 'cloudinit' attempting to read from cache [check]
   Tue, 15 Feb. 2022 17:05:29	Performing PXE boot
   Tue, 15 Feb. 2022 17:05:29	PXE Request - installation
   Tue, 15 Feb. 2022 17:03:52	Node powered on
  ```

  
  Server console log shows: 

  ```
  ubuntu login:          Starting Message of the Day...
  [  OK  ] Listening on Socket unix for snap application lxd.daemon.
           Starting Service for snap application lxd.activate...
  [  OK  ] Finished Service for snap application lxd.activate.
  [  OK  ] Started snap.lxd.hook.conf…-4400-96a8-0c5c9e438c51.scope.
           Starting Time & Date Service...
  [  OK  ] Started Time & Date Service.
  [  OK  ] Finished Wait until snapd is fully seeded.
           Starting Apply the settings specified in cloud-config...
  [  OK  ] Reached target Multi-User System.
  [  OK  ] Reached target Graphical Interface.
           Starting Update UTMP about System Runlevel Changes...
  [  OK  ] Finished Update UTMP about System Runlevel Changes.
  [  322.036861] cloud-init[2034]: Can not apply stage config, no datasource found! Likely bad things to come!
  [  322.037477] cloud-init[2034]: ------------------------------------------------------------
  [  322.037907] cloud-init[2034]: Traceback (most recent call last):
  [  322.038341] cloud-init[2034]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 521, in main_modules
  [  322.038783] cloud-init[2034]:     init.fetch(existing="trust")
  [  322.039181] cloud-init[2034]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 411, in fetch
  [  322.039584] cloud-init[2034]:     return self._get_data_source(existing=existing)
  [  322.040031] cloud-init[2034]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 311, in _get_data_source
  [  322.040381] cloud-init[2034]:     (ds, dsname) = sources.find_source(self.cfg,
  [  322.040807] cloud-init[2034]:   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 854, in find_source
  [  322.041221] cloud-init[2034]:     raise DataSourceNotFoundException(msg)
  [  322.041659] cloud-init[2034]: cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: ()
  [  322.042076] cloud-init[2034]: ------------------------------------------------------------
  [FAILED] Failed to start Apply the …ngs specified in cloud-config.
  See 'systemctl status cloud-config.service' for details.
           Starting Execute cloud user/final scripts...
  [  324.029369] cloud-init[2042]: Can not apply stage final, no datasource found! Likely bad things to come!
  [  324.029979] cloud-init[2042]: ------------------------------------------------------------
  [  324.030420] cloud-init[2042]: Traceback (most recent call last):
  [  324.030835] cloud-init[2042]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 521, in main_modules
  [  324.031258] cloud-init[2042]:     init.fetch(existing="trust")
  [  324.031647] cloud-init[2042]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 411, in fetch
  [  324.032111] cloud-init[2042]:     return self._get_data_source(existing=existing)
  [  324.032483] cloud-init[2042]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 311, in _get_data_source
  [  324.032865] cloud-init[2042]:     (ds, dsname) = sources.find_source(self.cfg,
  [  324.033292] cloud-init[2042]:   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 854, in find_source
  [  324.033690] cloud-init[2042]:     raise DataSourceNotFoundException(msg)
  [  324.034112] cloud-init[2042]: cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: ()
  [  324.034559] cloud-init[2042]: ------------------------------------------------------------
  [FAILED] Failed to start Execute cloud user/final scripts.
  See 'systemctl status cloud-final.service' for details.
  [  OK  ] Reached target Cloud-init target.
  [  OK  ] Finished Message of the Day.
  ```

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1960944/+subscriptions