
cloud-init team mailing list archive

user-data format/content-type?


Sorry for whatever stupid thing I’m doing, but I can’t get cloud-init to correctly ingest user-data, for example when running cloud-init query.


cloud-init 19.4 (also affects 20.4?)
Amazon EC2
CentOS 7 (+CentOS 8)
More specifically, when I run cloud-init query... on the user data (or access the same data in a custom Jinja template), it fails with:

File "/usr/lib/python<ver>/site-packages/cloudinit/cmd/query.py", line 141, in handle_args
    response = response[var]
TypeError: string indices must be integers
I went digging into the cloud-init code, because this didn’t make any sense. We’re using 19.4 (because CentOS 7), but I hit the same thing with 20.4 on a test CentOS 8 box.

The top of the yaml doc looks like

#cloud-config
# https://cloudinit.readthedocs.io/en/latest/topics/examples.html#install-and-run-chef-recipes
chef:
  node_name: null
  exec: false   # don't run by default, terraform will set this
  omnibus_url: https://omnitruck.cinc.sh/install.sh
The YAML is a valid doc, not binary (file says ASCII text). It begins with #cloud-config, etc. It doesn’t matter whether I explicitly use query -u or let the engine pick up the copy in /var/lib/cloud/instance/user-data.txt.

YAML - As far as I can tell, cloud-init is ingesting the user-data file as a single block of plain text beginning with chef..., rather than parsing it as YAML.

JSON - I used yq to convert the YAML to JSON, but the result was the same: cloud-init treated it as a giant block of text instead of converting it to a dictionary.

This is what causes the TypeError above. The instance data is a dict, as it should be, but the value stored under its userdata key is a string, not a dict. cloud-init tries to treat that value like a dict as it walks the dot path to the desired key, i.e. userdata.chef.exec.
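The failure mode can be reproduced outside cloud-init. The following is a minimal sketch, not cloud-init's actual query code: walk_dot_path and both sample dictionaries are hypothetical. It walks a dot-delimited path one key at a time and raises the same kind of TypeError as soon as the userdata value is plain text rather than a dict.

```python
def walk_dot_path(data, dotted_path):
    """Follow a dot-delimited path such as 'userdata.chef.exec' through nested dicts."""
    response = data
    for var in dotted_path.split("."):
        # Indexing a str with a str key raises TypeError,
        # which is the error shown in the traceback.
        response = response[var]
    return response


# Hypothetical instance data where userdata was kept as raw text.
as_text = {"userdata": "#cloud-config\nchef:\n  exec: false\n"}

# The same data with userdata already parsed into a dict.
as_dict = {"userdata": {"chef": {"exec": False}}}

print(walk_dot_path(as_dict, "userdata.chef.exec"))

try:
    walk_dot_path(as_text, "userdata.chef.exec")
except TypeError as err:
    print("TypeError:", err)
```

The first lookup succeeds because every step lands on a dict. The second fails at the chef step, because by then response is a string.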

There are a couple of things about this ingestion that confuse me. It doesn’t seem to have any issue with the AWS-supplied metadata. query.py appears to expect the user-data(?) to be JSON.

L137: instance_json = util.load_file(instance_data_fn)
L145: instance_data = util.load_json(instance_json)
More than a mere naming convention, load_json in util.py tries to do exactly that. Weirder still, it looks like it is assuming the user data content is a binary blob that needs to be decoded.

L1507: def load_json(text, root_types=(dict,)):
L1508:   decoded = json.loads(decode_binary(text))
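A rough sketch of what those two helpers appear to do, using only the standard library. This is an approximation for illustration, not a copy of cloud-init's code, and the sample instance data (including the instance ID) is made up. It suggests two things: text input passes through the decode step unchanged, and user-data embedded in the JSON as a string is still a string after loading.

```python
import json


def decode_binary(blob, encoding="utf-8"):
    # Approximation: text passes through untouched, bytes are decoded.
    if isinstance(blob, str):
        return blob
    return blob.decode(encoding)


def load_json(text, root_types=(dict,)):
    decoded = json.loads(decode_binary(text))
    # Only the root type is checked; nested values are left as they are.
    if not isinstance(decoded, tuple(root_types)):
        raise TypeError(
            "expected root type %s, got %s" % (root_types, type(decoded))
        )
    return decoded


# Hypothetical contents of an instance data file: metadata is nested JSON,
# while user-data is stored as one raw string.
instance_json = json.dumps(
    {
        "ds": {"meta_data": {"instance_id": "i-0123456789abcdef0"}},
        "userdata": "#cloud-config\nchef:\n  exec: false\n",
    }
)

instance_data = load_json(instance_json)
print(type(instance_data["ds"]).__name__)
print(type(instance_data["userdata"]).__name__)

# Bytes and str input load to the same result.
print(load_json(instance_json.encode("utf-8")) == instance_data)
```

Under these assumptions, the JSON loading itself works as intended. The string arrives as a string because nothing in this path parses the user-data text a second time.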
Other aspects of cloud-init seem to handle the supplied user-data correctly. write_files (not shown above) acts as expected, and the stock cc_chef Jinja template renders values from user-data.

Again, my apologies if I’m doing something dumb. I thought I did, but I obviously don’t understand what type of data cloud-init is expecting here.