← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1927872] [NEW] Cannot encode a text script as base64 in write_files

 

Public bug reported:

Minimal user-data to reproduce:

#cloud-config
write_files:
- encoding: b64
  content: IyEvYmluL3NoCmVjaG8gaGVsbG8gd29ybGQ=
  path: /var/lib/cloud/scripts/per-once/hello.sh
  permissions: '0755'

This is the base64 encoding of:

#!/bin/sh
echo hello world

This will produce Errno 8 ENOEXEC. cloud-init will log 'Exec format
error. Missing #! in script?' but that's because it's guessing. The
actual reason is the shell will believe this to be a binary file (but
not a recognized executable format). You can run file
/var/lib/cloud/script/per-once/hello.sh and it will return 'Data' as the
file type. If you use cat to read the file, it will work since cat auto-
interprets as text, but if you use less, it will show the characters but
with nonsense bytes between each.

The problem is the Python base64 module returns a bytes object, and if
that is never decoded to UTF-8, writing it out in binary mode will write
out a binary file. If it is decoded and written in text mode, it will
work fine.

Suggested fix:

You can't blindly decode and interpret as text because that will break
attempts to actually write out a binary file, but a key can be added to
the write_files module indicating that a file is a text file. If that is
present, decode the bytes object.

Workaround:

Don't base64 encode text files. This works fine using the yaml | string
continuation feature instead.

Use case for base64 encoding:

I am creating a scriptable module to build VMs from installation media
iso files utilizing cloud-init as the installer automation method,
providing the ability to store scripts in a directory and generate a
user-data file programmatically and then writing it to an iso for use as
a NoCloud source. Imagine Packer but it works without needing to ssh to
the VM to provision it, so you can build very minimal VMs that might not
even have networking. It's marginally simpler to base64 encode all of
the scripts provided in order to avoid having to deal with yaml
indentation subtleties, but I can do this as-is with find and replace on
all line starts in the scripts, figuring out how much indentation to add
by tracking how much the "content" key had.

** Affects: cloud-init
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1927872

Title:
  Cannot encode a text script as base64 in write_files

Status in cloud-init:
  New

Bug description:
  Minimal user-data to reproduce:

  #cloud-config
  write_files:
  - encoding: b64
    content: IyEvYmluL3NoCmVjaG8gaGVsbG8gd29ybGQ=
    path: /var/lib/cloud/scripts/per-once/hello.sh
    permissions: '0755'

  This is the base64 encoding of:

  #!/bin/sh
  echo hello world

  This will produce Errno 8 ENOEXEC. cloud-init will log 'Exec format
  error. Missing #! in script?' but that's because it's guessing. The
  actual reason is the shell will believe this to be a binary file (but
  not a recognized executable format). You can run file
  /var/lib/cloud/script/per-once/hello.sh and it will return 'Data' as
  the file type. If you use cat to read the file, it will work since cat
  auto-interprets as text, but if you use less, it will show the
  characters but with nonsense bytes between each.

  The problem is the Python base64 module returns a bytes object, and if
  that is never decoded to UTF-8, writing it out in binary mode will
  write out a binary file. If it is decoded and written in text mode, it
  will work fine.

  Suggested fix:

  You can't blindly decode and interpret as text because that will break
  attempts to actually write out a binary file, but a key can be added
  to the write_files module indicating that a file is a text file. If
  that is present, decode the bytes object.

  Workaround:

  Don't base64 encode text files. This works fine using the yaml |
  string continuation feature instead.

  Use case for base64 encoding:

  I am creating a scriptable module to build VMs from installation media
  iso files utilizing cloud-init as the installer automation method,
  providing the ability to store scripts in a directory and generate a
  user-data file programmatically and then writing it to an iso for use
  as a NoCloud source. Imagine Packer but it works without needing to
  ssh to the VM to provision it, so you can build very minimal VMs that
  might not even have networking. It's marginally simpler to base64
  encode all of the scripts provided in order to avoid having to deal
  with yaml indentation subtleties, but I can do this as-is with find
  and replace on all line starts in the scripts, figuring out how much
  indentation to add by tracking how much the "content" key had.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1927872/+subscriptions


Follow ups