← Back to team overview

cloud-init-dev team mailing list archive

Re: [Merge] ~powersj/cloud-init:cii-enable-ec2 into cloud-init:master

 

* (from the commit message).
  "The default instance type is set to t2.micro. This is the smallest type
   that will work for both ebs and instance-store."

   that is not true.  per https://aws.amazon.com/ec2/instance-types/
   t2.* are 'EBS-only'.

 * user-data, I think probably you should pass None if there is None.
   I expect that boto is not just checking a true-ish value.
   I know from experience that running an instance with '' as user-data
   differs from None (in the former you do not get a 'user-data' field
   in the meta-data service, in the latter you do get one).

   it looks like you need to not pass UserData in the kwargs at all.
     https://github.com/boto/botocore/blob/develop/botocore/handlers.py#L528

 * last... i dont need this, but it woudl be nice if you (re)started the EC2Image._instance() only when necessary.  Rather than right after snapshot. Just because i think most probable use case is to not ever restart it.

   that also tells us passing bytes (rather than string) is ok, and that
   boto will always base64 encode for you.

 * Lets add a console_log() to the EC2Instance please.
   http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.Instance.console_output


Diff comments:

> diff --git a/tests/cloud_tests/platforms/ec2/image.py b/tests/cloud_tests/platforms/ec2/image.py
> new file mode 100644
> index 0000000..dab8601
> --- /dev/null
> +++ b/tests/cloud_tests/platforms/ec2/image.py
> @@ -0,0 +1,115 @@
> +# This file is part of cloud-init. See LICENSE file for license information.
> +
> +"""EC2 Image Base Class."""
> +
> +from ..images import Image
> +from .snapshot import EC2Snapshot
> +from tests.cloud_tests import LOG
> +
> +
> +class EC2Image(Image):
> +    """EC2 backed image."""
> +
> +    platform_name = 'ec2'
> +
> +    def __init__(self, platform, config, image_ami):
> +        """Set up image.
> +
> +        @param platform: platform object
> +        @param config: image configuration
> +        @param image_ami: string of image ami ID
> +        """
> +        super(EC2Image, self).__init__(platform, config)
> +        self._img_instance = None
> +        self.image_ami = image_ami
> +        self.image_ami_edited = False
> +
> +    @property
> +    def _instance(self):
> +        """Internal use only, returns a running instance"""
> +        if not self._img_instance:
> +            self._img_instance = self.platform.create_instance(
> +                self.properties, self.config, self.features,
> +                self.image_ami, user_data=None)
> +            self._img_instance.start(wait=True, wait_for_cloud_init=True)
> +        return self._img_instance
> +
> +    @property
> +    def properties(self):
> +        """Dictionary containing: 'arch', 'os', 'version', 'release'."""
> +        return {
> +            'arch': self.config['arch'],
> +            'os': self.config['family'],
> +            'release': self.config['release'],
> +            'version': self.config['version'],
> +        }
> +
> +    def destroy(self):
> +        """Deregister (delete) a custom AMI on EC2.
> +
> +        This does not delete any instances, only removes the saved AMI.
> +
> +        The removal of the images and all other items is handled by the
> +        framework. In some cases we want to keep the images, so let the
> +        framework decide whether to keep or destroy everything.
> +        """
> +        if self.image_ami_edited:

I think what you *should* be doing here is destroying the instance.
the image_ami_edited is a reference to a snapshot that was created based on this image.

that is completely detached from this image, and destroying an EC2Image should not remove snapshots made from it.


In the NoCloud platform, the Image is a thing on disk locally, so destroying it will remove that local file.  But in the EC2 case, the Image is really just a ami-id with possibly a instance associated with it.

does that make sense?

> +            LOG.debug('removing custom AMI')
> +            self.platform.ec2_client.deregister_image(
> +                ImageId=self.image_ami_edited
> +            )
> +
> +        super(EC2Image, self).destroy()
> +
> +    def _execute(self, *args, **kwargs):
> +        """Execute command in image, modifying image."""
> +        return self._instance._execute(*args, **kwargs)
> +
> +    def push_file(self, local_path, remote_path):
> +        """Copy file at 'local_path' to instance at 'remote_path'."""
> +        return self._instance.push_file(local_path, remote_path)
> +
> +    def run_script(self, *args, **kwargs):
> +        """Run script in image, modifying image.
> +
> +        @return_value: script output
> +        """
> +        return self._instance.run_script(*args, **kwargs)
> +
> +    def snapshot(self):
> +        """Create snapshot of image, block until done.
> +
> +        Will return base image_ami if no instance has been booted, otherwise
> +        will run the clean script, shutdown the instance, create a custom
> +        AMI, and use that AMI once available.
> +        """
> +        if not self._img_instance:
> +            return EC2Snapshot(self.platform, self.properties, self.config,
> +                               self.features, self.image_ami)
> +
> +        if self.config.get('boot_clean_script'):
> +            self._img_instance.run_script(self.config.get('boot_clean_script'))
> +
> +        self._img_instance.shutdown(wait=True)
> +
> +        LOG.debug('creating custom AMI from instance')

mgiht as well print the instance-id here.

I think you're not tagging this newly created thing anywhere.

> +        response = self.platform.ec2_client.create_image(
> +            Name='%s-%s' % (self.platform.tag, self.image_ami),
> +            InstanceId=self._img_instance.instance.instance_id
> +        )
> +        self.image_ami_edited = response['ImageId']
> +
> +        # Create image and wait until it is in the 'available' state
> +        image = self.platform.ec2_resource.Image(self.image_ami_edited)
> +        image.wait_until_exists()
> +        waiter = self.platform.ec2_client.get_waiter('image_available')
> +        waiter.wait(ImageIds=[image.id])
> +        image.reload()
> +
> +        # Re-start instance in case additional customization required
> +        self._img_instance.start(wait=True)
> +
> +        return EC2Snapshot(self.platform, self.properties, self.config,
> +                           self.features, self.image_ami_edited)
> +
> +# vi: ts=4 expandtab
> diff --git a/tests/cloud_tests/platforms/ec2/instance.py b/tests/cloud_tests/platforms/ec2/instance.py
> new file mode 100644
> index 0000000..3e02518
> --- /dev/null
> +++ b/tests/cloud_tests/platforms/ec2/instance.py
> @@ -0,0 +1,111 @@
> +# This file is part of cloud-init. See LICENSE file for license information.
> +
> +"""Base EC2 instance."""
> +import os
> +
> +import botocore
> +
> +from ..instances import Instance
> +from tests.cloud_tests import LOG, util
> +
> +
> +class EC2Instance(Instance):
> +    """EC2 backed instance."""
> +
> +    platform_name = "ec2"
> +    _ssh_client = None
> +
> +    def __init__(self, platform, properties, config, features,
> +                 image_ami, user_data=None):
> +        """Set up instance.
> +
> +        @param platform: platform object
> +        @param properties: dictionary of properties
> +        @param config: dictionary of configuration values
> +        @param features: dictionary of supported feature flags
> +        @param image_ami: AWS AMI ID for image to use
> +        @param user_data: test user-data to pass to instance
> +        """
> +        super(EC2Instance, self).__init__(
> +            platform, image_ami, properties, config, features)
> +
> +        self.image_ami = image_ami
> +        self.instance = None
> +        self.user_data = user_data if user_data else b''

i dont think this is needed. just
 self.user_data = user_data

> +        self.ssh_ip = None
> +        self.ssh_port = 22
> +        self.ssh_key_file = os.path.join(
> +            platform.config['data_dir'], platform.config['private_key'])
> +        self.ssh_pubkey_file = os.path.join(
> +            platform.config['data_dir'], platform.config['public_key'])
> +
> +    def destroy(self):
> +        """Clean up instance."""
> +        if self._ssh_client:

failing to take the ssh client down should not result in failure to destroy the instance.

maybe try:
except Exception as e:
    log.warn("bad news , failed to take the ssh client down")

> +            self._ssh_client.close()
> +            self._ssh_client = None
> +
> +        if self.instance:
> +            LOG.debug('destroying instance %s', self.instance.id)
> +            self.instance.terminate()
> +            self.instance.wait_until_terminated()
> +
> +        super(EC2Instance, self).destroy()
> +
> +    def _execute(self, command, stdin=None, env=None):
> +        """Execute command on instance."""
> +        env_args = []
> +        if env:
> +            env_args = ['env'] + ["%s=%s" for k, v in env.items()]
> +
> +        return self._ssh(['sudo'] + env_args + list(command), stdin=stdin)
> +
> +    def start(self, wait=True, wait_for_cloud_init=False):
> +        """Start instance on EC2 with the platfrom's VPC."""
> +        if self.instance:
> +            LOG.debug('starting instance %s', self.instance.id)
> +            self.instance.start()
> +        else:
> +            LOG.debug('launching instance')
> +            try:
> +                instances = self.platform.ec2_resource.create_instances(
> +                    ImageId=self.image_ami,
> +                    InstanceType=self.platform.instance_type,
> +                    KeyName=self.platform.key_name,
> +                    MaxCount=1,
> +                    MinCount=1,
> +                    SecurityGroupIds=[self.platform.security_group.id],
> +                    SubnetId=self.platform.subnet.id,
> +                    TagSpecifications=[{
> +                        'ResourceType': 'instance',
> +                        'Tags': [{
> +                            'Key': 'Name', 'Value': self.platform.tag
> +                        }]
> +                    }],
> +                    UserData=self.user_data,
> +                )
> +
> +                self.instance = instances[0]
> +
> +            except botocore.exceptions.ClientError as error:
> +                error_msg = error.response['Error']['Message']
> +                raise util.PlatformError('start', error_msg)
> +
> +        if wait:
> +            self.instance.wait_until_running()
> +            self.instance.reload()
> +            self.ssh_ip = self.instance.public_ip_address

if wait == false, then you never set ssh_ip

> +            self._wait_for_system(wait_for_cloud_init)
> +
> +    def shutdown(self, wait=True):
> +        """Shutdown instance."""
> +        LOG.debug('stopping instance %s', self.instance.id)
> +        self.instance.stop()
> +
> +        if wait:
> +            self.instance.wait_until_stopped()
> +            self.instance.reload()
> +
> +        super(EC2Instance, self).shutdown()
> +
> +# vi: ts=4 expandtab
> diff --git a/tests/cloud_tests/platforms/ec2/platform.py b/tests/cloud_tests/platforms/ec2/platform.py
> new file mode 100644
> index 0000000..baadfbb
> --- /dev/null
> +++ b/tests/cloud_tests/platforms/ec2/platform.py
> @@ -0,0 +1,227 @@
> +# This file is part of cloud-init. See LICENSE file for license information.
> +
> +"""Base EC2 platform."""
> +from datetime import datetime
> +import os
> +
> +import boto3
> +import botocore
> +
> +from ..platforms import Platform
> +from .image import EC2Image
> +from .instance import EC2Instance
> +from cloudinit import util as c_util
> +from tests.cloud_tests import LOG
> +
> +
> +class EC2Platform(Platform):
> +    """EC2 test platform."""
> +
> +    platform_name = 'ec2'
> +    ipv4_cidr = '192.168.1.0/20'
> +
> +    def __init__(self, config):
> +        """Set up platform."""
> +        super(EC2Platform, self).__init__(config)
> +        # Used for unique VPC, SSH key, and custom AMI generation naming
> +        self.tag = '%s-%s' % (
> +            config['tag'], datetime.now().strftime('%Y%m%d%H%M%S'))
> +        self.instance_type = config['instance-type']
> +
> +        try:
> +            self.ec2_client = boto3.client('ec2')
> +            self.ec2_resource = boto3.resource('ec2')
> +            self.ec2_region = boto3.Session().region_name
> +        except botocore.exceptions.NoRegionError:
> +            raise RuntimeError(
> +                'Please configure default region in $HOME/aws/config')
> +        except botocore.exceptions.NoCredentialsError:
> +            raise RuntimeError(
> +                'Please configure ec2 credentials in $HOME/aws/credentials')
> +
> +        self.key_name = self._upload_public_key(config)
> +        self.vpc = self._create_vpc()
> +        self.internet_gateway = self._create_internet_gateway()
> +        self.subnet = self._create_subnet()
> +        self.routing_table = self._create_routing_table()
> +        self.security_group = self._create_security_group()
> +
> +    def create_instance(self, properties, config, features,
> +                        image_ami, user_data=None):
> +        """Create an instance
> +
> +        @param src_img_path: image path to launch from
> +        @param properties: image properties
> +        @param config: image configuration
> +        @param features: image features
> +        @param image_ami: string of image ami ID
> +        @param user_data: test user-data to pass to instance
> +        @return_value: cloud_tests.instances instance
> +        """
> +        return EC2Instance(self, properties, config, features,
> +                           image_ami, user_data)
> +
> +    def destroy(self):
> +        """Delete SSH keys, terminate all instances, and delete VPC."""
> +        for instance in self.vpc.instances.all():
> +            LOG.debug('waiting for instance %s termination', instance.id)
> +            instance.terminate()
> +            instance.wait_until_terminated()
> +
> +        if self.key_name:
> +            LOG.debug('deleting SSH key %s', self.key_name)
> +            self.ec2_client.delete_key_pair(KeyName=self.key_name)
> +
> +        if self.security_group:
> +            LOG.debug('deleting secuirty group')
> +            self.security_group.delete()
> +
> +        if self.subnet:
> +            LOG.debug('deleting subnet')
> +            self.subnet.delete()
> +
> +        if self.routing_table:
> +            LOG.debug('deleting routing table')
> +            self.routing_table.delete()
> +
> +        if self.internet_gateway:
> +            LOG.debug('deleting internet gateway')
> +            self.internet_gateway.detach_from_vpc(VpcId=self.vpc.id)
> +            self.internet_gateway.delete()
> +
> +        if self.vpc:
> +            LOG.debug('deleting vpc')
> +            self.vpc.delete()
> +
> +    def get_image(self, img_conf):
> +        """Get image using specified image configuration.
> +
> +        @param img_conf: configuration for image
> +        @return_value: cloud_tests.images instance
> +        """
> +        if img_conf['root-store'] == 'ebs':
> +            root_store = 'ssd'
> +        elif img_conf['root-store'] == 'instance-store':
> +            root_store = 'instance'
> +        else:
> +            raise RuntimeError('Unknown root-store type: %s' %
> +                               (img_conf['root-store']))
> +
> +        filters = [
> +            'arch=%s' % c_util.get_architecture(),

it seems somewhat arbitrary to base the instance type you  launch on ec2 to be based upon the architecture where you're running the test suite.  that  makes sense for NoCloud or LXD (at very least as defaults), but on Ec2 not so much.

Ie, this fails if you run it on a Rasberry pi.

I think might as well just hard code 'amd64' for now.

> +            'endpoint=https://ec2.%s.amazonaws.com' % self.ec2_region,
> +            'region=%s' % self.ec2_region,
> +            'release=%s' % img_conf['release'],
> +            'root_store=%s' % root_store,
> +            'virt=hvm',
> +        ]
> +
> +        LOG.debug('finding image using streams')
> +        image = self._query_streams(img_conf, filters)
> +
> +        try:
> +            image_ami = image['id']
> +        except KeyError:
> +            raise RuntimeError('No images found for %s!' % img_conf['release'])
> +
> +        LOG.debug('found %s', image_ami)
> +        image = EC2Image(self, img_conf, image_ami)
> +        return image
> +
> +    def _create_internet_gateway(self):
> +        """Create Internet Gateway and assign to VPC."""
> +        LOG.debug('creating internet gateway')
> +        internet_gateway = self.ec2_resource.create_internet_gateway()
> +        internet_gateway.attach_to_vpc(VpcId=self.vpc.id)
> +        self._tag_resource(internet_gateway)
> +
> +        return internet_gateway
> +
> +    def _create_routing_table(self):
> +        """Update default routing table with internet gateway.
> +
> +        This sets up internet access between the VPC via the internet gateway
> +        by configuring routing tables for IPv4 and IPv6.
> +        """
> +        LOG.debug('creating routing table')
> +        route_table = self.vpc.create_route_table()
> +        route_table.create_route(DestinationCidrBlock='0.0.0.0/0',
> +                                 GatewayId=self.internet_gateway.id)
> +        route_table.create_route(DestinationIpv6CidrBlock='::/0',
> +                                 GatewayId=self.internet_gateway.id)
> +        route_table.associate_with_subnet(SubnetId=self.subnet.id)
> +        self._tag_resource(route_table)
> +
> +        return route_table
> +
> +    def _create_security_group(self):
> +        """Enables ingress to default VPC security group."""
> +        LOG.debug('creating security group')
> +        security_group = self.vpc.create_security_group(
> +            GroupName=self.tag, Description='integration test security group')
> +        security_group.authorize_ingress(
> +            IpProtocol='-1', FromPort=-1, ToPort=-1, CidrIp='0.0.0.0/0')
> +        self._tag_resource(security_group)
> +
> +        return security_group
> +
> +    def _create_subnet(self):
> +        """Generate IPv4 and IPv6 subnets for use."""
> +        ipv6_cidr = self.vpc.ipv6_cidr_block_association_set[0][
> +            'Ipv6CidrBlock'][:-2] + '64'
> +
> +        LOG.debug('creating subnet')
> +        subnet = self.vpc.create_subnet(CidrBlock=self.ipv4_cidr,
> +                                        Ipv6CidrBlock=ipv6_cidr)
> +        modify_subnet = subnet.meta.client.modify_subnet_attribute
> +        modify_subnet(SubnetId=subnet.id,
> +                      MapPublicIpOnLaunch={'Value': True})
> +        self._tag_resource(subnet)
> +
> +        return subnet
> +
> +    def _create_vpc(self):
> +        """Setup AWS EC2 VPC or return existing VPC."""
> +        LOG.debug('creating new vpc')
> +        try:
> +            vpc = self.ec2_resource.create_vpc(
> +                CidrBlock=self.ipv4_cidr,
> +                AmazonProvidedIpv6CidrBlock=True)
> +        except botocore.exceptions.ClientError as e:
> +            raise RuntimeError(e)
> +
> +        vpc.wait_until_available()
> +        self._tag_resource(vpc)
> +
> +        return vpc
> +
> +    def _tag_resource(self, resource):
> +        """Tag a resouce with the specified tag.
> +
> +        This makes finding and deleting resources specific to this testing
> +        much easier to find.
> +
> +        @param resource: resource to tag"""
> +        tag = {
> +            'Key': 'Name',
> +            'Value': self.tag
> +        }
> +        resource.create_tags(Tags=[tag])
> +
> +    def _upload_public_key(self, config):
> +        """Generate random name and upload SSH key with that name.
> +
> +        @param config: platform config
> +        @return: string of ssh key name
> +        """
> +        key_file = os.path.join(config['data_dir'], config['public_key'])
> +        with open(key_file, 'r') as file:
> +            public_key = file.read().strip('\n')
> +
> +        LOG.debug('uploading SSH key %s', self.tag)
> +        self.ec2_client.import_key_pair(KeyName=self.tag,
> +                                        PublicKeyMaterial=public_key)
> +
> +        return self.tag
> +
> +# vi: ts=4 expandtab
> diff --git a/tests/cloud_tests/util.py b/tests/cloud_tests/util.py
> index 2aedcd0..151589e 100644
> --- a/tests/cloud_tests/util.py
> +++ b/tests/cloud_tests/util.py
> @@ -447,6 +447,19 @@ class InTargetExecuteError(c_util.ProcessExecutionError):
>              reason=reason)
>  
>  
> +class PlatformError(IOError):
> +    """Error type for platform errors."""
> +
> +    default_desc = 'unexpected error in platform.'
> +
> +    def __init__(self, cmd, description=None):

'cmd' seems somewhat arbitrary here.
even  more arbitrary to "shell quote" it.

and the only place you use it, you pass in 'start' 
  tests/cloud_tests/platforms/ec2/instance.py

ie, 'cmd' really makes me think this is expected to be subprocess.<somethign>

maybe 'operation' ?

> +        """Init error and parent error class."""
> +        description = description if description else self.default_desc
> +
> +        message = '%s\ncmd:%s' % (description, shell_quote(cmd))
> +        IOError.__init__(self, message)
> +
> +
>  class TempDir(object):
>      """Configurable temporary directory like tempfile.TemporaryDirectory."""
>  


-- 
https://code.launchpad.net/~powersj/cloud-init/+git/cloud-init/+merge/335186
Your team cloud-init commiters is requested to review the proposed merge of ~powersj/cloud-init:cii-enable-ec2 into cloud-init:master.


References