← Back to team overview

cloud-init-dev team mailing list archive

Re: [Bug 1835114] Re: [MIR] ec2-instance-connect

 

On Wed, Feb 5, 2020 at 10:55 PM Balint Reczey <balint.reczey@xxxxxxxxxxxxx>
wrote:

> @paelzer I attempt to answer the review comments below
>

Thank you @Rbalint for picking this up and driving it.
I'll skip sections that are ok now - thanks for adding/fixing all those!


> > - an ack by the cloud-init Team that this does not conflict with our
> usual
> >   services provided through cloud init
> >   - I'm subscribing the cloud-init Team to give it a look
>
> I've pinged the Team.
>

The feedback was added to the bug now and the mentioning of of conflicting
definitions isn't good IMHO.


> - Security review as "getting a key that then is allowed for local login"
> has
> >   one of the biggest "unintended backdoor" potential I ever seen
> >   - assigning security to the bug to evaluate this further
>
> @sarnold reviewed the latest changes, but he will share his observations
> here, too.
>

Lets see if that is better than comment #8 which wasn't too positive.

...

I've simplified the maintainer scripts which I detail later.
>

Thank you, while still not perfect the simplicity is at least easier to
manage.
Thanks for outlining the relevant use cases, I mostly agree but
invite everyone else to chime in if you see a case that breaks these
assumptions.


> > - the service should not run as root, use PrivateTmp and maybe a few
> other
> >   systemd service isolations
>
> I've forwarded this recommendation, too:
> https://github.com/aws/aws-ec2-instance-connect-config/issues/14
>

Thanks for forwarding, but IMHO it needs to be resolved before promotion.
I'm sure security would prefer having that as well - @sarnold - opinions on
this detail?

@Rbalint if you can, please drive this pro-actively @upstream e.g. by
providing them a suggestive patch.

...


> >   in an autopkgtest.
> >   But if one could do so that would be a great (albeit optional)
> addition.
>
> I believe the most practical way of testing the package would be adding
> tests to the CPC EC2 AMI tests.
> There is a guide on performing integration test on running EC2 instances
> in README.md.
>

Yeah that might be a good place, I'm not aware of the details of those
tests.
But it should not just be an idea, if we skip adding autopkgtest we should
do so on the base of having those tests in CPC-EC2-AMI tests.
Can you please work with the CPC Team to get those tests added?

...


> Users installing the package manually on older AMIs are installing
> it exactly to change the behaviour of SSH thus some care is expected on
> their side if they had local changes to ssh configuration.
>

Yeah, I agree that users installing it later kind of opt-in and those
should be ok in any case.
I'm more concerned about the effects it has by adding it to the images and
behavior changing e.g. due to collisions with cloud init as mentioned
by @rharper

I've added a check to postinst checking if there is other override
> for AuthorizedKeysCommand in /etc/ssh/sshd_config printing an error
> and asking for restarting ssh manually after they made sure that the
> new configuration with the drop-in fits their needs.
>

Thanks for reworking so much of the former complexity!


> > - There are more minor things like the preinst reporting "Created system
> user
> >   ec2-instance-connect" even when it already existed.
>
> Well, this is still present. At least it is printed only on new installs
> and not on upgrades, when most likely the user is not present.
>

It is just a perception thing, not bound to a particular single issue.
But it seems to have so many rough edges left from minot things like this
to design issues (net latency amplification) that it feels a bit
irresponsible to add it to the image by default as-is.

...
>
> > Also while thinking about it, ~5-8 curl calls fro every SSH login can be
> quite expensive.
> > I know it fortunately has an early exit but that still is 2 curl
> requests.
> >
> > If this is installed in any place without the endpoint at
> 169.254.169.254 being responsive and super fast this could lead to a very
> bad user experience.
> >
> > Examples:
> > 1. it checks the instance-id via curl, only then locally if it runs on
> EC2
> >    I think it really should check that ahead of time
>
> The package is planned to be present only on EC2 instances, thus I
> believe this would not be a important change (for upstream).
>

If "it will only be on EC2" would be a hard fact we can rely upon it would
not need the majority of pre-checks at all.
I guess this belongs to my "please cache results" request that is a general
design question.


>
> > 2. (more of a general design issue); doing that on every login feels
> like a massive overhead.
> >    Think of remote configuration management software that expects to run
> hundreds of ssh calls
> >    per second. We were bitten in the past by issues there e.g. slow MOTD
> generated on login.
> >    I really would want all those scripts to do some rate-limiting.
> >    That is either a full design change away from AuthorizedKeysCommand
> (probably too complex),
> >    or and that might be more doable a rate limit. Let it timestamp
> itself and do any execution
> >    except this check only once per 5 seconds. For an example load with
> 100 logins per second for
> >    10 seconds that would drop the overhead from 1000 to 2. And I think
> it would be fine to wait 5
> >    sec for a new key to be active.
>
> I think is is a valid issue, but I also think this is not in the scope
> of the MIR. Could you please open a separate bug to track it?
>

Done:
https://github.com/aws/aws-ec2-instance-connect-config/issues/15


> There is a similar issue causing connectivity delay tracked at upstream:
>
> Instance with no outbound connectivity seems to suffer from startup delay
> https://github.com/aws/aws-ec2-instance-connect-config/issues/8
>
 ...

> I'm not sure if the discussion with upstream already covered the two
> latter points, but we can continue investingating 2. in a separate bug.
>
>
This one is a good bug as well and should be resolved as well, but it is
about a different issue than mine.
2. is in my bug I filed (issue #15 as linked above)


------

I'd wish so much we could make this more opt-in, but it defies the
"installed and working by default" idea.
If you have any suggestions on how it might be installed but not generally
too active that would be great.

We have four cases here:
a) Instance without EIC
b) Instance with EIC installed, but not EIC activated
c) Instance with EIC installed, EIC activated on first boot
d) Instance with EIC installed, EIC activated later on

a) is opt-in on package install and sort of ok to enable it then
(maintainer scripts?)
b) needs nothing, but should NOT run all the EIC code every time
c) could for example require that the cloud adds something to metadata if
EIC is to be activated, then cloud init could enable it and itself stop
modifying files
d) is the hard part as until "EIC activated later on" you'd want it to not
do anything, but I lack a good idea how to detect that.

The above isn't perfect, and it won't help e.g. with the "fight who writes
config files with cloud-init".

In my mind is a (not-yet well formed) idea like "do all the checks if it is
EC2 and has EIC data once on boot, only then really enable it".
That way the majority of systems would not run all that code avoiding any
overhead, latency and security issues.
One single check still can delay the first-boot time, but at least not
every login later on.

A user could then to later on enable EIC do either:
1. reboot to pick it up
2. log in (any other way) and run a command to re-check

It could even recheck it in the background every now and then, but not too
often.
That way it would keep all the delay and risk out of any instance that
didn't want to use EIC.

That is a design change that the upstream would need to do, the same as my
request to use caching (see above).
If you want, you can start the discussion to make it better and drive it,
but that would surely be some time out to get it.

-- 
You received this bug notification because you are a member of cloud-
init Commiters, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1835114

Title:
  [MIR] ec2-instance-connect

Status in ec2-instance-connect package in Ubuntu:
  Incomplete

Bug description:
  [Availability]
  ec2-instance-connect is in the Ubuntu archive, and available for all supported releases. It is available on all architectures despite only being useful on Amazon EC2 instances.

  [Rationale]
  This package is useful on Amazon EC2 instances to make use of a new feature:
  Instance Connect; which allows storing SSH keys for access online in the Amazon systems. These SSH keys are then retrieved to be used by the system's SSH service, collated with pre-existing keys as deployed on the system.

  Installing the package enables the use of Instance Connect on an
  instance.

  [Security]
  This is a new package, and as such has no security history to speak of.

  [Quality Assurance]
  The package consists in a few shell scripts that are difficult to test by
  themselves due to the high reliance on Amazon's Instance Connect service;
  which is online and limited to use on Amazon instances.

  Given that it's a new package, there are no long-term outstanding bugs in
  Ubuntu or Debian. The package is only maintained in Ubuntu at the moment.

  This package deals with special "hardware"; it is only useful on Amazon
  instances, and its support is required as a default deployment on such
  instances when deployed with Ubuntu.

  [UI Standards]
  Not applicable. This service is command-line only and has no configuration options.

  [Dependencies]
  There are no special dependencies to speak of.

  [Standards Compliance]
  This package has been thoroughly reviewed by a few Canonical engineers, there are no standards violations known.

  [Maintenance]
  This package is to be owned by the Ubuntu Foundations team.

  [Background Information]
  This is Amazon-specific, as previously mentioned.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ec2-instance-connect/+bug/1835114/+subscriptions


Follow ups

References