openstack team mailing list archive
Message #05652
Keystone API Design Issues
Hello Y'all,
I'm writing to the list with some of my thoughts as a user of the
Keystone 2.0 API.
Generally, I believe the API is too complicated, has too many 'hacks'
for backwards compatibility put into the wrong places, and pushes too
much logic into consumers and service implementers.
My experience with Keystone comes from several separate projects using the API:
1) A new Rackspace Service, not yet publicly announced, which uses the
Keystone API to validate tokens. (We wrote our own internal library in
Node.js for interacting with Keystone)
2) In Apache Libcloud, I implemented support for the Keystone API,
specifically to get tokens for a service like OpenStack Nova,
Rackspace Cloud Servers, Load Balancers or Cloud Files.
3) I also work with the team implementing a new Rackspace Control
Panel project. This project uses Libcloud for its interaction with
Keystone, but has several more use cases beyond simple Username and
API Key validation.
Part 1: Specific Issues
A) The Token Validation API is fail-deadly, because of support for
Tokens without a Tenant ID scope:
<http://docs.openstack.org/api/openstack-identity-service/2.0/content/GET_validateToken_v2.0_tokens__tokenId__Admin_API_Service_Developer_Operations-d1e1356.html>
When you are implementing a service that needs to validate tokens,
you pass in the tenant scope as the belongsTo parameter with the
Tenant ID. However, this parameter is optional. If a malicious
Tenant ID is passed in, for example because a service doesn't perform
sufficient validation and lets a user put an & into the tenantId, the
belongsTo check is effectively dropped and the token is considered
valid for _all_ contexts. Now, in theory, you should be looking at the
roles provided under the user, and the examples given in the OpenStack
documentation echo back the validated Tenant ID to you. In practice,
however, and as seen in production environments, the response body
includes a default identity role and does not echo back the validated
Tenant ID.
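
To make the failure mode concrete, here is a minimal Python sketch of
how an unescaped & in the tenant ID empties the belongsTo parameter,
next to a variant that escapes it and refuses to run without a tenant.
The endpoint host and the function names are assumptions made up for
this illustration; only the /tokens/{tokenId}?belongsTo= shape comes
from the API discussed above.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Hypothetical admin endpoint, used only for illustration.
ADMIN_URL = "https://identity.example.com/v2.0"


def naive_validate_url(token_id, tenant_id):
    # Unsafe: tenant_id is interpolated into the query string unescaped.
    return f"{ADMIN_URL}/tokens/{token_id}?belongsTo={tenant_id}"


def safe_validate_url(token_id, tenant_id):
    # Fail closed: never build a validation request without a tenant scope.
    if not tenant_id:
        raise ValueError("tenant_id is required to validate a token")
    query = urlencode({"belongsTo": tenant_id})  # percent-encodes & and =
    return f"{ADMIN_URL}/tokens/{quote(token_id, safe='')}?{query}"


def belongs_to(url):
    # Return the belongsTo value a server would parse out of the URL.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return query.get("belongsTo", [None])[0]


# A user-supplied "tenant ID" that starts with an ampersand.
hostile = "&x=1"
print(repr(belongs_to(naive_validate_url("abc", hostile))))
print(repr(belongs_to(safe_validate_url("abc", hostile))))
```

In the naive case the server sees an empty belongsTo, which is the
unscoped validation described above. In the escaped case the hostile
string arrives intact as a tenant ID that will simply not match.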
B) Requiring consumers to pass Tenant IDs around is not a common
pattern in other cloud APIs. A consumer was already keeping track of
their username, API key, and temporary token, and now they essentially
need to keep another piece of information around, the Tenant ID.
This seems like an unneeded variable. For example, Amazon
implements AWS Identity and Access Management by changing the API key
& secret that is used against the API depending on the role of the
account -- this hides the abstraction away from both client libraries
and validating services -- they still just care about the API key and
secret, and do not need to pass around an extra Tenant ID.
C) Requiring Services to encode the Tenant ID into their URLs is not a
common design, and can cause issues. Encoding identity into both
the Token and the Tenant in the URL creates multiple attack
vectors from a security perspective, and can make URL routing in some
frameworks more painful.
D) The Service Catalog makes it difficult to run testing, beta, and
staging environments. This is because in practice many services are
not registered in the service catalog. To work around this, we
commonly see that a regex is used to extract the Tenant ID from
another URL, and then the client builds a new URL. You can see this
even being recommended by Rackspace in the Disqus comments on the
Cloud DNS API here:
<http://docs.rackspace.com/cdns/api/v1.0/cdns-devguide/content/Service_Access_Endpoints-d1e753.html>
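
For readers who have not seen this workaround, it looks roughly like
the Python sketch below. The URLs, the regex, and the function names are
assumptions for illustration, not taken from any particular client. The
sketch is here to show how much guesswork ends up in consumer code, not
to recommend the pattern.

```python
import re


def tenant_id_from_endpoint(public_url):
    # Guess the tenant ID as the path segment right after the version.
    match = re.search(r"/v\d+(?:\.\d+)*/([^/?#]+)/?$", public_url)
    if match is None:
        raise ValueError(f"no tenant ID found in {public_url!r}")
    return match.group(1)


def build_endpoint(base_url, tenant_id):
    # Assemble a URL for a service that is missing from the catalog.
    return f"{base_url.rstrip('/')}/{tenant_id}"


# Hypothetical catalog entry for a service that *is* registered.
servers_url = "https://servers.example.com/v1.0/123456"
tenant = tenant_id_from_endpoint(servers_url)

# Hypothetical base URL of a staging service that is not registered.
print(build_endpoint("https://dns.staging.example.com/v1.0/", tenant))
```

The regex silently depends on every service using the same URL layout,
which is exactly the kind of logic that should not live in consumers.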
E) The Service Catalog should make it easy to expose many tenants to
the same service. Currently you need to build a unique
tenant-specific URL for each service. The most common use case is enabling
all or a large set of tenants to access the same service, and it seems
like this use case is not well covered.
F) Features like API Keys are marked as Extensions to the API, and
not part of the core API.
G) Support for multifactor authentication (MFA) is not currently
available in any way. This is very worrisome since the only 'in core'
Authentication mechanism is a username and a password. With previous
systems, where the Username and Password were not 'owned' by a service
like Keystone, products like the control panel could implement MFA
themselves, but now that you can authenticate to the backend APIs
using only a password, Keystone must also support MFA. Password Auth
is very different from API Key auth -- yes, they are both in theory
randomly generated and big, but in practice Passwords are weak, and
reused widely, while only API keys are big and random.
H) There doesn't seem to be an open discussion about where the
Keystone API is going -- I hear mumbles about $OtherBigCompanies
wanting to extend the Keystone APIs in various ways, but there is no
discussion on this mailing list. I know of at least 4 implementations
of Keystone within just Rackspace, but all of them are adding custom
extensions to even accomplish their baseline of use cases.
Part 2: My Thoughts / Recommendations
A) More open discussion about the issues in Keystone. Issues won't
get fixed until there is an open and honest dialog between the various
groups which depend on Keystone. Identity, Authentication and
Authorization must be rock solid -- everything else in OpenStack and
many other commercial products are built upon it.
B) I believe fundamentally that the data model for service catalogs
as exposed to the user is overcomplicated and broken. I would prefer
that you hit the auth service with your username + apikey or password.
This returns to you a list of tenantIds aka billable cloud accounts.
Each of those records has a unique API Token. When you hit a service
using that Token, you do not include a TenantId. The service validates
the token, and it's always in the context of a single tenant.
I believe we should consider changing the get Token API to return
something more like this:
{access: [
  {tenantId: '42',
   token: 'XXXXXX',
   serviceCatalog: [{type: 'compute', baseURL: 'https://..../',
                     region: 'us-west'} ... ]
  }]};
The major change here is to make tenants a first level object, instead
of being a component of the service catalog -- each tenant has its
own, potentially unique, service catalog, and each tenant has a unique
API token that is valid for this user's roles on it.
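
As a rough sketch of what consuming such a response could look like,
the Python below picks the token and base URL for one tenant's service.
The field names (access, tenantId, token, serviceCatalog, type, baseURL,
region) follow the proposal above; the sample values and the function
name are invented for illustration.

```python
# Hypothetical response in the proposed shape; every value is made up.
RESPONSE = {
    "access": [
        {
            "tenantId": "42",
            "token": "token-for-42",
            "serviceCatalog": [
                {"type": "compute",
                 "baseURL": "https://compute.example.com/",
                 "region": "us-west"},
            ],
        },
        {
            "tenantId": "43",
            "token": "token-for-43",
            "serviceCatalog": [
                {"type": "compute",
                 "baseURL": "https://compute-beta.example.com/",
                 "region": "us-west"},
            ],
        },
    ]
}


def endpoint_for(response, tenant_id, service_type, region):
    """Return (token, base_url) for one tenant's service."""
    for entry in response["access"]:
        if entry["tenantId"] != tenant_id:
            continue
        for service in entry["serviceCatalog"]:
            if service["type"] == service_type and service["region"] == region:
                return entry["token"], service["baseURL"]
    raise LookupError(f"no {service_type} in {region} for tenant {tenant_id}")


token, base_url = endpoint_for(RESPONSE, "42", "compute", "us-west")
print(token, base_url)
```

After this lookup the consumer only carries a token and a URL; the
tenant ID never has to be passed along to the service.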
This would slightly complicate "service users" with access to many
tenants, but it would massively simplify the common use cases of a
user with a small number of tenants, and for a service which needs to
validate the token. A service user with access to many tenants would
need to fetch a unique token for each tenant, but this is the place to
put the complexity -- people writing a service that spans hundreds or
thousands of tenants are already doing something complicated --
fetching a unique auth token is the least of their issues.
This reduces the number of variables that both the consumer and the
service need to pass around, and makes the API less fail-deadly.
This approach also eliminates the need to encode the tenant ID into
the URLs of services.
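
On the service side, validation under this model could be as small as
the Python sketch below. The in-memory token table stands in for a call
to the identity service, and all names and values are assumptions for
illustration. The point is that a token resolves to exactly one tenant
or to nothing, so there is no optional scope parameter to get wrong.

```python
class InvalidToken(Exception):
    """Raised when a token is unknown or no longer valid."""


# Hypothetical stand-in for the identity service's token lookup.
TOKENS = {
    "token-for-42": {"tenantId": "42", "roles": ["admin"]},
}


def validate(token):
    # A token maps to one tenant context; anything else is rejected.
    record = TOKENS.get(token)
    if record is None:
        raise InvalidToken("unknown or expired token")
    return record


def handle_request(token, path):
    # The tenant comes from the token alone, never from the URL.
    context = validate(token)
    return f"tenant {context['tenantId']} requested {path}"


print(handle_request("token-for-42", "/servers"))
```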
C) There needs to be a reduction in the number of implementations of
Keystone, and a reduction in the number of extensions needed to even
operate a reasonable baseline of services. More things, like API
keys and how one would do a multifactor authentication backend, must be
part of the core specification, and we should strive to have the
reference implementation actually used by companies in the ecosystem.
Today, pretty much every $BigCo using OpenStack is implementing their
own Keystone service from scratch. People will always have custom
needs, but those should be plugins to a solid reference
implementation as much as possible.
Thoughts?
Thanks,
Paul