yahoo-eng-team team mailing list archive
Message #15144
[Bug 1233838] Re: cms token_id's are not URL safe nor RFC compliant
** Changed in: keystone
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Keystone.
https://bugs.launchpad.net/bugs/1233838
Title:
cms token_id's are not URL safe nor RFC compliant
Status in OpenStack Identity (Keystone):
Fix Released
Bug description:
<pre>
The token id for a cms signed token is generated via
cms.cms_to_token() which extracts the base64 cms data to use as the
token id. The '/' character in the base64 text was
replaced with the '-' character in an attempt to make the
resulting token id URL safe. The token id is used in both URL's and in
HTTP header values.
There are a few problems with this approach.
1) The result is still not URL safe due to the presence of the '+'
character and the pad character '='. Both of these characters are
reserved for the query component and thus would need to be escaped
further disrupting the base64 alphabet. See RFC-2396 "Uniform Resource
Identifiers (URI): Generic Syntax"
2) RFC-4648 "The Base16, Base32, and Base64 Data Encodings" defines a
URL safe encoding for base64 data. It maps '+' to '-' and '/' to '_'
and either strips the padding or demands it be %-encoded as per
RFC-2396. The result is both URL and file name safe and is referred to
as base64url.
3) The Python base64 module has direct support for base64url (we should
be using it).
4) The current mapping of '/' to '-' is unfortunate because it
directly conflicts with the RFC-4648 base64url mapping. The alphabet
character '-' is supposed to represent index 62 not index 63, thus one
cannot augment the current mapping to comply with RFC-4648. Plus the
current mapping still isn't URL safe.
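
The four points above can be illustrated with a short Python sketch. It
uses only the standard base64 and urllib.parse modules; the `legacy`
value is an assumed reconstruction of the '/' to '-' replacement described
in this report, not the actual Keystone code.

```python
import base64
from urllib.parse import quote

# Three bytes chosen so that the standard encoding uses indices 62 and 63.
raw = b"\xfb\xef\xff"

standard = base64.b64encode(raw).decode("ascii")          # RFC-4648 base64
urlsafe = base64.urlsafe_b64encode(raw).decode("ascii")   # RFC-4648 base64url

# Assumed reconstruction of the legacy mapping described above: only '/'
# is replaced, so '+' (and any '=' padding) is left in the token id.
legacy = standard.replace("/", "-")

print(standard)  # ++//
print(urlsafe)   # --__
print(legacy)    # ++--

# The characters left behind by the standard alphabet need escaping in a URL.
print(quote(standard, safe=""))  # %2B%2B%2F%2F

# In base64url '-' is index 62, so a legacy id fed to a base64url decoder
# silently decodes to different bytes.
legacy_id = base64.b64encode(b"\xff\xff\xff").decode("ascii").replace("/", "-")
misdecoded = base64.urlsafe_b64decode(legacy_id)
print(misdecoded == b"\xff\xff\xff")  # False
```

The last check is the conflict from point 4: the legacy text is accepted
by a base64url decoder but yields the wrong data, so the two mappings
cannot coexist.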
In OpenStack we should adhere to standards when they exist and not
invent a non-standard incomplete solution. If we use the RFC-4648
compliant mechanism we can then also call standard Python libraries to
perform base64url encode/decode operations.
Note, base64url is also safe as a value in HTTP headers.
The cms.cms_to_token() and cms.token_to_cms() functions should be
re-implemented to produce token ids that use an RFC-defined base64
alphabet and can be used safely in HTTP contexts.
Since token lifetimes are quite short there shouldn't be backward
compatibility issues with previously issued tokens. A new token
utilizing the new token id format will be issued.
Note:
base64url continues to use the '=' pad character which is NOT URL
safe. RFC-4648 suggests two alternate methods to deal with this.
percent-encode
percent-encode the pad character (e.g. '=' becomes
'%3D'). This makes the base64url text fully safe. But
percent-encoding has the downside of requiring
percent-decoding prior to feeding the base64url text into a
base64url decoder since most base64url decoders do not
recognize %3D as a pad character and most decoders require
correct padding.
no-padding
padding is not strictly necessary to decode base64 or
base64url text; the pad can be computed from the input text
length. However, many decoders demand padding and will consider
non-padded text to be malformed. If one omits the trailing pad
character(s) for use in URLs, they can be added back before decoding.
For token id use we prefer to strip the padding rather than
percent-encode it. This makes the token id slightly shorter
and cleaner.
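
A minimal sketch of both options from the note above, assuming Python 3
and the standard library only. The helper names b64url_encode_nopad and
b64url_decode_nopad are hypothetical and are not the Keystone functions
named in this report.

```python
import base64
from urllib.parse import quote, unquote


def b64url_encode_nopad(data):
    """Encode bytes as base64url text with the trailing '=' pad stripped."""
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def b64url_decode_nopad(text):
    """Restore the pad computed from the text length, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


# no-padding option: one input byte normally needs two pad characters.
token = b64url_encode_nopad(b"\xff")
print(token)                       # _w
print(b64url_decode_nopad(token))  # b'\xff'

# percent-encode option: the pad survives but must be unquoted before decoding.
padded = base64.urlsafe_b64encode(b"\xff").decode("ascii")
escaped = quote(padded, safe="")
print(escaped)                                     # _w%3D%3D
print(base64.urlsafe_b64decode(unquote(escaped)))  # b'\xff'
```

The no-padding form is shorter by up to two characters and needs no
unquoting step, at the cost of recomputing the pad before calling a
decoder that requires it.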
</pre>
To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1233838/+subscriptions