
launchpad-dev team mailing list archive

Re: SOA object ids

 

On Tue, Dec 13, 2011 at 4:25 PM, Abel Deuring
<abel.deuring@xxxxxxxxxxxxx> wrote:
> On 13.12.2011 10:19, Abel Deuring wrote:
>> On 13.12.2011 07:26, Robert Collins wrote:
>>
>>> We could say 'utf8' and leave it at that. Or we could say 'the
>>> printable subset of ascii' or some such. I'd just say non whitespace
>>> utf8, as strings are easier to deal with, and avoiding whitespace
>>> avoids most likely encoding issues.
>>
>> No arbitrary utf8/utf16 or anything non-ASCII please, or we may have
>> funny things like the attached script.

Agreed. There are too many issues with utf8, and if anything actually
starts making use of that space things will explode. In particular,
languages handle normalization differently, so you end up dealing with
byte strings in any case, and the identifiers may not round trip. If
you expect the byte string you put in to match the byte string you get
back out, you would need to encode the string to ascii before using it
as a key in a database, as a filename, or pasting it into a document.
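
To illustrate the normalization problem, here is a minimal Python sketch. The sample strings and the helper name `utf8_bytes` are made up for illustration; only the standard library `unicodedata` module is used. It shows two strings that render identically but compare unequal and encode to different bytes until both are normalized to the same form.

```python
import unicodedata

# Two spellings of the same visible text (illustrative values only).
composed = "caf\u00e9"      # 'é' as a single code point (NFC form)
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent (NFD form)


def utf8_bytes(text):
    """Return the UTF-8 byte string for text (hypothetical helper)."""
    return text.encode("utf-8")


# The strings look the same but are different code point sequences.
print(composed == decomposed)    # False
print(utf8_bytes(composed))      # b'caf\xc3\xa9'
print(utf8_bytes(decomposed))    # b'cafe\xcc\x81'

# They only match once both sides agree on a normalization form.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

A system that silently normalizes on the way in or out (some filesystems do) would hand back a different byte string than the one stored, which is the round trip failure described above.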

They will also need to be human consumable, so I'd restrict each chunk
to [a-zA-Z0-9]+, delimited by a well known token, with an explicit
instance: "service-instance-type-id". (I chose '-' as the delimiter
because most systems treat the whole thing as 'a word', for example
when double clicking on it in Gnome terminal.) I'm also tempted to say
case insensitive or lowercase only.
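
A rough sketch of what checking that format could look like in Python follows. The names `OBJECT_ID_RE` and `parse_object_id` and the sample id are hypothetical, and folding to lowercase is only one reading of the "case insensitive or lowercase only" idea, not a decided rule.

```python
import re

# Each chunk is [a-zA-Z0-9]+ and exactly four chunks are joined by '-':
# service-instance-type-id.
CHUNK = r"[a-zA-Z0-9]+"
OBJECT_ID_RE = re.compile(rf"{CHUNK}(?:-{CHUNK}){{3}}")


def parse_object_id(value):
    """Split an id into (service, instance, type, id) chunks.

    Assumption: ids are case insensitive, so the chunks are folded to
    lowercase. Raises ValueError if the format does not match.
    """
    # fullmatch rejects whitespace, non-ASCII and extra or missing chunks.
    if not OBJECT_ID_RE.fullmatch(value):
        raise ValueError(f"not a valid object id: {value!r}")
    service, instance, type_name, ident = value.lower().split("-")
    return service, instance, type_name, ident


print(parse_object_id("Bugs-Production-Bug-12345"))
# ('bugs', 'production', 'bug', '12345')
```

Using `fullmatch` rather than a `$` anchor avoids accepting an id with a trailing newline.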

-- 
Stuart Bishop <stuart.bishop@xxxxxxxxxxxxx>

