
launchpad-dev team mailing list archive

Unicode and Launchpad

 

Dear hackers,

There have been a few incidents lately where the code has blown up with 
UnicodeDecodeError.  We were talking about this on the Team Leads weekly call 
yesterday, and I have been tasked with starting a thread to discuss how we can 
mitigate these errors in the future.

The things we discussed on the call were fairly simple:

 * Keep all strings as unicode internally (the exception being plain ASCII 
strings, which are coerced to unicode automatically).
 * Convert to/from unicode (e.g. a UTF-8 byte string or MIME) only at the 
point where the string *enters or exits* Launchpad.
 * Never use str(): given a unicode string it encodes with the default ASCII 
codec and fails on anything non-ASCII.
 * When a branch you are reviewing deals with strings, please check it 
against these rules.
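
To make the "decode on entry, encode on exit" rule concrete, here is a minimal sketch. The helper names (from_wire, to_wire, greeting) are made up for illustration and are not existing Launchpad functions, and UTF-8 is assumed as the wire encoding. It is written to run on both Python 2.7 and Python 3.

```python
def from_wire(raw, encoding="utf-8"):
    """Hypothetical entry-point helper: decode bytes into unicode text.

    Anything that is already text is passed through untouched.
    """
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


def to_wire(text, encoding="utf-8"):
    """Hypothetical exit-point helper: encode unicode text into bytes."""
    return text.encode(encoding)


def greeting(display_name):
    """Internal code: works on unicode only and never calls str()."""
    return u"Hello, " + display_name


# A UTF-8 byte string as it might arrive from a web form or an email.
incoming = b"Bj\xc3\xb6rn"

name = from_wire(incoming)    # decode once, where the data enters
message = greeting(name)      # unicode all the way through
outgoing = to_wire(message)   # encode once, where the data leaves

# Python 2 decodes byte strings with the ASCII codec when they are mixed
# with unicode. Doing that decode explicitly shows the resulting failure.
try:
    incoming.decode("ascii")
except UnicodeDecodeError:
    coercion_failed = True
else:
    coercion_failed = False
```

With conversion confined to the two boundary helpers, the rest of the code never has to guess whether it holds bytes or text, which is the situation where the implicit ASCII coercion tends to bite.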

I'm going to throw this open to you guys now to see if anyone else has 
anything valuable to contribute.  I'll then summarise everything into our 
coding guidelines after the discussion ends.

Cheers.

