On 2019-04-30 12:22, John Beard wrote:
Even with UTF-32, you can only do an O(1) lookup of the n'th *code point* or *code unit* (the same in UTF-32, not in UTF-8), not the n'th *encoded character*.
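To make the code point versus encoded character distinction concrete, here is a minimal sketch. The helper names are hypothetical, and the rule used (only the U+0300..U+036F combining marks attach to the preceding code point) is a deliberate simplification; real grapheme segmentation follows the Unicode rules in UAX #29 and would normally come from a library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true for code points in the Combining Diacritical
// Marks block (U+0300..U+036F). This is only a small subset of what a
// full grapheme-cluster implementation has to handle.
bool isCombiningMark( char32_t cp )
{
    return cp >= 0x0300 && cp <= 0x036F;
}

// Count user-perceived characters under the simplified rule that a
// combining mark belongs to the code point before it. This is O(n),
// even though indexing a single code point in UTF-32 is O(1).
std::size_t countPerceivedChars( const std::u32string& text )
{
    std::size_t count = 0;

    for( char32_t cp : text )
    {
        if( !isCombiningMark( cp ) )
            ++count;
    }

    return count;
}
```

For the sequence 'e', U+0301 (combining acute accent), 'a', the string holds three code points but only two perceived characters, so indexing element 1 lands on the accent rather than on the second character.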
+1 here. I'd be in favor of standardizing on a clean, standards-compliant string library for internal work, converting to wxString only for user interaction.
My main beef with UTF-16 (and UTF-32) is that they don't display as human readable files without a UTF-16/UTF-32 compatible viewer. All of our file formatting is ASCII with the exception of user-generated content. So, right now, I can use any text viewer to read the files. Using UTF-8 preserves this ability, but we'd lose it with u32string (unless we convert back for writing).
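As a rough illustration of why an ASCII viewer struggles with UTF-32, the sketch below serializes code points as little-endian bytes with no byte-order mark. The function name is hypothetical and this is not how any existing file writer in the code base works; it only shows the byte layout.

```cpp
#include <string>

// Hypothetical helper: write each UTF-32 code point as four
// little-endian bytes, without a byte-order mark.
std::string toUtf32LE( const std::u32string& text )
{
    std::string bytes;

    for( char32_t cp : text )
    {
        for( int shift = 0; shift < 32; shift += 8 )
            bytes.push_back( static_cast<char>( ( cp >> shift ) & 0xFF ) );
    }

    return bytes;
}
```

The two-character text "ab" becomes eight bytes, with three NUL bytes after each letter. The same text in UTF-8 is the two bytes 'a' and 'b', identical to plain ASCII.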
There are some other, minor issues including byte-order marking, corruption resynchronization and external library support that we'd need to think closely about if we wanted to change.
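On the resynchronization point, UTF-8 continuation bytes always have the bit pattern 10xxxxxx, so a reader that lands in the middle of a sequence can skip forward to the next sequence start. A minimal sketch, with hypothetical function names and no validation of the sequences themselves:

```cpp
#include <cstddef>
#include <string>

// True for UTF-8 continuation bytes, which always look like 10xxxxxx.
bool isContinuationByte( unsigned char b )
{
    return ( b & 0xC0 ) == 0x80;
}

// Starting at pos, return the index of the first byte that is not a
// continuation byte, i.e. where the next encoded sequence begins.
// Returns bytes.size() if no such byte is found.
std::size_t nextSequenceStart( const std::string& bytes, std::size_t pos )
{
    while( pos < bytes.size()
           && isContinuationByte( static_cast<unsigned char>( bytes[pos] ) ) )
    {
        ++pos;
    }

    return pos;
}
```

UTF-32 has no equivalent marker: if a reader is off by one byte, every following code unit is misread until the offset is corrected by some other means.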
PS / OT: If we had to optimise one thing, PolygonTriangulation::Vertex::inTriangle is the single hungriest function, chewing 6.19% of all CPU time, double that of each of the next 3: __gnu_cxx::__exchange_and_add (2.76%), PolygonTriangulation::isEar (2.73%) and even malloc (2.27%).
FYI, I am currently working on modifying the triangulation. -Seth