
kicad-developers team mailing list archive

Re: 6.0 string proposal

 

Hi Dick,

>> h) What is the list of deficiencies with current string usage?

I only have one issue with the current use of wxString, but it’s a big one: it crashes (unpredictably) when used from multiple threads in UTF8 mode.

This design document makes for fascinating reading: https://wiki.wxwidgets.org/Development:_UTF-8_Support.  It appears that the current wxString is at least in part modelled on Qt’s QString.

There’s also a bunch of interesting info here: https://docs.wxwidgets.org/trunk/overview_string.html, which I believe is more up-to-date than the previous link.  In particular, it mentions that wxString handles extra-BMP characters transparently when compiled in UTF8 mode (currently used by KiCad), but does NOT when compiled in default mode (in which case the app must handle surrogate pairs itself).  This of course directly leads to your point (d):

>>> d) What does the set of characters that don't fall into UCS2 actually look like?  How big
>>> is this set, really?  (UTF16 is bigger than UCS2 and picks up the difference.)

Do we really need to handle extra-BMP characters?
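
For scale: Unicode has 17 planes of 65,536 code points each, so everything outside the BMP (U+10000 to U+10FFFF) is 16 planes, or 1,048,576 code points. That range holds emoji, the rarer CJK ideographs, historic scripts, and musical and mathematical symbols. To show what "handling surrogate pairs" means in practice, here is a minimal sketch using only standard C++ types. The function names to_utf16 and count_code_points are made up for illustration and are not part of wxWidgets or KiCad.

```cpp
#include <cstddef>
#include <string>

// Encode one Unicode code point as UTF-16 code units.
// Code points above U+FFFF do not fit in one 16-bit unit and
// are split into a high/low surrogate pair.
std::u16string to_utf16(char32_t cp)
{
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        cp -= 0x10000;  // 20 significant bits remain
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));  // low surrogate
    }
    return out;
}

// Count characters (code points) rather than 16-bit units.
// A high surrogate followed by a low surrogate counts once.
std::size_t count_code_points(const std::u16string& s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i, ++n) {
        const bool high = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        if (high && i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF)
            ++i;  // skip the low half of the pair
    }
    return n;
}
```

Code that indexes, truncates or measures a UTF-16 string by unit (s.size(), s[i]) gets the wrong answer for these characters unless it does this kind of pairing everywhere.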

An even more recent version of the second document (https://docs.wxwidgets.org/trunk/classwx_string.html) finally makes an oblique reference to the multi-threading issue by starting with this (rather unhelpful) suggestion:
Note: While the use of wxString is unavoidable in wxWidgets program, you are encouraged to use the standard string classes std::string or std::wstring in your applications and convert them to and from wxString only when interacting with wxWidgets.
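
To make the "convert only at the boundary" idea concrete, here is a rough sketch with no wxWidgets dependency: UTF-8 bytes are decoded once on the way in, the application works on whole code points, and UTF-8 is produced again on the way out. The names decode_utf8 and encode_utf8 are hypothetical, and the choice of std::u32string is my assumption for the sketch (wchar_t is 16 bits on Windows and 32 bits on Linux and macOS, so std::wstring behaves differently per platform). In a real wx build the boundary calls would presumably be wxString::FromUTF8() and wxString::utf8_str() instead.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes into one char32_t per character.
// Minimal validation only: bad lead bytes, truncated input and bad
// continuation bytes throw; overlong forms are NOT rejected.
std::u32string decode_utf8(const std::string& in)
{
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        else throw std::invalid_argument("bad UTF-8 lead byte");
        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("bad UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Encode code points back to UTF-8 (assumes each value is <= U+10FFFF).
std::string encode_utf8(const std::u32string& in)
{
    std::string out;
    for (char32_t cp : in) {
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With a fixed-width internal form, indexing and length are per character, and the strings themselves carry no toolkit-specific shared state.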

Cheers,
Jeff.


> On 3 May 2019, at 02:03, Dick Hollenbeck <dick@xxxxxxxxxxx> wrote:
> 
> On 5/2/19 5:32 PM, Dick Hollenbeck wrote:
>> On 4/30/19 4:36 AM, Jeff Young wrote:
>>> We had talked earlier about throwing the wxWidgets UTF8 compile switch to get rid of our wxString re-entrancy problems.  However, I noticed that the 6.0 work packages doc includes an item for std::string-ization of the BOARD.  (While a lot more work, this is a better solution because it also increases our gui-toolkit-choice flexibility.)
>>> 
>>> I’d like to propose that we use std::wstring for that.  UTF8 should *only* be an encoding format (similar to s-expr).  It should never be used internally. That’s what unicode wchar_t’s are for.
>>> 
>>> And I’d like to propose that we extend std::wstring-ization to SCH_ITEM and LIB_ITEM.  (Then we can get rid of a bunch of our ugly mutex hacks.)
>> 
>> 
>> I've been looking at this for a few months now.  I think it is so important, that a
>> sub-committee should be formed, and if that committee takes as long as 4 months to come to
>> a recommendation, this would not be too long.  This issue is simply too critical.
>> 
>> I would like to volunteer to be on that committee.  For the entire list to participate in
>> this simply does not make sense to me.  I would welcome the opportunity to study this with
>> a team of 5-6 players.  More than that probably leads to anxiety.  Then, given the
>> recommendations, the list would of course have an opportunity to raise questions and take
>> shots, before a strategy is formulated, and before anything is implemented.
>> 
>> Again, approximately:
>> 
>>  committee recommendations -> list approval -> strategy formulation -> implementation
>> 
>> 
>> Up to now I have looked at many libraries and have [way *too* much] experience in multiple
>> languages on multiple platforms, so I think I can be a valuable contributor.
>> 
>> The final work product initially would simply be a list of recommendations, that quickly
>> transforms to a strategy thereafter.  This is an enormous undertaking, so I suggest
>> against racing to a solution.  It could look a lot easier than it will ultimately be, as
>> is typical in software development.  But the return on investment needs to be near optimal
>> in the end.
>> 
>> Some questions to answer are:
>> 
>> a) How did wxString get to its current state?  Is it merely a conglomeration of
>> afterthoughts, or is it anywhere near optimal?
>> 
>> b) Why so many forms of it?  Can one form be chosen for all platforms?
>> 
>> c) How does wxString compare to Qt's QString?
>> 
>> d) What does the set of characters that don't fall into UCS2 actually look like?  How big
>> is this set, really?  (UTF16 is bigger than UCS2 and picks up the difference.)
>> 
>> e) For data files, I think UTF8 is fine.  So the change is for RAM manipulation of
>> strings.  Aren't we talking about a RAM resident string that bridges into the GUI seamlessly?
>> 
>> f) What does new C++ language support offer?
>> 
>> g) What do C++ language designers suggest?
> 
> h) What is the list of deficiencies with current string usage?
> 
> 
>> 
>> 
>> etc.
>> 
>> But this is best continued in a smaller group, as said.
>> 
>> 
>> The other thing that I bring to this is vast familiarity with KiCad's internal workings,
>> string use cases, and goals.
>> 
>> Let me know if I can help.
>> 
>> Regards,
>> 
>> Dick
>> 
>> 
>> _______________________________________________
>> Mailing list: https://launchpad.net/~kicad-developers
>> Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~kicad-developers
>> More help   : https://help.launchpad.net/ListHelp
>> 
