kicad-developers team mailing list archive

Re: 6.0 string proposal

 

On 5/3/19 5:22 AM, Jeff Young wrote:
> Hi Dick,
> 
>>> h) What is the list of deficiencies with current string usage?
> 
> I only have one issue with the current use of wxString, but it’s a big
> one: it crashes (unpredictably) when used multi-threaded in UTF8 mode.

I thought it was wxString itself that was not thread safe, not
necessarily the UTF-8 build.  Either way, thread safety is the primary
goal now that we are using threads in multiple places within KiCad.

On my Debian system wx/setup.h shows

#define wxUSE_UNICODE 1

and

#define wxUSE_UNICODE_UTF8 0

so it would appear that wxString is built in Unicode mode, not UTF-8
mode, on Linux.  I'm also pretty sure Windows builds are Unicode as well.
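
For anyone who wants to check their own build, something like the
following would report the mode at compile time.  This is only a sketch:
wxStringBuildMode() is a made-up name, and the #ifndef fallbacks are
stand-ins that mirror the Debian values above so the snippet compiles
without wxWidgets.  A real check would include the wxWidgets headers
first and let them define the macros.

```cpp
#include <cassert>
#include <string>

// In a real build, include the wxWidgets headers (for example wx/setup.h,
// as quoted above) before this point so they supply the macros.
// The fallbacks below are stand-ins (assumption) mirroring the Debian
// values, so this sketch compiles without wxWidgets installed.
#ifndef wxUSE_UNICODE
#define wxUSE_UNICODE 1
#endif
#ifndef wxUSE_UNICODE_UTF8
#define wxUSE_UNICODE_UTF8 0
#endif

// Hypothetical helper: name the wxString build mode chosen at compile time.
std::string wxStringBuildMode()
{
#if wxUSE_UNICODE_UTF8
    return "utf8";
#elif wxUSE_UNICODE
    return "unicode";
#else
    return "ansi";
#endif
}
```

With the stand-in values, assert( wxStringBuildMode() == "unicode" )
holds, which matches what the Debian setup.h suggests.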

There is a secondary goal of removing wxWidgets from our low-level
objects.  Maybe some day we can build the low-level, non-UI KiCad
libraries without wxWidgets.  My thinking is that wxString should only
come into play at the UI level when dealing with wxWidgets UI code.
Being able to use a standard C++ string implementation would (may?) go
a long way toward that goal.
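
A rough sketch of the layering I have in mind, not a proposal for actual
class names: TEXT_ITEM, UI_STRING, LabelForDisplay() and ApplyUserEdit()
are all hypothetical.  UI_STRING is a stand-in for wxString so that the
snippet builds without wxWidgets; in real UI code it would be wxString,
whose FromUTF8() and ToUTF8() the stand-in loosely imitates.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical low-level object: keeps its text as UTF-8 in a std::string
// and needs no wxWidgets headers.
class TEXT_ITEM
{
public:
    explicit TEXT_ITEM( std::string aUtf8 ) : m_text( std::move( aUtf8 ) ) {}

    const std::string& GetText() const { return m_text; }
    void SetText( std::string aUtf8 ) { m_text = std::move( aUtf8 ); }

private:
    std::string m_text;
};

// Stand-in for wxString (assumption), so the sketch compiles on its own.
struct UI_STRING
{
    std::string utf8;

    static UI_STRING FromUTF8( const std::string& aText ) { return UI_STRING{ aText }; }
    std::string ToUTF8() const { return utf8; }
};

// UI-side helpers: the only place where the toolkit string type appears.
UI_STRING LabelForDisplay( const TEXT_ITEM& aItem )
{
    return UI_STRING::FromUTF8( aItem.GetText() );
}

void ApplyUserEdit( TEXT_ITEM& aItem, const UI_STRING& aEdited )
{
    aItem.SetText( aEdited.ToUTF8() );
}
```

The point is only that conversion happens at the two helper functions,
so the low-level object never sees the toolkit's string type.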

> 
> This design document makes for fascinating
> reading: https://wiki.wxwidgets.org/Development:_UTF-8_Support.  It
> appears that the current wxString is at least in part modelled on Qt's QString.
> 
> There’s also a bunch of interesting info
> here: https://docs.wxwidgets.org/trunk/overview_string.html, which I
> believe is more up-to-date than the previous link.  In particular,
> there’s the mention that wxString handles extra-BMP characters
> transparently when compiled in UTF8 mode (currently used by KiCad), but
> does NOT when compiled in default mode (in which case the app must
> handle surrogate pairs).  This of course directly leads to your point (d):
> 
>>>> d) What does the set of characters that don't fall into UCS2
>>>> actually look like?  How big is this set, really?  (UTF16 is
>>>> bigger than UCS2 and picks up the difference.)
> 
> Do we really need to handle extra-BMP characters?
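
To make the surrogate pair point concrete, here is a small sketch using
std::u16string as a stand-in for a string of 16-bit code units.  The
function names are made up for illustration.  A character above U+FFFF
occupies two code units in UTF-16, so the length in code units and the
number of characters disagree unless the code accounts for the pair.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true for code points outside the Basic
// Multilingual Plane, which UCS2 cannot represent.
bool IsOutsideBmp( char32_t aCodePoint )
{
    return aCodePoint > 0xFFFF;
}

// Hypothetical helper: count code points in a UTF-16 string by treating a
// high surrogate (0xD800-0xDBFF) followed by a low surrogate
// (0xDC00-0xDFFF) as a single character.
std::size_t CountCodePoints( const std::u16string& aText )
{
    std::size_t count = 0;

    for( std::size_t i = 0; i < aText.size(); ++i )
    {
        const char16_t unit = aText[i];
        const bool     isHigh = unit >= 0xD800 && unit <= 0xDBFF;

        if( isHigh && i + 1 < aText.size()
            && aText[i + 1] >= 0xDC00 && aText[i + 1] <= 0xDFFF )
        {
            ++i;    // skip the low surrogate; the pair is one character
        }

        ++count;
    }

    return count;
}
```

For example, std::u16string s = u"A\U0001F600" has s.size() == 3, while
assert( CountCodePoints( s ) == 2 ) holds.  The Greek capital omega
(U+03A9) is inside the BMP, so it never needs a pair.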
> 
> An even more recent version of the second document
> (https://docs.wxwidgets.org/trunk/classwx_string.html) finally makes an
> oblique reference to the multi-threading issue by starting with this
> (rather unhelpful) suggestion:
> 
> Note
>     While the use of wxString is unavoidable in wxWidgets program, you
>     are encouraged to use the standard string classes std::string or
>     std::wstring in your applications and convert them to and from
>     wxString only when interacting with wxWidgets.
> 
> 
> Cheers,
> Jeff.
> 
> 
>> On 3 May 2019, at 02:03, Dick Hollenbeck <dick@xxxxxxxxxxx> wrote:
>>
>> On 5/2/19 5:32 PM, Dick Hollenbeck wrote:
>>> On 4/30/19 4:36 AM, Jeff Young wrote:
>>>> We had talked earlier about throwing the wxWidgets UTF8 compile
>>>> switch to get rid of our wxString re-entrancy problems.  However, I
>>>> noticed that the 6.0 work packages doc includes an item for
>>>> std::string-ization of the BOARD.  (While a lot more work, this is a
>>>> better solution because it also increases our gui-toolkit-choice
>>>> flexibility.)
>>>>
>>>> I’d like to propose that we use std::wstring for that.  UTF8 should
>>>> *only* be an encoding format (similar to s-expr).  It should never
>>>> be used internally. That’s what unicode wchar_t’s are for.
>>>>
>>>> And I’d like to propose that we extend std::wstring-ization to
>>>> SCH_ITEM and LIB_ITEM.  (Then we can get rid of a bunch of our ugly
>>>> mutex hacks.)
>>>
>>>
>>> I've been looking at this for a few months now.  I think it is so
>>> important that a sub-committee should be formed, and if that
>>> committee takes as long as 4 months to come to a recommendation,
>>> this would not be too long.  This issue is simply too critical.
>>>
>>> I would like to volunteer to be on that committee.  For the entire
>>> list to participate in this simply does not make sense to me.  I
>>> would welcome the opportunity to study this with a team of 5-6
>>> players.  More than that probably leads to anxiety.  Then, given
>>> the recommendations, the list would of course have an opportunity
>>> to raise questions and take shots, before a strategy is formulated,
>>> and before anything is implemented.
>>>
>>> Again, approximately:
>>>
>>>  committee recommendations -> list approval -> strategy formulation
>>> -> implementation
>>>
>>>
>>> Up to now I have looked at many libraries and have [way *too* much]
>>> experience in multiple languages on multiple platforms, so I think
>>> I can be a valuable contributor.
>>>
>>> The final work product initially would simply be a list of
>>> recommendations that quickly transforms into a strategy thereafter.
>>> This is an enormous undertaking, so I advise against racing to a
>>> solution.  It could look a lot easier than it will ultimately be,
>>> as is typical in software development.  But the return on
>>> investment needs to be near optimal in the end.
>>>
>>> Some questions to answer are:
>>>
>>> a) How did wxString get to its current state?  Is it merely a
>>> conglomeration of afterthoughts, or is it anywhere near optimal?
>>>
>>> b) Why so many forms of it?  Can one form be chosen for all platforms?
>>>
>>> c) How does wxString compare to Qt's QString?
>>>
>>> d) What does the set of characters that don't fall into UCS2
>>> actually look like?  How big is this set, really?  (UTF16 is
>>> bigger than UCS2 and picks up the difference.)
>>>
>>> e) For data files, I think UTF8 is fine.  So the change is for RAM
>>> manipulation of strings.  Aren't we talking about a RAM resident
>>> string that bridges into the GUI seamlessly?
>>>
>>> f) What does new C++ language support offer?
>>>
>>> g) What do C++ language designers suggest?
>>
>> h) What is the list of deficiencies with current string usage?
>>
>>
>>>
>>>
>>> etc.
>>>
>>> But this is best continued in a smaller group, as said.
>>>
>>>
>>> The other thing that I bring to this is vast familiarity with
>>> KiCad's internal workings, string use cases, and goals.
>>>
>>> Let me know if I can help.
>>>
>>> Regards,
>>>
>>> Dick
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~kicad-developers
>>> Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~kicad-developers
>>> More help   : https://help.launchpad.net/ListHelp

