kicad-developers team mailing list archive
Re: 6.0 string proposal
String access is a factor in the performance of the new real-time
connectivity algorithm in eeschema, since all connectivity is established
by parsing labels and pin names. I have not done benchmarks comparing
various options for string storage, but we would need to watch that space
too if we change how strings work.
On Tue, Apr 30, 2019 at 8:41 PM John Beard <john.j.beard@xxxxxxxxx> wrote:
> On 30/04/2019 16:01, Jeff Young wrote:
> > Primarily for performance reasons.
> WRT performance, I did a few benchmarks for reference (on Linux).
> Loading this large CIAA PCB allocates, out of a peak usage of 467MB
> of heap with a 0.01% threshold:
> * 9.6MB of std::basic_string<wchar_t>::_M_assign
> * 9.4MB of this is from wxString operator= assignments
> * ~600kB of std::basic_string<wchar_t>::_M_construct, (wxString ctor)
> So I'm not sure memory usage is a major factor to worry about (strings
> allocate storage on the heap, so we should see basically all the
> interesting things in the heap profile). UTF-8 could be as little as 1/4
> the size of UTF-32 (if all strings are ASCII), but even then, it's a few MB saved.
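The 1/4 figure and the "few MB" estimate above can be checked with a little arithmetic. The following is only a rough sketch: the function names are hypothetical, it counts character payload only (ignoring allocator and capacity overhead), and it assumes the best case where every string is pure ASCII.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Payload bytes for a string held as UTF-8 (one byte per code unit).
std::size_t payloadBytesUtf8( const std::string& aText )
{
    return aText.size() * sizeof( char );
}

// Payload bytes for the same text held as UTF-32 (one char32_t per code point).
std::size_t payloadBytesUtf32( const std::u32string& aText )
{
    return aText.size() * sizeof( char32_t );
}

// Best-case saving when moving an all-ASCII UTF-32 heap footprint to UTF-8:
// UTF-8 needs 1/4 of the space, so at most 3/4 of the footprint is saved.
// (Hypothetical helper for illustration, not a KiCad function.)
double bestCaseSavingsMB( double aUtf32FootprintMB )
{
    assert( aUtf32FootprintMB >= 0.0 );
    return aUtf32FootprintMB * 0.75;
}
```

Under these assumptions, the 9.6MB of wchar_t string assignments quoted above would shrink by at most about 7.2MB, out of a 467MB peak.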
> Now, in terms of performance, opening Pcbnew with no file gives:
> #4 3.36% __gconv_transform_utf8_internal
> #5 2.51% __mbsrtowcs_l
> #6 2.50% wxMBConv::ToWChar
> #8 2.07% std::basic_string<wchar_t>::_M_assign
> #9 1.88% wxMBConvStrictUTF8::ToWChar
> #14 1.27% EscapeString (kicad function)
> #17 0.85% __GI___strlen_sse2
> #18 0.85% wxUniChar::From8bit
> #19 0.84% wxUniChar::operator==
> And plenty more string-y things in the top 50 or so lines. So it seems
> the biggest cost for strings is converting them from UTF-8 to wchar_t
> strings in WX (this is probably not the same on Windows). But it's not
> really a stunning cost.
> However, when loading the CIAA board, there are basically no string
> operations above 0.5%, and only a handful even above 0.25%. When doing
> DRC, strings don't break 0.1%: nearly all the significant work is
> looking things up in std::maps and geometry.
> So string performance doesn't seem to be *that* critical, as it's
> quickly drowned out under real workloads. It looks to me (and I'm happy
> to be corrected, I'm not a perf expert), like string operations in KiCad
> are not much of a bottleneck.
> > Because characters are different lengths, you have to scan the string
> > to find the n’th character.
> Even with UTF-32, you can only do an O(1) lookup of the n'th *code
> point* or *code unit* (the same in UTF-32, not in UTF-8), not the n'th
> *encoded character*.
> That's true even if you normalise the strings first. Not all code points
> map one-to-one to an encoded character (it can be one-to-none,
> one-to-one, many-to-one). And that's even without considering grapheme clusters.
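To illustrate the code unit / code point / character distinction made above, here is a small sketch using only standard C++ strings (not wxString). The helper name and the sample text are assumptions chosen for illustration: an "e" followed by U+0301 COMBINING ACUTE ACCENT, which a user sees as a single accented character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count code points in a UTF-8 string by skipping continuation bytes
// (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t countCodePointsUtf8( const std::string& aUtf8 )
{
    std::size_t count = 0;

    for( unsigned char byte : aUtf8 )
    {
        if( ( byte & 0xC0 ) != 0x80 )
            ++count;
    }

    return count;
}

// "e" + U+0301: one character on screen, but two code points.
// In UTF-8 this is 3 code units (bytes): 0x65, 0xCC, 0x81.
const std::string decomposedUtf8 = "e\xCC\x81";

// In UTF-32 the same text is 2 code units, one per code point.
const std::u32string decomposedUtf32 = U"e\u0301";
```

Indexing decomposedUtf32[1] is O(1), but it yields the combining accent on its own, not "the second character". Finding character boundaries needs a scan in either encoding.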
> PS / OT: If we had to optimise one thing,
> PolygonTriangulation::Vertex::inTriangle is the single hungriest
> function, chewing 6.19% of all CPU time, double that of each of the next
> 3: __gnu_cxx::__exchange_and_add (2.76%), PolygonTriangulation::isEar
> (2.73%) and even malloc (2.27%).
> Other than that fairly mundane 6%-er, there are no eye-popping
> performance hogs simply on loading a PCB. Which is nice.
> Mailing list: https://launchpad.net/~kicad-developers
> Post to : kicad-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~kicad-developers
> More help : https://help.launchpad.net/ListHelp