kicad-developers team mailing list archive
-
kicad-developers team
-
Mailing list archive
-
Message #10683
*.kicad_pcb file loading speedup success
For months I have been wanting to experiment with DSNLEXER::findToken() and see if there
were faster techniques than the binary search of C strings that I initially used. I had a
hunch that a hashtable might be faster.
After bringing in boost unordered_map< std::string, int > and instrumenting files.cpp to
tell how many microseconds a PLUGIN::Load() would take, the results were surprising to me.
The moral of this story is never guess what you can measure, unless you are willing to be
wrong. I was able to measure before and after results.
The hashtable unordered_map< std::string ..> was slower than the bsearch(). Then I
swapped out the hash function with a custom one. This was nearly identical in results.
About the time I was ready then to yank out the experimental code, i.e. give up, I said
"well why not try one last thing". How about a tricked out hashtable that uses C strings
rather than std::string, and a hashfunction that works directly on C string and an
equality test that distills down to C's strcmp()?
The result was again surprising. I bagged a 13% reduction/speedup in overall *.kicad_pcb
file loading times on larger files. Note that this was a specialized hashtable that I
basically invented, you will probably not find this in books. Because if you pass "const
char*" pointer to a boost hashtable, you basically get const char* (32 or 64 bits) being
treated as a large integer, and no de-referencing is made to the actual C string. Of
course the reason is that no storage is provided for the C string within the hashtable.
Yours truly was rewarded with a little experimentation and out of the box thinking.
I cannot image how much faster this technique is relative to DSNLEXER::findToken()
considered alone. What I am reporting is the effect on all of board loading time, so you
know that findToken() has to be enormously much faster now than the 13% effect on over all
load time.
Regards,
Dick
P.S.
The bad news is that *.kicad_pcb file loading is still slower than legacy. And it will
probably always be so. But we are now within ball park, so don't let it bother you, it
won't bother me. I gave it my best, and don't think there are any more low hanging fruit
to reap here.
A certain portion of this gain, say about 20% of the total gain, was due to giving up case
independent compare. Now your keywords must match, you don't get case independence on
token matching. (I forgot that was even in there, and I don't think any s-expression
files became dependent on upper case keywords, since we complain in CMake about them in
the grammar files.)
If it breaks anything, yell.