
kicad-developers team mailing list archive

Re: RICHIO performance - 3 to 30 times slower than std::ifstream

 

Hi Wayne,

I added some new profiles for the INPUTSTREAM_LINE_READER.

The results are very surprising to me. In debug and release mode,
using INPUTSTREAM_LINE_READER with a wxFileInputStream is around 200
times (:-O) slower than a straight std::ifstream, taking over two
seconds to read a 6.5MB short-lined file that std::ifstream can do in
<10ms.

I wonder if there's something I've missed here, as I can't believe
it's truly that slow.

I've pushed the benchmark to Launchpad for those who are interested:

https://code.launchpad.net/~john-j-beard/kicad/+git/kicad/+ref/io_benchmark

As for your note about having a generic stream version, yes, that's
more flexible and we should aim for that, if we were to provide a
std::istream LINE_READER. I just did ifstream as a test to keep things
clear and ensure a sensible comparison.

As I said, the current performance is "OK", and if we want to limit
line lengths, we probably can't get that for free, anyway.

I understand the desire to not read infinite lines, but at least in my
tests, the std::ifstream method, which has no limit for that, can deal
with a 1GB file of a single line in about 300ms. Obviously it's all in
disk cache, and you have to pay the allocation for it when reading
into the buffer.

All the existing LINE_READERs explode with IO_ERROR on that file, since
the line is too long for them.

Cheers,

John

On Fri, Feb 17, 2017 at 5:56 AM, Wayne Stambaugh <stambaughw@xxxxxxxxx> wrote:
> John,
>
> It would have been nice if you would have benchmarked wxFileInputStream
> as well.  There already is an INPUTSTREAM_LINE_READER object which takes
> a pointer to wxInputStream object.  I'm curious how it stacks up against
> the std::ifstream.  There are some interesting wxInputStream objects
> that could prove useful.
>
> I think ifstream wasn't used in case there are really long lines which
> there can be if you have text objects with lots of long multiple line
> strings in your files.  I'm ok with adding a LINE_READER that wraps
> istream objects.  It's fairly trivial to change LINE_READER types.  It
> might be a bit more flexible if you just provided an ISTREAM_LINE_READER
> that takes any istream derived object rather than write a separate
> LINE_READER for each istream derivative.
>
> Cheers,
>
> Wayne
>
> On 2/16/2017 8:43 AM, John Beard wrote:
>> Hi,
>>
>> I was trying to profile the eeschema slow library loads, and I got a
>> bit distracted by RICHIO's FILE_LINE_READER.
>>
>> Internally, it uses a very tight loop of reading single chars at a
>> time from a file descriptor, which looks inefficient. I wrote a
>> benchmarker to compare RICHIO against std::ifstream and a new
>> LINE_READER implementation, backed by std::ifstream. operf confirms
>> that most of the time in RICHIO is burned in the ReadLine() function
>> itself.
>>
>> The results were that RICHIO (in debug mode) is consistently 4-7 times
>> slower than using std::ifstream, when reading eeschema library text
>> files (so relatively short lines). Compiling the release version
>> improved RICHIO's speed more than std::ifstream's, but it is still
>> around 3 times slower than std::ifstream.
>>
>> For files with 1k lines, the slowdown is about 30 times (!) in debug
>> and 14 times in release mode, so significantly worse. Few files read
>> line-wise by KiCad look like that, however.
>>
>> Avoiding reconstructing the stream/LINE_READER each time doesn't have
>> much of an effect in any case.
>>
>> Is there a particular reason why STL streams are not used in RICHIO?
>> The only thing I think the example ifstream implementation can't do is
>> catching overlong lines, but that's only used in one place: the VRML
>> parser, which hardcodes an 8MB limit. ifstream could do this, but not
>> with the simple getline function.
>>
>> This performance doesn't appear to be a major bottleneck for me, but
>> it does seem a shame to throw away (charitably) two thirds of file
>> read speeds (and uncharitably, up to 97% in odd cases) if there is no
>> particular reason to do so.
>>
>> As an aside, RICHIO appears to allocate twice as many times as
>> std::ifstream when reading the same data, for roughly the same
>> amount of memory in total.
>>
>> Anyway, I thought I'd share this finding! Please find attached the
>> benchmark program, such as it is.
>>
>> Cheers,
>>
>> John
>>
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~kicad-developers
>> Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~kicad-developers
>> More help   : https://help.launchpad.net/ListHelp
>>
>

