kicad-developers team mailing list archive
-
kicad-developers team
-
Mailing list archive
-
Message #27899
RICHIO performance - 3 to 30 times slower than std::ifstream
Hi,
I was trying to profile the eeschema slow library loads, and I got a
bit distracted by RICHIO's FILE_LINE_READER.
Internally, it uses a very tight loop of reading single chars at a
time from a file descriptor, which looks inefficient. I wrote a
benchmarker to compare RICHIO against std::ifstream and a new
LINE_READER implementation, backed by std::ifstream. operf confirms
that most of the time in RICHIO burned in the ReadLine() function
itself.
The results were that RICHIO (in debug mode) is consistently 4-7 times
slower than using std::ifstream, when reading eeschema library text
files (so relatively short lines). Compiling the release version
improved RICHIOs speed more than std::ifstream's, but it is still
around 3 times slower than std::ifstream.
For files with 1k lines, the slowdown is about 30 times (!) in debug
and 14 times in release mode, so significantly worse. Few files read
line-wise by Kicad look like that, however.
Avoiding reconstructing the stream/LINE_READER each time doesn't have
much of an effect in any case.
Is there a particular reason why STL streams are not used in RICHIO?
The only thing I think the example ifstream implementation can't do is
catching over long lines, but that's only used in one place: the VRML
parser, which hardcodes an 8MB limit. ifstream could do this, but not
with the simple getline function.
This performance doesn't appear to be a major bottleneck for me, but
it does seem a shame to throw away (charitably) two thirds of file
read speeds (and uncharitably, up to 97% in odd cases) if there is no
particular reason to do so.
As an aside, RICHIO appears to allocate twice as many times as
std::ifstream when reading the same data, for the roughly the same
amount of memory in total.
Anyway, I thought I'd share this finding! Please find attached the
benchmark program, such as it is.
Cheers,
John
From 9030890b9ad86ff2601a0488ef6e0003dfbb2e66 Mon Sep 17 00:00:00 2001
From: John Beard <john.j.beard@xxxxxxxxx>
Date: Wed, 15 Feb 2017 16:47:41 +0800
Subject: [PATCH] IO benchmark program: RICHIO vs std::fstream
Also vs fstream-backed LINE_READER.
---
qa/CMakeLists.txt | 2 +
qa/io_benchmark/CMakeLists.txt | 17 ++++
qa/io_benchmark/fstream_io.cpp | 77 +++++++++++++++
qa/io_benchmark/fstream_io.h | 55 +++++++++++
qa/io_benchmark/io_benchmark.cpp | 199 +++++++++++++++++++++++++++++++++++++++
5 files changed, 350 insertions(+)
create mode 100644 qa/io_benchmark/CMakeLists.txt
create mode 100644 qa/io_benchmark/fstream_io.cpp
create mode 100644 qa/io_benchmark/fstream_io.h
create mode 100644 qa/io_benchmark/io_benchmark.cpp
diff --git a/qa/CMakeLists.txt b/qa/CMakeLists.txt
index bd579a597..a56f15f51 100644
--- a/qa/CMakeLists.txt
+++ b/qa/CMakeLists.txt
@@ -9,3 +9,5 @@ if( KICAD_SCRIPTING_MODULES )
)
endif()
+
+add_subdirectory( io_benchmark )
diff --git a/qa/io_benchmark/CMakeLists.txt b/qa/io_benchmark/CMakeLists.txt
new file mode 100644
index 000000000..fde89c9d2
--- /dev/null
+++ b/qa/io_benchmark/CMakeLists.txt
@@ -0,0 +1,17 @@
+
+include_directories( BEFORE ${INC_BEFORE} )
+
+set( IOBENCHMARK_SRCS
+ io_benchmark.cpp
+ fstream_io.cpp
+)
+
+add_executable( io_benchmark
+ ${IOBENCHMARK_SRCS}
+)
+
+target_link_libraries( io_benchmark
+ common
+ ${wxWidgets_LIBRARIES}
+)
+
diff --git a/qa/io_benchmark/fstream_io.cpp b/qa/io_benchmark/fstream_io.cpp
new file mode 100644
index 000000000..420f57348
--- /dev/null
+++ b/qa/io_benchmark/fstream_io.cpp
@@ -0,0 +1,77 @@
+/*
+ * This program source code file is part of KiCad, a free EDA CAD application.
+ *
+ * Copyright (C) 1992-2017 KiCad Developers, see AUTHORS.txt for contributors.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you may find one here:
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.html
+ * or you may search the http://www.gnu.org website for the version 2 license,
+ * or you may write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+#include "fstream_io.h"
+
+#include <ios>
+
+FSTREAM_LINE_READER::FSTREAM_LINE_READER( const wxString& aFileName,
+ unsigned aStartingLineNumber,
+ unsigned aMaxLineLength ) throw( IO_ERROR ) :
+ LINE_READER( aMaxLineLength ),
+ m_stream( aFileName )
+{
+ line = &m_buffer[0];
+
+ if( !m_stream.is_open() )
+ {
+ wxString msg = wxString::Format(
+ _( "Unable to open filename '%s' for reading" ), aFileName.GetData() );
+ THROW_IO_ERROR( msg );
+ }
+
+ source = aFileName;
+ lineNum = aStartingLineNumber;
+}
+
+FSTREAM_LINE_READER::~FSTREAM_LINE_READER()
+{
+ // this is only a view into a a string, it cant be deleted by the base
+ line = nullptr;
+}
+
+
+char* FSTREAM_LINE_READER::ReadLine() throw( IO_ERROR )
+{
+ if( getline( m_stream, m_buffer ) )
+ {
+ length = m_buffer.size();
+ line = (char*) m_buffer.data(); //ew why no const??
+ }
+ else
+ {
+ length = 0;
+ }
+
+ // lineNum is incremented even if there was no line read, because this
+ // leads to better error reporting when we hit an end of file.
+ ++lineNum;
+
+ return length ? line : nullptr;
+}
+
+void FSTREAM_LINE_READER::Rewind()
+{
+ m_stream.clear() ;
+ m_stream.seekg(0, std::ios::beg );
+}
diff --git a/qa/io_benchmark/fstream_io.h b/qa/io_benchmark/fstream_io.h
new file mode 100644
index 000000000..dbb5d7cab
--- /dev/null
+++ b/qa/io_benchmark/fstream_io.h
@@ -0,0 +1,55 @@
+/*
+ * This program source code file is part of KiCad, a free EDA CAD application.
+ *
+ * Copyright (C) 2017 KiCad Developers, see AUTHORS.txt for contributors.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you may find one here:
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.html
+ * or you may search the http://www.gnu.org website for the version 2 license,
+ * or you may write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+#ifndef FSTREAM_LINE_READER_H
+#define FSTREAM_LINE_READER_H
+
+#include <richio.h>
+
+#include <fstream>
+
+class FSTREAM_LINE_READER : public LINE_READER
+{
+public:
+
+ FSTREAM_LINE_READER( const wxString& aFileName,
+ unsigned aStartingLineNumber = 0,
+ unsigned aMaxLineLength = LINE_READER_LINE_DEFAULT_MAX ) throw( IO_ERROR );
+
+ ~FSTREAM_LINE_READER();
+
+ char* ReadLine() throw( IO_ERROR ) override;
+
+ /**
+ * Function Rewind
+ * rewinds the file and resets the line number back to zero. Line number
+ * will go to 1 on first ReadLine().
+ */
+ void Rewind();
+
+private:
+ std::string m_buffer;
+ std::ifstream m_stream;
+};
+
+#endif // FSTREAM_LINE_READER_H
diff --git a/qa/io_benchmark/io_benchmark.cpp b/qa/io_benchmark/io_benchmark.cpp
new file mode 100644
index 000000000..a0035ee92
--- /dev/null
+++ b/qa/io_benchmark/io_benchmark.cpp
@@ -0,0 +1,199 @@
+
+#include <wx/wx.h>
+#include <richio.h>
+
+#include <chrono>
+#include <ios>
+#include <functional>
+
+#include <fstream>
+
+#include "fstream_io.h"
+
+enum RET_CODES
+{
+ BAD_ARGS = 1,
+};
+
+using CLOCK = std::chrono::steady_clock;
+using TIME_PT = std::chrono::time_point<CLOCK>;
+
+static int getMs( const TIME_PT& start, const TIME_PT& end )
+{
+ using std::chrono::milliseconds;
+ using std::chrono::duration_cast;
+
+ milliseconds diff = duration_cast<milliseconds>( end - start );
+
+ return diff.count();
+}
+
+
+struct BENCH_REPORT
+{
+ int linesRead;
+ int charAcc;
+};
+
+using BENCH_FUNC = std::function<void(const wxString&, int, BENCH_REPORT&)>;
+
+struct BENCHMARK
+{
+ BENCH_FUNC func;
+ wxString name;
+};
+
+
+static void bench_fstream( const wxString& aFile, int aReps, BENCH_REPORT& report )
+{
+ std::string line;
+
+ for( int i = 0; i < aReps; ++i)
+ {
+ std::ifstream fstr( aFile );
+
+ while( getline( fstr, line ) )
+ {
+ report.linesRead++;
+ report.charAcc += line[0];
+ }
+
+ fstr.close();
+ }
+}
+
+static void bench_fstream_reuse( const wxString& aFile, int aReps, BENCH_REPORT& report )
+{
+ std::string line;
+ std::ifstream fstr( aFile );
+
+ for( int i = 0; i < aReps; ++i)
+ {
+ while( getline( fstr, line ) )
+ {
+ report.linesRead++;
+ report.charAcc += line[0];
+ }
+ fstr.clear() ;
+ fstr.seekg(0, std::ios::beg) ;
+ }
+
+ fstr.close();
+}
+
+static void bench_richio( const wxString& aFile, int aReps, BENCH_REPORT& report )
+{
+ for( int i = 0; i < aReps; ++i)
+ {
+ FILE_LINE_READER flr( aFile );
+ while( flr.ReadLine() )
+ {
+ report.linesRead++;
+ report.charAcc += flr.Line()[0];
+ }
+ }
+}
+
+static void bench_richio_reuse( const wxString& aFile, int aReps, BENCH_REPORT& report )
+{
+ FILE_LINE_READER flr( aFile );
+ for( int i = 0; i < aReps; ++i)
+ {
+ while( flr.ReadLine() )
+ {
+ report.linesRead++;
+ report.charAcc += flr.Line()[0];
+ }
+
+ flr.Rewind();
+ }
+}
+
+static void bench_fstream_richio( const wxString& aFile, int aReps, BENCH_REPORT& report )
+{
+ for( int i = 0; i < aReps; ++i)
+ {
+ FSTREAM_LINE_READER fstr( aFile );
+ while( fstr.ReadLine() )
+ {
+ report.linesRead++;
+ report.charAcc += fstr.Line()[0];
+ }
+ }
+}
+
+static void bench_fstream_richio_reuse( const wxString& aFile, int aReps, BENCH_REPORT& report )
+{
+ FSTREAM_LINE_READER fstr( aFile );
+ for( int i = 0; i < aReps; ++i)
+ {
+
+ while( fstr.ReadLine() )
+ {
+ report.linesRead++;
+ report.charAcc += fstr.Line()[0];
+ }
+
+ fstr.Rewind();
+ }
+}
+
+int main( int argc, char* argv[] )
+{
+ if (argc < 3)
+ {
+ wxLogDebug("Usage: %s <FILE> <REPS> [r/f/R/F]", argv[0] );
+ return BAD_ARGS;
+ }
+
+ wxString inFile( argv[1] );
+
+ long reps = 0;
+ wxString( argv[2] ).ToLong( &reps );
+
+ // get the benchmark to do, or 0 for all of them
+ char bench = 0;
+ if ( argc == 4 )
+ bench = argv[3][0];
+
+ wxLogDebug("IO Bench Mark Util");
+
+ wxLogDebug( " Benchmark file: %s", inFile );
+ wxLogDebug( " Repetitions: %d", (int) reps );
+
+ std::vector<BENCHMARK> benches;
+
+
+ if( !bench || bench == 'f' )
+ benches.push_back( { bench_fstream, "std::fstream" } );
+
+ if( !bench || bench == 'F' )
+ benches.push_back( { bench_fstream_reuse, "std::fstream, reused" } );
+
+ if ( !bench || bench == 'r' )
+ benches.push_back( { bench_richio, "RICHIO" } );
+
+ if ( !bench || bench == 'R' )
+ benches.push_back( { bench_richio_reuse, "RICHIO, reused" } );
+
+ if ( !bench || bench == 'n' )
+ benches.push_back( { bench_fstream_richio, "New fstream IO" } );
+
+ if ( !bench || bench == 'N' )
+ benches.push_back( { bench_fstream_richio_reuse, "New fstream IO, reused" } );
+
+
+ for( auto& bmark : benches )
+ {
+ BENCH_REPORT report = {};
+
+ TIME_PT start = CLOCK::now();
+ bmark.func( inFile, reps, report );
+ TIME_PT end = CLOCK::now();
+
+ wxLogDebug( "%25s %d lines, acc: %d in %d ms",
+ bmark.name, report.linesRead, report.charAcc, getMs( start, end ) );
+ }
+
+ return 0;
+}
--
2.11.0
Follow ups