← Back to team overview

kicad-developers team mailing list archive

Re: s-exp file read/write refactoring

 

In case it helps you, I wrote a parser and serialiser for the .kicad_mod
files in Haskell
<https://github.com/kasbah/haskell-kicad-data/blob/master/Data/Kicad/SExpr/Parse.hs>.
I test this against all the available GitHub libraries and it is also
property tested (with randomly generated data) for internal consistency
(i.e. writing and then parsing results in the same data as the one
written). Obviously you can’t add Haskell code to KiCad itself but it might
help in getting an overview or just some inspiration.
​

On 1 June 2015 at 20:27, Chris Pavlina <pavlina.chris@xxxxxxxxx> wrote:

> It's easy to understand, sure. My issue with it is things like this,
> from kicad_plugin.cpp:
>
> // line 514
> m_out->Print( aNestLevel+1, "(no_connects %d)\n",
> aBoard->GetUnconnectedNetCount() );
>
> We're writing out the file by hand, including all of the formatting such
> as parentheses, having to remember to manually escape things that need
> to be escaped, etc. Every single one of those is a place where one of
> these things can be forgotten. Some errors of that type, like forgetting
> to escape, can both go undetected for quite some time and possibly pose
> security risks.
>
> What I'd like is to have separate, data-agnostic and well tested
> s-expression code that reads in the file all at once into an internal
> data structure. Parsing everything correctly is handled at this point,
> and then all of the data can be trusted. The same structure can be
> written out by this s-expression code, which uses the same small routine
> for everything. This way, we ensure that the data is always written out
> correctly.
>
> It's a fairly simple change, really. Reading the data would even stay
> essentially the same: instead of asking the parser for a token, you're
> just traversing a data tree. The key is more in writing, where instead
> of printing out a formatted string to the file manually, you would add a
> string object to the structure - that string object, being the same as
> all the others, is guaranteed to know how to handle its data.
>
> A possible simpler route than an intermediate, in-memory structure would
> be to implement functions that perform all of these things, and call
> /them/ instead. For example, the above line 514 might become something
> like this:
>
> WriteSExpPair("no_connects", aBoard->GetUnconnectedNetCount();
>
> Where WriteSExpPair will write the parens correctly, validate
> "no_connects" to make sure there aren't accidentally any forbidden
> symbols typed in it, and would be overloaded to accept numbers, strings,
> etc. intelligently.
>
> (Even better would be if "no_connects" were a constant referred to by
> name instead of a literal, so the compiler could verify it...)
>
> On Mon, Jun 01, 2015 at 01:14:06PM -0400, Wayne Stambaugh wrote:
> > Chris,
> >
> > I'm not sure what your proposing.  This code is fairly well designed and
> > very easy to understand and modify which was one of the original design
> > goals.  The s-expr tokens are well defined and immutable for each file
> > format and are automatically generated from a text file containing a
> > list of the tokens.  The only thing that varies is parsing of none token
> > values such as numbers.  I need some examples of what changes you are
> > proposing before I would be willing to give you the go ahead.
> >
> > Cheers,
> >
> > Wayne
> >
> > On 6/1/2015 12:29 AM, Chris Pavlina wrote:
> > > Hi,
> > >
> > > I've been having a look through some of the file I/O code in pcbnew,
> and
> > > I'm not really comfortable with it. Despite the nice new structured
> > > s-exp format, we're still printing data out by hand and parsing it back
> > > in similarly - rather a recipe for trouble, and at least for
> > > inconsistent files. I'd like to put in some work after the release to
> > > refactor this into a self-contained, data-agnostic S-expression
> > > parser/writer library. This should make it easier to consistently
> > > continue using the format later in the planned eeschema work. Also, it
> > > should significant reduce the risk of errors being introduced in read
> > > and write, since everything involving "dirty" external data will be
> > > contained in one place.
> > >
> > > Anyone object to this, or have thoughts to contribute? Obviously it
> > > shall continue to read and write the exact same format, no format
> changes.
> > >
> > > --
> > > Chris
> > >
> > >
> > > _______________________________________________
> > > Mailing list: https://launchpad.net/~kicad-developers
> > > Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
> > > Unsubscribe : https://launchpad.net/~kicad-developers
> > > More help   : https://help.launchpad.net/ListHelp
> >
> > _______________________________________________
> > Mailing list: https://launchpad.net/~kicad-developers
> > Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~kicad-developers
> > More help   : https://help.launchpad.net/ListHelp
>
> _______________________________________________
> Mailing list: https://launchpad.net/~kicad-developers
> Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~kicad-developers
> More help   : https://help.launchpad.net/ListHelp
>

Follow ups

References