← Back to team overview

kicad-developers team mailing list archive

Re: [rfc] actual sexpression parsing

 

Mark,

I haven't had time to look at your commit so I'm not going to comment on
that until I do.  I don't know when I'll have time as I have a lot of
stuff to do and I'm traveling over the holidays so my review time will
be limited.  In the mean time I will comment on some of the things I've
read in this thread.

Binary file formats are not going happen while I'm project leader. I
have 30 years of less than positive experience with them and I'm not
about to start using them now.

Be careful with the C++ << operators. They may not do the correct thing
with floating point numbers. There may also be some overhead that the
current design does not have.  I will not accept a performance hit on
file parsing or any significant increase in formatted file size.

On 12/17/2015 10:32 PM, Mark Roszko wrote:
> So awhile back, Wayne said to use sexpr for something I wanted to do.
> Then I looked at the sexpr parsing and said NOPE.
> 
> 
> Why NOPE? Because the current parsing regime is basically Visual Basic
> 6 parser written in a modern language with micro-optimizations meant
> for someone running a Windows 3.1 computer with 5.25" storage drives.
> 
> 
> This patch is a propsed sexpr parser that parses sepxr like its sexpr
> and not a parenthesis format. Because if you are parsing with things
> like NeedLeft() and NeedRight(), you are parsing it wrong.
> 
> Especially when you see something like this:
> 
>             case T_descr:
>                 if( sawDesc )
>                     in->Duplicate( tok );
>                 sawDesc = true;
>                 in->NeedSYMBOLorNUMBER();
>                 row.SetDescr( in->FromUTF8() );
>                 break;
> 
> 
> but then you realize that fp_lib_table has quoted strings for MOST
> descr entries. And you question why the method is called
> NeedSYMBOLorNumber when its a string with quotes. Then you go deeper
> and realize the parser may as well be a csv reader/writer. You either
> parse like sexpressions or you stop calling it sexpressions.
> 
> 
> 
> New proposed system:
> 1. Reads and generates an in memory tree structure of data as it
> should be, i.e. lists, strings, numbers, etc.

I'm ok with the memory tree structure as long as it doesn't add any
significant parsing or conversion to internal object overhead.

> 2. Helpers to pull out each item as need be

OK

> 3. Backwards compatible

Backwards compatibility is not optional.

> 4. Doesn't do silly keywords micro optimization at compile time. You
> do a string comparison to convert the value to integer anyway, using
> if/else is no different each time. Kicad isn't parsing gigabyte sized
> files nor hundreds of files, this optimization really isn't worth the
> overhead in maintenance.

This change will probably kill your performance as the file format gets
new features which it will.  I could be wrong but there is not going to
be much faster token look up than integers.  Before you decide this
optimization isn't worth it, you need to get an 8 layer+ high density
board file from someone and do some testing.

> 5. Generate saved files from in memory tree structures, this will
> avoid all possible formatting irregularities and differences because
> someone handwrote unrolling all the data members.

I need to see how your doing this because keeping both the memory tree
and the board objects in memory doesn't make a lot of sense to me.

> 6. Avoid things like " ${KIGITHUB}/Air_Coils_SML_NEOSID.pretty" being
> defined as a symbol instead of a string in the future.
> 7. Explicit definitions of symbols and strings. Strings are always
> quoted. Period. No silly auto-detection logic.
> 
> 
> So my first goal is to have 1:1 parity with the existing stuff for
> kicad files for both reading and writing.
> 
> 
> Benchmarks:
> Old fp-lib-table read: 1ms
> New fp-lib-table read: 2ms
> 
> Here's the actual commit for fp-lib-table:
> https://github.com/marekr/kicad-sexpr/commit/9367f469be69962d14671411eddd6fd759ace1f2
> 
> Not expecting anyone to compile it or anything, more input that
> anything. Yes its messy as its an initial proof of concept.
> 
> 
> 
> Yes, string parsing and writing isn't escaping properly, TBD (easy, just lazy).
> There would probably be a smaller kicad_sexpr wrapper to implement
> common sexpr pattern helpers such as the "list key-value pair" that's
> used.
> 
> 
> 
> Also played around with .kicad-pcb but its not committed:
> Old read: ~350ms
> New read: ~230ms
> 
> 
> I await the abuse.
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~kicad-developers
> Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~kicad-developers
> More help   : https://help.launchpad.net/ListHelp
> 



Follow ups

References