kicad-developers team mailing list archive
-
kicad-developers team
-
Mailing list archive
-
Message #22047
Re: [rfc] actual sexpression parsing
On Fri, Dec 18, 2015 at 07:55:07AM -0500, Mark Roszko wrote:
> Writing a new state machine for every single list and every single
> file over and over again is the part I have problems with. There
> should be a single state machine that takes the tokens and gives you a
> list. Not 500 over the whole codebase.
Also, re-read eventually the part about parser generation. And think
about grammar changes...
> The definition of sanity is not splitting it into many data forms.
It's nonetheless a 'curious' engineering approach:D
> >3) Using an XML similitude, usual sexp processing in lisp follows
> something like a DOM model
>
> Yea that was the plan when I structured my end result. Walking it
> later is trivial.
I'd suggest to use a proper list/vector container instead of the cons
approach (it was meant to be a joke). Cons handling is trickier without
the lisp runtime at hand :D
In pseudo-BNF
list :- sequence-of list-element
list-element :- one-of(symbol, string, number, whatever, list)
The sequence-of could be a vector of base pointers using push_back, the
one-of is obviously modeled with inheritance (if it were C a union would
be fine...). As for the lexing strategy: the traditional lisp reader has
*no* lookahead and dispatch on the first character:
- '(' starts a list
- [0123456789.+-] starts a number
- '"' starts a string
- a letter start a symbol
- whitespace is eaten
- other characters trigger specific behaviour (like the '#' main macro
character)
*if* you want to keep string quoting optional then you can't distinguish
a string from a symbol (because depends on the semantic grammar which
the reader doesn't have access to). Then you have to match keywords as
string, not elegant but doable.
> I'm more for manual walking of the lists after the fact than trying to
> use an event based one. I don't see a benefit really and rather see it
> increase complexity with needing callback classes when manual
> unrolling should work fairly well BUT i am not exactly happy with
> manual unrolling looks so its something to play with.
Given the relatively low amount of data to process a DOM approach is
quite feasible. Keep an iterator on the current list handy and loop
away. There are plenty of matching/binding/unifying/destructuring
methods to use when you have the whole list already in core. Personally
I would use a recursive descent driven by the tree elements (*not*
directly by the input file, as it is now); it should be the easiest to
do by hand.
--
Lorenzo Marcantonio
CZ Srl - Parma
References