kicad-developers team mailing list archive
Message #03822
Re: Re: We should decide a quoting convention...
Re: DSNLEXER
OK, I spent the last several hours improving DSNLEXER.
My last posting stands. Keywords must be ASCII, not UTF8. These are
tokens which are known ahead of time, so there's no reason to stray from
ASCII, and it is silly trying to sort UTF8 strings. Keywords are sorted
ahead of time. UTF8 must be quoted and is where you would put app
specific identifiers.
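
To illustrate why pre-sorted ASCII keywords are convenient, here is a
minimal sketch of a keyword lookup by binary search. It is written in
Python for brevity and is not the DSNLEXER code; the keyword table and
the name keyword_index are hypothetical, chosen only for this example.

```python
import bisect

# Hypothetical keyword table, sorted once, ahead of time.
KEYWORDS = sorted(["at", "footprint", "layer", "pad", "text"])


def keyword_index(token: str) -> int:
    """Return the index of token in KEYWORDS, or -1 if it is not a keyword."""
    if not token.isascii():
        # Non-ASCII text can never be a keyword in this scheme.
        return -1
    i = bisect.bisect_left(KEYWORDS, token)
    if i < len(KEYWORDS) and KEYWORDS[i] == token:
        return i
    return -1
```

Anything that is not found in the table is simply an ordinary symbol or
string as far as this sketch is concerned.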
As far as my A) is concerned, DSNLEXER now handles quoted strings
properly, you simply have to double up the quote character if it is
embedded in a quoted string. The quote_char is dynamically assignable
per the Specctra DSN spec to one of the 3 chars: ' " or $, but this is
extraneous information.
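
The doubled-quote rule can be sketched as follows. This is an
illustration in Python under the assumptions stated in the comments,
not the actual DSNLEXER implementation; read_quoted and its parameters
are names made up for the example.

```python
def read_quoted(text: str, start: int, quote_char: str = '"'):
    """Scan a quoted string that begins at text[start].

    Returns (value, next_index), where value has the doubled-up quote
    characters collapsed to single ones. Assumption: any character other
    than quote_char, including a newline, is taken literally.
    """
    if text[start] != quote_char:
        raise ValueError("expected opening quote")
    out = []
    i = start + 1
    while i < len(text):
        c = text[i]
        if c == quote_char:
            # A doubled quote is an embedded quote character.
            if i + 1 < len(text) and text[i + 1] == quote_char:
                out.append(quote_char)
                i += 2
                continue
            # A single quote ends the string.
            return "".join(out), i + 1
        out.append(c)
        i += 1
    raise ValueError("unterminated quoted string")
```

Passing a different quote_char (for example "$") models the dynamically
assignable quote character.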
DSNLEXER also handles multi-line strings, and these must be quoted.
(This is a multi-line string from the perspective of the lexer, which
returns it as a single token.)
You could still run a single line string through the lexer that contains
some other newline delimiter such as \n, but this is of no concern to
the lexer.
So the grammar designer can handle newlines any way he/she wants to,
either as:
1) true newlines embedded in a quoted string or
2) as \n or \x0A, or
3) as %0A, or
4) other, say embedding a number of newline-less quoted strings in a ()
element.
The DSNLEXER does not care, and therefore neither do I at this point.
We are now covered, and I am quite hopeful that I did not break the
compatibility of DSNLEXER with the Specctra DSN syntax. We now support
a super-set of the original syntax, in that DSNLEXER strips out the
doubled-up quotes for the parser, handles multi-lines in quoted strings,
and can optionally return comments to the parser. A comment string is a
line of text in the stream whose first non-blank character is a #. The
entire line of text is returned as the token, so that you can preserve
leading whitespace should you want to preserve these comments in some
later output that is respectful of indentation.
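
The comment rule can be sketched like this. It is a simplified Python
illustration, not DSNLEXER itself: it works line by line and assumes no
quoted string spans multiple lines, which a real lexer would have to
track. The function name and the want_comments flag are hypothetical.

```python
def split_comment_lines(stream: str, want_comments: bool = True):
    """Yield ("comment", line) or ("text", line) for each input line.

    A comment is a line whose first non-blank character is '#'. The
    whole line is kept, so leading whitespace survives for later output.
    """
    for line in stream.splitlines():
        if line.lstrip(" \t").startswith("#"):
            if want_comments:
                yield ("comment", line)
            # Otherwise the comment is silently skipped.
        else:
            yield ("text", line)
```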
This is now a fairly solid platform to start designing grammars on top
of. I think the syntax is good enough for anything that I can foresee.
(Floats are OK, as long as they do not contain an exponent. If you want
exponents, then a little more work is required.)
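
As a rough illustration of the float restriction, the check below
accepts plain decimal numbers and rejects anything with an exponent.
The exact number syntax DSNLEXER accepts is not spelled out here, so the
pattern is an assumption, and is_plain_number is a hypothetical name.

```python
import re

# Optional sign, then digits with an optional fraction, or a bare fraction.
# No exponent part is allowed (assumed rule, for illustration only).
_PLAIN_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def is_plain_number(token: str) -> bool:
    """Return True if token is a decimal number without an exponent."""
    return _PLAIN_NUMBER.fullmatch(token) is not None
```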
I will be adding some more support than what already exists for
generating quoted strings, so that the doubled-up quoting is done for
you in OUTPUTFORMATTER.
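
The output side of the doubled-up quoting might look like the sketch
below. It is Python for illustration only and makes no claim about the
OUTPUTFORMATTER interface; quote_string is a hypothetical helper.

```python
def quote_string(value: str, quote_char: str = '"') -> str:
    """Wrap value in quote_char, doubling any embedded quote_char."""
    return quote_char + value.replace(quote_char, quote_char * 2) + quote_char
```

A lexer that collapses doubled quotes on input would turn the result
back into the original value, so the two operations round-trip.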
-------------
So I would like to shift the discussion to grammar, and away from this
syntax churn / noise. When I say grammar, I am speaking of a sequence
of tokens & keywords which are returned from the DSNLEXER. Think higher
level now.
If anybody has some improvements on my footprint file grammar first
draft, then these need to be offered up by the time I get around to
coding it this spring, April or so. The footprint file grammar is
flexible; the syntax is probably firmed up, IMO.
If you want to contribute, go back and find my (footprint ... ) example.
Dick
PS
I'll commit the DSNLEXER changes in the morning.