kicad-developers team mailing list archive
Re: Re: New library file format
Dick Hollenbeck <dick@...>
Thu, 12 Nov 2009 02:31:18 -0600
Thunderbird (X11/20090817)
Wayne Stambaugh wrote:
--- In kicad-devel@xxxxxxxxxxxxxxx, Wayne Stambaugh <stambaughw@...> wrote:
Thank you Wayne, the documents are located in the kicad-doc directory. I also prefer text. Some folks here think that it would be good if the file were more readable by humans. I would like to propose a new file format based on an existing tool such as JSON (wxJSON, http://wxcode.sourceforge.net/docs/wxjson/). I think this would make the file more readable, flexible, and extensible.
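To make the proposal concrete, here is a purely hypothetical sketch of what a JSON-based component entry might look like. Every field name here is invented for illustration; this is not a proposed schema:

```json
{
  "component": {
    "name": "74LS00",
    "reference": "U",
    "pins": [
      { "number": 1, "name": "A1", "type": "input" },
      { "number": 3, "name": "Y1", "type": "output" }
    ]
  }
}
```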
I agree that the current library and schematic file formats are not very
human readable. I believe they are legacy file formats that have been
with the project for some time. I am not familiar with JSON so I cannot
comment on how effective it would be. One thing I think it should
support is wxInputStream and wxOutputStream ( or C++ library istream and
ostream ). This way you only have to write one parser and one output
format to support streams from any of the derived input and output
stream classes. If you us wxInputStream to write you parser, you get
socket, file, zip, and tar input streams for very little extra effort.
With all of the new library discussions, I could see where any one of
the derived input and/or output streams would be useful.
I agree STRONGLY on the need for wxInputStream/wxOutputStream support; for me that should be the first step, before thinking about changing the library format.
I actually started to replace the file based code with wxInputStream for
the component library before I realized that I had to do a lot of
cleanup to the component library objects before attempting to implement
wxInputStream and wxOutputStream. I think I have the component library
object code to the point where implementing this will be a reasonable
task. I still need to take a look at Dick's dsn parser and formatter
before I attack this problem again. I think the best way forward is to
leave the existing file based code alone for legacy file support and
create new load and save methods for the dsn file support. This should
make the transition from file formats a lot easier.
The lexer is returning an integer token and this makes checking the
grammar pretty easy and efficient using a recursive descent design. The
lookup now from string token to integer token is done via a binary
search, but there is a hashtable class in the boost stuff that could
make this a little faster if we need it to be faster than fast. There
would still need to be a table to go from int to string, and that can
remain an array since the tokens are all sequential. So within the
lexer, there is a provision now to lookup a token from a string, and the
reverse, looking up a string from a token. The token to string
translation is for regeneration of the token's text from within error
reporting contexts when you want a generic way to get the text of any
token, not just the current token. This is used extensively in the
ELEM::Format() functions, so derived classes can use their base class's
Format() functions when serializing.
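The two lookup directions can be sketched as follows. All names here are invented stand-ins, not the real lexer's identifiers, and the sketch assumes (as the text says) that the tokens are sequential integers and the name table is sorted so binary search works:

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// Tokens are sequential integers, so token -> string is plain
// array indexing.
enum DSN_T { T_circuit, T_layer, T_net, T_pin, T_COUNT };

// Table must be sorted alphabetically for the binary search.
static const char* const token_names[T_COUNT] = {
    "circuit", "layer", "net", "pin"
};

// string -> token: binary search, O(log n).
int FindToken(const std::string& name)
{
    const char* const* begin = token_names;
    const char* const* end   = token_names + T_COUNT;

    const char* const* it = std::lower_bound(begin, end, name,
        [](const char* a, const std::string& b)
        { return std::strcmp(a, b.c_str()) < 0; });

    if (it != end && name == *it)
        return int(it - begin);

    return -1;   // not a keyword
}

// token -> string: used when regenerating a token's text inside
// error reporting or Format() contexts.
const char* TokenName(int tok) { return token_names[tok]; }
```

A hash table (as mentioned, boost has one) would make FindToken() O(1), but the sorted-array binary search is already fast and keeps the reverse table free.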
The LineReader abstraction is useful for accurate reporting of parsing
errors. I find it essential to know the exact line number and character
offset in my error messages. That way you can pull up the input text
file in an editor and find the exact location of the problem.
There is a special CMake target you can build to run the parser and the
outputter. When these are tied together they constitute a DSN
beautifier, as well as a tester of the full grammar.
The CMake EXCLUDE_FROM_ALL flag means you have to ask for this target to
be built explicitly.
$ make specctra_test
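The EXCLUDE_FROM_ALL mechanism looks roughly like this in a CMakeLists.txt (a sketch, not the actual KiCad build file; the source file list is invented):

```cmake
# EXCLUDE_FROM_ALL keeps the target out of the default "make all";
# it is only built when requested by name, e.g.:  make specctra_test
add_executable( specctra_test EXCLUDE_FROM_ALL
    specctra.cpp
    specctra_test.cpp )
```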
But before you do that, I think you have to edit specctra.cpp and
temporarily uncomment the line that defines STANDALONE.
STANDALONE can be used to conditionally exclude statements which might
cause linker errors, at least in that file. It's been a while since that
target has been built, somebody may have broken it.
Once you have the program built, check out the main function at the
bottom of specctra.cpp. With a recompile it can be told to parse either
a *.SES file or a *.DSN file.
Then you will have to wrap your mind around the OutputFormatter classes
that I wrote, (one to file, one to memory) and the
OUTPUTFORMATTER::Print(), ELEM::Format(), and ELEM::FormatContents()
functions. These are dream tools.
You can format (i.e. serialize) an entire object to memory, and then do
a string compare on the result. This is a trick I use in the exporter
to detect duplicate objects, even when they are separate instances that hold
the same contents. This is necessary because comparing several pad
stacks at the binary level would drive you nuts.
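The serialize-then-compare trick can be sketched like this, with simplified stand-ins for ELEM::Format() and the to-memory output formatter (none of these names are the real KiCad signatures):

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-in for an ELEM that knows how to serialize itself.
struct Padstack
{
    std::string shape;
    double      size;

    // Serialize to any ostream; a string stream gives the
    // "format to memory" variant.
    void Format(std::ostream& out) const
    {
        out << "(padstack " << shape << " " << size << ")";
    }
};

// Two separate instances are duplicates if their serialized text
// matches -- far easier than a field-by-field binary comparison.
bool SameContents(const Padstack& a, const Padstack& b)
{
    std::ostringstream sa, sb;
    a.Format(sa);
    b.Format(sb);
    return sa.str() == sb.str();
}
```

The same Format() routine then serves double duty: it writes the output file, and its in-memory output is the canonical form used for duplicate detection.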
Have fun. I think we'd have to promote some of the classes up to a
common place in the source tree. And augment the Lexer with a token
table assignment function. And you need a new class to hold your
doXXXX() type functions. This can simply be a PARTREADER or something
like that. But it is pretty easy to adapt.
I just don't want to see my copyright messages get lost in this process.
BTW, the grammar for each object is often put into each doXXX() type
function and that came from the specctra spec. But you would be
inventing your own grammar.