
kicad-developers team mailing list archive

Re: Re: New library file format

 

Wayne Stambaugh wrote:
emmedics4 wrote:

--- In kicad-devel@xxxxxxxxxxxxxxx, Wayne Stambaugh <stambaughw@...> wrote:


Thank you Wayne, the documents are located in the kicad-doc directory. I also prefer text. Some folks here think that it would be good if the file were more readable by humans. I would like to propose a new file format based on an existing tool such as JSON (wxJSON, http://wxcode.sourceforge.net/docs/wxjson/). I think this would make the file more readable, flexible, and extensible.

I agree that the current library and schematic file formats are not very
human readable. I believe they are legacy file formats that have been
with the project for some time. I am not familiar with JSON so I cannot
comment on how effective it would be. One thing I think it should
support is wxInputStream and wxOutputStream ( or C++ library istream and
ostream ). This way you only have to write one parser and one output
format to support streams from any of the derived input and output
stream classes. If you use wxInputStream to write your parser, you get
socket, file, zip, and tar input streams for very little extra effort.
With all of the new library discussions, I could see where any one of
the derived input and/or output streams would be useful.
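A minimal sketch of the idea, using the standard C++ istream interface the message also mentions (the parser and its names are illustrative, not KiCad code): a parser written against the abstract stream type works unchanged whether the bytes come from a file, a string buffer, or any other stream source.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parser: reads whitespace-separated fields from any istream.
// Because it depends only on the abstract stream interface, the same code
// handles file, string, socket, or archive streams without modification.
std::vector<std::string> parse_fields(std::istream& in)
{
    std::vector<std::string> fields;
    std::string field;
    while (in >> field)
        fields.push_back(field);
    return fields;
}
```

The same principle applies with wxInputStream: write the parser once against the base class, and every derived stream class comes for free.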

I agree STRONGLY on the need for wxInputStream/wxOutputStream support; for me that should be the first step, before thinking about changing the library format.


I actually started to replace the file based code with wxInputStream for
the component library before I realized that I had to do a lot of
cleanup to the component library objects before attempting to implement
wxInputStream and wxOutputStream. I think I have the component library
object code to the point where implementing this will be a reasonable
task. I still need to take a look at Dick's dsn parser and formatter
before I attack this problem again. I think the best way forward is to
leave the existing file based code alone for legacy file support and
create new load and save methods for the dsn file support. This should
make the transition between file formats a lot easier.

Wayne


The lexer returns an integer token, which makes checking the grammar pretty easy and efficient using a recursive descent design. The lookup from string token to integer token is currently done via a binary search, but there is a hashtable class in the boost stuff that could make this a little faster, if we need it to be faster than fast. There would still need to be a table to go from int to string, and that can remain an array since the tokens are all sequential. So within the lexer there is now a provision to look up a token from a string, and the reverse, a string from a token. The token-to-string translation is for regenerating a token's text in error reporting contexts, when you want a generic way to get the text of any token, not just the current token. It is used extensively in the ELEM::Format() functions (which do the serializing), so derived classes can use their base class's Format() functions.
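The two-way lookup can be sketched like this (the token set and names below are invented for illustration; the real DSN tokens differ). Keeping the name table sorted allows a binary search for string-to-token lookup, while plain array indexing handles the reverse direction because the enum values are sequential:

```cpp
#include <algorithm>
#include <cstring>

// Hypothetical token set; real KiCad DSN tokens differ.
enum Token { T_circuit, T_net, T_pcb, T_wire, T_COUNT, T_NONE = -1 };

// Kept sorted so string->token lookup can binary search.
// Indexing this array by token value gives the token->string lookup,
// since the enum values are sequential starting at zero.
static const char* const token_names[T_COUNT] = {
    "circuit", "net", "pcb", "wire"
};

// String -> integer token, via binary search over the sorted table.
Token lookup_token(const char* text)
{
    const char* const* end = token_names + T_COUNT;
    const char* const* it = std::lower_bound(token_names, end, text,
        [](const char* a, const char* b) { return std::strcmp(a, b) < 0; });
    if (it != end && std::strcmp(*it, text) == 0)
        return Token(it - token_names);
    return T_NONE;
}

// Integer token -> string, for error reporting and Format() output.
const char* token_name(Token t) { return token_names[t]; }
```

Swapping the binary search for a boost hashtable would change only lookup_token(); the token->string array stays as it is.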

The LineReader abstraction is useful for accurate reporting of parsing errors. I find it essential to know the exact line number and character offset in my error messages. That way you can pull up the input text file in an editor and find the exact location of the problem.
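A toy version of such a reader, assuming nothing about the real LineReader beyond what the paragraph above describes (track the current line, and report line number plus character offset in error messages):

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical line reader that tracks position for error reporting.
class LineReader
{
    std::istream& in_;
    int lineNum_ = 0;

public:
    std::string line;   // most recently read line

    explicit LineReader(std::istream& in) : in_(in) {}

    // Read the next line; returns false at end of input.
    bool ReadLine()
    {
        return std::getline(in_, line) ? (++lineNum_, true) : false;
    }

    // Build an error message carrying the exact line number and character
    // offset, so the problem can be found immediately in a text editor.
    std::string Error(const std::string& msg, size_t offset) const
    {
        std::ostringstream os;
        os << "line " << lineNum_ << ", offset " << offset << ": " << msg;
        return os.str();
    }
};
```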

There is a special CMake target you can build to run the parser and the outputter. When these are tied together they constitute a DSN beautifier, as well as a tester of the full grammar.

The CMake EXCLUDE_FROM_ALL flag means you have to ask for this target to be built explicitly.
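For illustration, an EXCLUDE_FROM_ALL target is declared like this (the source list here is a guess, not the actual CMakeLists.txt entry); a plain `make` skips it, and it is built only when named explicitly:

```cmake
# Illustrative only: EXCLUDE_FROM_ALL keeps this target out of the
# default build, so it must be requested by name.
add_executable( specctra_test EXCLUDE_FROM_ALL specctra.cpp )
```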

$ make specctra_test

But before you do that, I think you have to edit specctra.cpp and uncomment this line temporarily:

//#define STANDALONE


STANDALONE can be used to conditionally exclude statements which might otherwise cause linker errors, at least in that file. It's been a while since that target has been built, so somebody may have broken it.

Once you have the program built, check out the main function at the bottom of specctra.cpp. With a recompile it can be told to parse either a *.SES file or a *.DSN file.

Then you will have to wrap your mind around the OutputFormatter classes that I wrote (one writes to a file, one to memory) and the OUTPUTFORMATTER::Print(), ELEM::Format(), and ELEM::FormatContents() functions. These are dream tools.

You can format (i.e. serialize) an entire object to memory, and then do a string compare on the result. This is a trick I use in the exporter to detect duplicate objects, even when they are separate instances that hold the same contents. This is necessary because comparing several pad stacks at the binary level would drive you nuts.
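The trick in miniature, using an invented element type rather than the real ELEM/OUTPUTFORMATTER classes: serialize each instance to a memory buffer, then compare the resulting strings. Two separate instances with identical contents produce identical text, which sidesteps any field-by-field or binary comparison.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical element with a Format() that serializes to any ostream,
// loosely mirroring the ELEM::Format() idea described above.
struct Padstack
{
    std::string name;
    int drill;

    void Format(std::ostream& out) const
    {
        out << "(padstack " << name << " (drill " << drill << "))";
    }
};

// Serialize to memory so instances can be compared as plain strings.
std::string ToString(const Padstack& p)
{
    std::ostringstream os;
    p.Format(os);
    return os.str();
}
```

Duplicate detection then reduces to `ToString(a) == ToString(b)`, regardless of which instances the objects are.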

Have fun. I think we'd have to promote some of the classes up to a common place in the source tree, augment the Lexer with a token table assignment function, and add a new class to hold your doXXXX() type functions. This could simply be a PARTREADER or something like that. But it is pretty easy to adapt. I just don't want to see my copyright messages get lost in this process, thanks.

BTW, the grammar for each object is often put into each doXXX() type function and that came from the specctra spec. But you would be inventing your own grammar.

Dick







