kicad-developers team mailing list archive

Thread
Date

Re: [RFC] Symbol library file format

To: Wayne Stambaugh <stambaughw@xxxxxxxxx>
From: Andrew Lutsenko <anlutsenko@xxxxxxxxx>
Date: Tue, 8 Jan 2019 23:11:03 -0800
Cc: KiCad developers <kicad-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CADn3vW0KFrKZE2i55p6Bn22waw=uiFvYCZXxdYtTr5SBRLxedA@mail.gmail.com>

Hi again,

I just pushed an example of some proto definitions and a code that shows
how to write it.
Take a look at https://github.com/qu1ck/kicad-source-mirror/tree/proto/proto

I modeled symbol.proto after what is currently in the eeschema format
proposal google doc but it's not completely 1:1 because in some cases
reusing existing structure made sense instead of adding more restricted one
and in some cases I just don't understand current proposal. (What's up with
the "part" keyword, it's mentioned as top level keyword but then in the
example there are multiple? I don't quite understand the inheritance).
But these are all nuances we can work out later.

I am not sure I setup cmake config correctly but it works and dependencies
seem to be picked up properly.
To get it working all you have to do is install protobuf library. On msys
it's as simple as
pacman -S mingw-w64-protobuf
on *nix it's best to compile it from source, which I tested on debian and
worked like a charm:
https://github.com/protocolbuffers/protobuf/blob/master/src/README.md

then just reconfigure cmake to pick up new targets and proto files and
build proto_test target to play around with it.

Wayne let me know if this is at all interesting. If you are still concerned
about performance I should be able to do a benchmark now that basic
scaffolding is in place.

Regards,
Andrew

On Sat, Jan 5, 2019 at 4:46 PM Andrew Lutsenko <anlutsenko@xxxxxxxxx> wrote:

> With the amount of data that KiCad handles performance of particular
> parser has minuscule impact. File IO delays would thwart any processing
> time.
> What Simon talked about is as he said a separate discussion and is
> orthogonal to the decision of how to define data model and what file format
> to use. Also I don't agree that there is a dichotomy of "save full memory
> state" or "save only bare protocol data". There is lots of room for
> compromise on what things on top of bare data you want to save to disk to
> speed up restoring full memory state on load. Spatial indexes, if you
> choose to save them, would have to be described in terms of another
> structure in the IDL, it's not generator's responsibility to understand
> what the index represents. But anyway - that's a separate discussion.
>
> I understand that you are cautious to bring in new dependency that most
> devs are not familiar with. Is there anything I can do to break the ice?
> I was planning to translate the current schematic file format proposal
> into proto definitions to show how easy they are to use. I could also try
> to add a cmake target that would compile those proto definitions in ready
> to use .h and .cpp files. Will you reconsider if I show these examples?
>
> Do you have other concerns about requirements that are potentially not met
> aside from performance? I can try to answer any questions you have.
> If perf is the only one I could also possibly write some sort of benchmark
> comparing parsing performance of KiCad's current s-expr parser and
> protobufs. But that is not a small commitment so I don't want to do it if I
> can't change your mind.
>
> Regards,
> Andrew
>
>
> On Sat, Jan 5, 2019 at 5:17 AM Wayne Stambaugh <stambaughw@xxxxxxxxx>
> wrote:
>
>> On 1/3/19 11:24 PM, Simon Richter wrote:
>> > Hi,
>> >
>> > On 03.01.19 19:06, José Ignacio wrote:
>> >
>> >> I
>> >> think useful comments to the proposed format should see beyond the
>> >> actual low level representation of the data and talk about the overall
>> >> model being used to store it.
>> >
>> > tl;dr: That's a separate discussion.
>> >
>> > There are two schools of thought here, one that treats saved data as a
>> > protocol between two black box instances, and one that treats it as a
>> > serialization of the internal state.
>> >
>> > Both have advantages and disadvantages. The "protocol" approach allows
>> > changing internals more easily, and gives better compatibility between
>> > versions as changes to the file format have to be made deliberately,
>> > while the "serialization" approach gives us load/save basically for
>> > free, so we need a lot less code.
>> >
>> > The "serialization" model also requires us to generate the internal data
>> > structures from a more constrained language like IDL, as the marshaller
>> > needs to know when to follow pointers, and what members of an array are
>> > actually valid.
>> >
>> > I'm not sure there are generators that include support for spatial
>> > indexes, though, which is pretty much a requirement for fast rendering,
>> > so this is pretty much impossible at the moment, which places us in
>> > "hand written load/save code" territory anyway.
>> >
>> > We have also ignored diff/merge capability so far, which I believe is a
>> > good thing because it cannot really be done on a textual level (schemas
>> > are two-dimensional, PCBs are three-dimensional, so there is no normal
>> > form with a consistent ordering of elements that will make the files
>> > diffable).
>> >
>> > I fully expect both the internal model and the file format to change
>> > significantly in the coming years as new features are added. The main
>> > requirement for the file format is that it always needs to be possible
>> > to read older files in some way, and to recognize when a file is newer
>> > than the current parser understands.
>> >
>> > It might be a good idea to also have an "extension" mechanism, where we
>> > don't increase the version number when adding a new feature, but rather
>> > mark files that actually use the new feature, so files written by newer
>> > versions that don't use one of the newer functions can be read by older
>> > versions. This would also be another point for "hand written" load/save.
>> >
>> >    Simon
>> >
>>
>> I've done some more investigation on this and I'm not convinced that
>> this is any better than our current implementation due to some of the
>> issues Simon mentioned above.  I've also discovered there is also a
>> significant performance hit[1] with the JSON reader.  This is important
>> due to the overhead required when loading libraries.  I appreciate the
>> information and protobufs are interesting but I am going to stick with
>> what we have and know versus what we don't know because I don't think we
>> have the extra manpower to experiment only to find out it doesn't suite
>> our requirements.
>>
>> Cheers,
>>
>> Wayne
>>
>> [1]:
>>
>> https://groups.google.com/forum/#!searchin/protobuf/JSON|sort:date/protobuf/23Slq4AX7oE/xuwQX_ZXAAAJ
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~kicad-developers
>> Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~kicad-developers
>> More help   : https://help.launchpad.net/ListHelp
>>
>

References

[RFC] Symbol library file format
From: Wayne Stambaugh, 2019-01-01
Re: [RFC] Symbol library file format
From: Andrew Lutsenko, 2019-01-02
Re: [RFC] Symbol library file format
From: kristoffer Ödmark, 2019-01-02
Re: [RFC] Symbol library file format
From: Wayne Stambaugh, 2019-01-02
Re: [RFC] Symbol library file format
From: Andrew Lutsenko, 2019-01-03
Re: [RFC] Symbol library file format
From: Martijn Kuipers, 2019-01-03
Re: [RFC] Symbol library file format
From: Andrew Lutsenko, 2019-01-03
Re: [RFC] Symbol library file format
From: José Ignacio, 2019-01-03
Re: [RFC] Symbol library file format
From: Simon Richter, 2019-01-04
Re: [RFC] Symbol library file format
From: Wayne Stambaugh, 2019-01-05
Re: [RFC] Symbol library file format
From: Andrew Lutsenko, 2019-01-06