kicad-developers team mailing list archive
Re: We should decide a quoting convention...
Tue, 22 Dec 2009 22:46:57 -0000
--- In kicad-devel@xxxxxxxxxxxxxxx, Manveru <manveru@...> wrote:
> > Why not just escape white spaces and parentheses (by \x20 or %20)? Also
> > UTF-8 multibyte sequences do not interfere with any control characters, so
> > no need to enclose them.
> IMVHO such method greatly complicates parsing by any outside tool. It would
> be nice to have file format self-descriptive.
(Un)escaping is done by a simple pattern-replace procedure. After that, none of the delimiter characters appear in field values.
In the case of quoting, the parser is more complicated, because the tokenizer needs to deal not only with token separators (spaces, LFs, parentheses) but also with quotes and escape codes, in order to decide whether a separator character is really a separator or part of a string (thus doing the unescaper's work).
You get one complex procedure instead of two simple ones (tokenize + unescape). I think two simple procedures are better than one complex one because the code is better factored.
Also, many programming languages have ready-made split and replace procedures, so one can easily use them to parse the file. But most of them cannot deal with quoted strings, so every script writer would have to write their own parser or use a heavyweight one.
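As an illustration of the two-step approach (tokenize, then unescape), here is a minimal Python sketch. It assumes the %XX escaping variant suggested in the quoted text; the escape table, the function names and the sample field line are hypothetical and not part of any agreed KiCAD format.

```python
import re

# Assumed escape table (hypothetical): percent-encode the delimiter characters.
ESCAPES = {" ": "%20", "(": "%28", ")": "%29", "\n": "%0A"}

def escape(value):
    # Escape "%" first so the codes produced below are not escaped again.
    out = value.replace("%", "%25")
    for ch, code in ESCAPES.items():
        out = out.replace(ch, code)
    return out

def unescape(field):
    # Single pass over %XX codes, so "%2520" decodes to "%20", not to a space.
    return re.sub(r"%([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), field)

def parse_line(line):
    # Step 1: tokenize with the stock split. Step 2: unescape each field.
    return [unescape(tok) for tok in line.split()]

line = "F 0 " + escape("My (odd) name") + " 100"
fields = parse_line(line)
```

Because the escaped value contains no separators, the stock `split()` is enough for tokenizing, and non-ASCII characters pass through both steps unchanged.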
Furthermore, I dislike the idea of parentheses for the same reason.
I am currently working on the KiCAD font, and I use AWK to parse the eeschema library that is used to store the glyphs (I edit them with the eeschema library editor). I chose this way because the plain library format is very easy to parse (most of the work, splitting into lines and fields, is done by AWK automatically). I wonder what I would do if I had to parse all those quotes and parentheses. The current DEF...ENDDEF is much better than (def...) in this case.
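To show how little code a line-and-field format needs, here is a rough Python equivalent of that kind of AWK processing. It is only a sketch: the sample records are made up, and the assumptions that the name is the second field of the DEF line and that comment lines start with "#" are stated here for illustration.

```python
def read_components(lines):
    """Group the field lists found between DEF and ENDDEF by component name."""
    components = {}
    current = None
    for raw in lines:
        fields = raw.split()          # same field splitting AWK does by default
        if not fields or fields[0].startswith("#"):
            continue                  # skip blank lines and (assumed) comments
        if fields[0] == "DEF":
            current = []
            components[fields[1]] = current   # assumed: name is field 2
        elif fields[0] == "ENDDEF":
            current = None
        elif current is not None:
            current.append(fields)
    return components

# Made-up sample records, not taken from a real library file.
sample = """\
# glyph library
DEF GLYPH_A U 0 40 Y Y 1 F N
P 3 0 1 0 0 0 50 100 100 0 N
ENDDEF
DEF GLYPH_B U 0 40 Y Y 1 F N
ENDDEF
""".splitlines()

glyphs = read_components(sample)
```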
Most tools do not do anything with text strings. In that case parsing quoted strings is wasted effort, but it is still required.
> Quotes are useful in terms of self-descriptive format.
For whom? I think any advanced user (if he has the guts to get into a schematic file) knows about escape sequences. So using quotes is no more self-descriptive than escaping. Maybe you mean "natural"? That's not a problem we should think about.
> Remember about ignoring UTF markers at the beginning of the file (added by
> some windows apps, not added by most linux apps) - otherwise any user
> editing the file in notepad will lose his work.
Oh yes, that's the single code sequence we should just ignore. =) But thanks for the reminder! Here it is: EF BB BF.
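Skipping that marker takes only a few lines. A minimal Python sketch, where the function name `strip_bom` is hypothetical:

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8   # the three bytes EF BB BF

def strip_bom(data):
    # Drop the UTF-8 byte order mark only if it is the very first thing in the file.
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data

with_bom = b"\xef\xbb\xbfDEF R R 0 0 N Y 1 F N"
without_bom = strip_bom(with_bom)
```

In Python the same effect is available when decoding text, by using the `utf-8-sig` codec, which skips the marker if it is present.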