calibre-devs team mailing list archive

Re: Branch lp:~llasram/calibre/oeb2lit

 

I get a "not a branch" error when trying to check out the code, and
Launchpad says "This branch has not been pushed to yet".

I have no problems with exposing function pointers to ctypes in principle, but 
will that technique be portable across compilers?
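
For reference, the technique in question can be sketched roughly as follows. This is not the code from the branch (which could not be checked out); it is a minimal illustration that assumes a cdecl function taking two ints. The names PROTO, export_address and bind are hypothetical, and a ctypes callback stands in for the compiled LZX function. A real C extension would more likely hand the address out with PyLong_FromVoidPtr.

```python
import ctypes

# Signature both sides must agree on: int f(int, int), cdecl calling convention.
PROTO = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def _add(a, b):
    return a + b


# Stand-in for a compiled function. The reference must stay alive for as
# long as the raw address is in use.
_native_add = PROTO(_add)


def export_address():
    """Return the function's address as a plain Python integer."""
    return ctypes.cast(_native_add, ctypes.c_void_p).value


def bind(address):
    """Rebuild a callable from a raw integer address."""
    return PROTO(address)


add = bind(export_address())
result = add(2, 3)
print(result)
```

The binding is only valid if the prototype matches how the function was actually compiled. CFUNCTYPE assumes the cdecl convention, so a function built as stdcall on Windows would need WINFUNCTYPE instead.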

Since I can't see the code, I can't comment on the URL normalization.
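
As a point of reference only, one common way to normalize URI encoding is to decode any existing percent-escapes and then re-encode. The sketch below assumes that approach; normalize_uri is a hypothetical helper and may differ from what the branch does.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def normalize_uri(uri):
    """Decode existing percent-escapes, then re-encode path and fragment.

    Simplification: an escaped slash (%2F) in the path is decoded and
    left as a literal slash, which may not be wanted in every case.
    """
    parts = urlsplit(uri)
    path = quote(unquote(parts.path), safe="/")
    fragment = quote(unquote(parts.fragment), safe="")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, fragment)
    )


raw = normalize_uri("my file.html#sec 1")
again = normalize_uri(raw)
print(raw, again)
```

Decoding before encoding keeps the function idempotent, so markup that is already correctly escaped passes through unchanged.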

Why are you using strip_space? To prettify the HTML?
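
To make the whitespace problem described in the quoted message concrete, here is a small sketch. It uses the standard library's xml.etree.ElementTree rather than lxml, and strip_blank_text is a hypothetical helper that naively drops every whitespace-only text node; it is not the parser option used in the branch.

```python
import xml.etree.ElementTree as ET


def strip_blank_text(elem):
    """Remove every whitespace-only text or tail node under elem."""
    for node in elem.iter():
        if node.text is not None and not node.text.strip():
            node.text = None
        if node.tail is not None and not node.tail.strip():
            node.tail = None
    return elem


markup = "<p>\n  <b>Hello</b> <i>world</i>\n</p>"
root = ET.fromstring(markup)

before = "".join(root.itertext())
strip_blank_text(root)
after = "".join(root.itertext())

print(repr(before))
print(repr(after))
```

The indentation around the block goes away as intended, but so does the single space between the two inline elements, so the words run together. That space is the kind of relevant whitespace that gets lost.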

Kovid. 

On Tuesday 09 December 2008 09:04:27 Marshall T. Vandegrift wrote:
> Kovid etc.,
>
> I've pushed the current state of my oeb2lit code to a new launchpad
> branch at lp:~llasram/calibre/oeb2lit.  I don't think it's /quite/ ready
> to merge with the trunk, but the basic functionality is implemented and
> integrated.  Issues for discussion:
>
>   - The anchor-hashing algorithm is not yet known.  Without it
>     links into individual HTML streams with more than 6 anchors do not
>     work.
>
>   - I integrated the LZX compression code by the somewhat unorthodox
>     method of exposing the function addresses as Python `long's then
>     binding them in Python with the ctypes FFI interface.  This seems
>     reasonable to me, and greatly simplifies providing an OO interface
>     to the decompression capabilities, but one way or another the
>     compression and decompression code should be brought into parity.
>
>   - I modified the LitReader to normalize URI encoding in extracted
>     markup.  This isn't immediately relevant for LIT-generation, but I
>     did it for parity with the normalization I do on oeb2lit input.
>     This makes extracted mark-up more technically correct, but it is a
>     change in behavior.
>
>   - LIT-to-LIT round-tripping does not currently work without whitespace
>     corruption.  The issue is that in LIT files -- contrary to normal
>     HTML rules -- all whitespace is considered relevant.  To help strip
>     unnecessary whitespace I'm using an lxml parser with
>     strip_space=True.  Unfortunately, this occasionally strips relevant
>     whitespace from LIT-extracted markup -- oops!  I've got a few ideas
>     but haven't had a chance to play around with them yet.
>
> So there you go.  Please let me know if you have any comments so far.
>
> -Marshall

-- 
_____________________________________

Kovid Goyal  MC 452-48
California Institute of Technology
1200 E California Blvd
Pasadena, CA 91125

cell  : +01 626 390 8699
office: +01 626 395 6595 (449 Lauritsen)
email : kovid@xxxxxxxxxxxxxxxxxx
web   : http://www.kovidgoyal.net
_____________________________________



