calibre-devs team mailing list archive
Message #00054
Unified conversion tool
Hi Kovid, etc:
I've been playing around with some ideas for creating a fully unified
conversion/generation tool -- something that's flexible enough to do
everything the current suite does, but using a modular infrastructure
allowing re-use (and re-arrangement) of parts in different conversion
pipelines.
For experimenting with CSS flattening, I've already extended my OEBBook
and backing "container" classes to further simplify content
manipulation. Any content transform reduces then to a callable which
accepts an OEBBook object and transforms it in-place. The remaining
bits to sort out are the command-line and Python programmatic
interfaces. Well, and the suite of transforms :-). But some more solid
ideas for the former gelled in my mind while sleeping.
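A minimal sketch of the "transform is a callable" idea, under stated assumptions: the OEBBook class below is a hypothetical stand-in that holds nothing but a dict of item text, and the names strip_trailing_whitespace and run_pipeline are made up for illustration. None of this is the actual calibre code.

```python
class OEBBook:
    """Hypothetical stand-in for the real OEBBook; holds path -> text."""

    def __init__(self, items):
        self.items = dict(items)


def strip_trailing_whitespace(book):
    """Example transform: rewrite every item of the book in place."""
    for path in list(book.items):
        lines = book.items[path].split("\n")
        book.items[path] = "\n".join(line.rstrip() for line in lines)


def run_pipeline(book, transforms):
    """Apply each transform in order; each one mutates `book` in place."""
    for transform in transforms:
        transform(book)
    return book


book = OEBBook({"chapter1.html": "<p>Hello</p>   \n<p>World</p>\t"})
run_pipeline(book, [strip_trailing_whitespace])
```

Because every transform has the same shape, a pipeline is just an ordered list of callables, which is what makes re-use and re-arrangement cheap.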
For the command-line, I'm thinking something like this:
    oebtool [OPTIONS] [PIPELINE] [TRANSFORMS] INFILE OUTFILE
Where oebtool is a terrible name. The basic idea is that it converts
INFILE to OUTFILE, either automatically deriving the type of each from
their filenames or with the options -i/--input-format and
-o/--output-format. A PIPELINE is a pre-canned set of transforms. I'm
torn on whether or not the PIPELINE should consume any options not
understood by `oebtool' itself, manipulating them and passing them on to
the individual transforms as necessary. Each command-line TRANSFORM is
specified with a -t/--transform option, and can accept sub-options using
a syntax I'm partially stealing from mplayer:
    oebtool ... -t TRANSFORM[:[OPTION=]VALUE[:[OPTION=]VALUE...]] ...
And the input/output format objects could accept arguments in the same
way -- they're really just a special kind of transform, after all.
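One way the sub-option syntax could be parsed, as a rough sketch only. The function name parse_spec is hypothetical, and the sketch assumes that values never contain a ':' themselves (no escaping is handled) and that a bare VALUE without OPTION= is treated as a positional argument.

```python
def parse_spec(spec):
    """Split 'NAME[:[OPTION=]VALUE...]' into (name, positional, named)."""
    name, *parts = spec.split(":")
    positional = []
    named = {}
    for part in parts:
        option, sep, value = part.partition("=")
        if sep:
            # OPTION=VALUE form
            named[option] = value
        else:
            # bare VALUE form
            positional.append(part)
    return name, positional, named


name, positional, named = parse_spec("margins:left=10pt:right=10pt")
```

The same function would serve for format arguments such as oeb:version=1.2, since they use the same syntax.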
So a complete command-line could look something like:
    oebtool clean \
        -t fonts:serif="Adobe Caslon Pro" \
        -t margins:left=10pt:right=10pt:top=12pt:bottom=12pt \
        book.lit -o oeb:version=1.2 book/
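For what it's worth, a command line of this shape can already be handled by optparse. The sketch below is only an illustration: the helper names (build_parser, guess_format, parse_command_line) are hypothetical, the format is guessed purely from the file extension, and the optional PIPELINE is told apart from INFILE simply by counting positional arguments, which is an assumption and not a settled design.

```python
import os
from optparse import OptionParser


def build_parser():
    parser = OptionParser(
        usage="%prog [OPTIONS] [PIPELINE] [TRANSFORMS] INFILE OUTFILE")
    parser.add_option("-i", "--input-format", dest="input_format")
    parser.add_option("-o", "--output-format", dest="output_format")
    # Repeated -t options accumulate in command-line order.
    parser.add_option("-t", "--transform", dest="transforms",
                      action="append")
    return parser


def guess_format(path):
    """Derive a format name from the extension, or None if there is none."""
    ext = os.path.splitext(path)[1]
    return ext[1:].lower() or None


def parse_command_line(argv):
    options, args = build_parser().parse_args(argv)
    if len(args) == 3:
        pipeline, infile, outfile = args
    elif len(args) == 2:
        pipeline = None
        infile, outfile = args
    else:
        raise ValueError("expected [PIPELINE] INFILE OUTFILE")
    return {
        "pipeline": pipeline,
        "transforms": options.transforms or [],
        "infile": infile,
        "outfile": outfile,
        # Explicit -i/-o win; otherwise fall back to the filename.
        "input_format": options.input_format or guess_format(infile),
        "output_format": options.output_format or guess_format(outfile),
    }


parsed = parse_command_line([
    "clean", "-t", "margins:left=10pt:right=10pt",
    "book.lit", "-o", "oeb:version=1.2", "book/",
])
```

Here the input format falls out of the .lit extension, while the output directory has no extension, so the explicit -o is what supplies the format.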
The programmatic interface could be pretty simple. Each transform and
pipeline could exist as a Python module exposing a particular
interface, which probably need consist only of a callable, an option
parser, and a docstring. The option parsers can hopefully be derived
from optparse.OptionParser -- that would certainly simplify things quite
a bit.
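A sketch of what one such transform module might contain, assuming a "margins" transform. Everything named here is hypothetical: the option_parser/transform/apply_transform names, the idea of passing the parsed options as a second argument to the callable, and the FakeBook stand-in used in place of a real OEBBook.

```python
"""margins: set the page margins of the book (hypothetical transform)."""
from optparse import OptionParser

SIDES = ("left", "right", "top", "bottom")


def option_parser():
    """Build the parser describing this transform's sub-options."""
    parser = OptionParser(usage="margins[:OPTION=VALUE...]")
    for side in SIDES:
        parser.add_option("--" + side, dest=side, default="0pt",
                          help="size of the %s margin" % side)
    return parser


def transform(book, options):
    """The callable: modify `book` in place using the parsed options."""
    book.margins = {side: getattr(options, side) for side in SIDES}


def apply_transform(book, named):
    """Feed OPTION=VALUE pairs through the option parser, then transform."""
    argv = ["--%s=%s" % (key, value) for key, value in named.items()]
    options, _ = option_parser().parse_args(argv)
    transform(book, options)


class FakeBook:
    """Stand-in for OEBBook, just enough for the example."""
    margins = None


book = FakeBook()
apply_transform(book, {"left": "10pt", "top": "12pt"})
```

Re-using optparse this way means defaults, help text, and validation of unknown sub-options come for free, at the cost of translating OPTION=VALUE pairs into --option=value arguments first.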
When I next find a bit of time, I'll probably push a branch up to
Launchpad to play around with all this.
Comments, suggestions, etc?
-Marshall