zim-wiki team mailing list archive
Message #02303
Re: Migration from OneNote to Zim
On Fri, Mar 15, 2013 at 8:12 AM, Jaap Karssenberg
<jaap.karssenberg@xxxxxxxxx> wrote:
> On Thu, Mar 14, 2013 at 5:39 PM, Michael Spranger
> <mikeitsecurity@xxxxxxxxx> wrote:
>> How much effort would it take to get that self contained HTML to import into
>> zim? I am not a scripter so I am of no help there.
>
> I got some code to unpack the stand alone HTML, that part is easy.
> Next step will be converting the HTML to text while preserving at
> least images and bullet lists. Some other markup can be preserved, but
> most may get lost. Tables will end up as lines of text.
>
> One limitation I see at the moment for the OneNote importer is that
> when I export a section from OneNote I get multiple pages in a single
> HTML file. Unfortunately the start of a new page is not clearly marked
> in the HTML, so splitting up in multiple pages will not be very
> robust.
OK, I also found some code I hacked some time ago to import fragments
of HTML. Will have to put the two together to get a real solution.
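For illustration only, the two pieces mentioned above (unpacking the
stand-alone HTML and converting an HTML fragment to text) could be
combined roughly as in the Python sketch below. This is not the code
referred to in this message. The names unpack_mht, HtmlToWikiText and
html_to_text are hypothetical, and the output markup (====== heading
======, * bullet, **bold**, //italic//, {{image}}) is an assumption about
what a zim page should contain. Tables and page splitting are not handled.

```python
import email
import re
from email import policy
from html.parser import HTMLParser


def unpack_mht(data):
    """Split a MIME HTML (.mht) file into its parts.

    Returns {name: (content_type, payload_bytes)}; the name is the
    Content-Location header when present (an assumption about the export).
    """
    msg = email.message_from_bytes(data, policy=policy.default)
    parts = {}
    for i, part in enumerate(msg.walk()):
        if part.is_multipart():
            continue  # containers carry no payload of their own
        name = str(part.get("Content-Location", "part%d" % i))
        parts[name] = (part.get_content_type(), part.get_payload(decode=True))
    return parts


class HtmlToWikiText(HTMLParser):
    """Keep headings, bullets, bold/italic and images; drop everything else."""

    HEADINGS = {"h1": 6, "h2": 5, "h3": 4}  # assumed: more '=' means higher level
    INLINE = {"b": "**", "strong": "**", "i": "//", "em": "//"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.HEADINGS:
            self.out.append("\n" + "=" * self.HEADINGS[tag] + " ")
        elif tag == "li":
            self.out.append("\n* ")
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "img" and attrs.get("src"):
            self.out.append("{{" + attrs["src"] + "}}")
        elif tag in ("p", "br", "tr", "div"):
            self.out.append("\n")  # table rows end up as plain lines of text

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self.out.append(" " + "=" * self.HEADINGS[tag] + "\n")
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])

    def handle_data(self, data):
        # Collapse runs of whitespace, as a browser would
        self.out.append(re.sub(r"\s+", " ", data))

    def text(self):
        lines = [line.strip() for line in "".join(self.out).split("\n")]
        return "\n".join(line for line in lines if line) + "\n"


def html_to_text(html):
    parser = HtmlToWikiText()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A caller would read the .mht file as bytes, pass it to unpack_mht, run
html_to_text on the decoded text/html part, and save the image parts next
to the resulting page.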
What I need at this point to proceed is some test data:
* .mht export of a notebook section containing multiple pages
* include some images
* include some bullet lists
* include headings and sub-headings (level 1 / 2 / ...)
* use bold / italic / ...
Please make sure that such test data is not private and is free of
copyright, so I can eventually add it to zim's test suite. Try to make it
look like realistic notes; that makes it easier to check whether the
result looks good as well. (So far I have been using an export of
OneNote's welcome pages, which is good example data but all copyrighted
by Microsoft.)
Given good test data I can probably have a working import function in
a week or two.
Regards,
Jaap