c2c-oerpscenario team mailing list archive
Message #25040
[Bug 787908] [NEW] sxw2rml cannot support for Simplified Chinese Version OpenOffice 1.0 document
Public bug reported:
I used the following Python script to convert an sxw (OpenOffice 1.0) document to an RML document.
<pre>
# Python 2 script: convert an OpenOffice 1.0 (sxw) or ODT document to RML.
import zipfile
import libxml2
import libxslt
from pyopenoffice import PyOpenOffice

fname = r'c:\test.sxw'

# Pick the stylesheet from the package's mimetype entry.
xsl_file = './normalized_oo2rml.xsl'
z = zipfile.ZipFile(fname, 'r')
mimetype = z.read('mimetype')
if mimetype.split('/')[-1] == 'vnd.oasis.opendocument.text':
    xsl_file = './normalized_odt2rml.xsl'
xsl = file(xsl_file).read()

# Unpack the document and normalize its XML.
tool = PyOpenOffice('.', save_pict=False)
res = tool.unpackNormalize(fname)

# Apply the XSLT stylesheet and print the resulting RML.
styledoc = libxml2.parseDoc(xsl)
style = libxslt.parseStylesheetDoc(styledoc)
doc = libxml2.parseMemory(res, len(res))
result = style.applyStylesheet(doc, None)
print result
</pre>
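The stylesheet selection step in the script can be tried without a real document. The following is a minimal sketch for Python 3, not the converter's own code: the helper names `pick_stylesheet` and `make_package` are hypothetical, and the in-memory zip files only carry a `mimetype` entry.

```python
import io
import zipfile

SXW_XSL = "./normalized_oo2rml.xsl"
ODT_XSL = "./normalized_odt2rml.xsl"


def pick_stylesheet(source):
    """Return the stylesheet path matching the package's mimetype entry."""
    with zipfile.ZipFile(source, "r") as package:
        mimetype = package.read("mimetype").decode("ascii").strip()
    # Same test as the script: compare the part after the last "/".
    if mimetype.split("/")[-1] == "vnd.oasis.opendocument.text":
        return ODT_XSL
    return SXW_XSL


def make_package(mimetype):
    """Build a tiny in-memory zip that only carries a mimetype entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("mimetype", mimetype)
    buf.seek(0)
    return buf


sxw_choice = pick_stylesheet(make_package("application/vnd.sun.xml.writer"))
odt_choice = pick_stylesheet(
    make_package("application/vnd.oasis.opendocument.text"))
print(sxw_choice)
print(odt_choice)
```

Anything that is not an ODT package falls back to the sxw stylesheet, which matches the default in the script.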
There is a bug in the minidom-based Python code, and I fixed it.
In tiny_sxw2rml.pdf (5.x) or openerp_sxw2rml.pdf, I found this code:
<pre>
styles_styles = self.styles_dom.getElementsByTagName("style:style")
</pre>
I fixed it like this:
<pre>
....
styles_styles = []
styles_styles = styles_styles + self.styles_dom.getElementsByTagName("style:style")
styles_styles = styles_styles + self.styles_dom.getElementsByTagName("style:font-decl")
....
</pre>
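The effect of collecting both `style:style` and `style:font-decl` elements can be checked on a small document. This is a minimal sketch for Python 3, assuming a trimmed-down styles.xml: the sample XML, its attribute values, and the helper name `collect_style_nodes` are hypothetical and are not taken from the converter.

```python
import xml.dom.minidom

# Hypothetical, trimmed-down styles.xml from an OpenOffice 1.0 document.
SAMPLE = """<office:document-styles
    xmlns:office="http://openoffice.org/2000/office"
    xmlns:style="http://openoffice.org/2000/style"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <office:font-decls>
    <style:font-decl style:name="宋体" fo:font-family="宋体"/>
    <style:font-decl style:name="黑体" fo:font-family="黑体"/>
  </office:font-decls>
  <office:styles>
    <style:style style:name="Standard" style:family="paragraph"/>
  </office:styles>
</office:document-styles>"""


def collect_style_nodes(dom, tag_names=("style:style", "style:font-decl")):
    """Gather the elements for every tag name into one plain list."""
    nodes = []
    for tag in tag_names:
        nodes.extend(dom.getElementsByTagName(tag))
    return nodes


dom = xml.dom.minidom.parseString(SAMPLE.encode("utf-8"))
nodes = collect_style_nodes(dom)
names = [node.getAttribute("style:name") for node in nodes]
print(names)
```

With only `style:style` in the tag list, the two font declarations would not appear in the result, which is the gap the fix above closes.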
There is also some trouble with the "content_styles" variable...
In the normalized_oo2rml.xsl document, I found this code:
<pre>
<xsl:when test="not($fontName='') and boolean($fontName)">
....
<xsl:when test="contains($fontName,'Courier')">
...
<xsl:when test="contains($fontName,'Helvetica') or contains($fontName,'Arial') or contains($fontName,'Sans')">
...
<xsl:otherwise> <-------------------- Otherwise 1
...
<xsl:otherwise> <-------------------- Otherwise 2
...
</pre>
In a Simplified Chinese OpenOffice 1.0 document, the "fontName" is "宋体" or "黑体".
With my "test.sxw" file, normalized_oo2rml.xsl matches "Otherwise 2", so the sxw file's "宋体" fontName is replaced with "Times-Roman".
So, how can this be fixed, and how can a docinit/registerFont node be added to the generated RML file, so that OpenERP can convert Simplified Chinese OpenOffice 1.0 documents to RML?
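One possible direction, sketched in Python 3 rather than in the stylesheet itself: check for known Chinese font names before the Courier/Helvetica tests, and emit a docinit block that registers a TrueType font for each one. Everything here is an assumption made for illustration: the `CJK_FONTS` mapping, the font file paths, the base font names returned by each branch (the stylesheet bodies are elided above), and the `fontName`/`fontFile` attribute names on `registerFont`.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical mapping: OpenOffice font name -> (RML font name, TTF path).
# The paths are placeholders, not real locations.
CJK_FONTS = {
    "宋体": ("SimSun", "/path/to/simsun.ttf"),
    "黑体": ("SimHei", "/path/to/simhei.ttf"),
}


def map_font(font_name):
    """Mimic the xsl:choose cascade, with a CJK branch checked first."""
    if not font_name:
        return "Times-Roman"  # fallback when no font name is available
    if font_name in CJK_FONTS:
        return CJK_FONTS[font_name][0]
    if "Courier" in font_name:
        return "Courier"  # assumed output of the Courier branch
    if any(key in font_name for key in ("Helvetica", "Arial", "Sans")):
        return "Helvetica"  # assumed output of the sans-serif branch
    return "Times-Roman"  # fallback for unrecognized names


def build_docinit(font_names):
    """Build a docinit fragment registering each known CJK font once."""
    lines = ["<docinit>"]
    seen = set()
    for name in font_names:
        if name in CJK_FONTS and name not in seen:
            seen.add(name)
            rml_name, path = CJK_FONTS[name]
            lines.append("  <registerFont fontName=%s fontFile=%s/>"
                         % (quoteattr(rml_name), quoteattr(path)))
    lines.append("</docinit>")
    return "\n".join(lines)


docinit = build_docinit(["宋体", "Arial", "黑体", "宋体"])
print(map_font("宋体"))
print(docinit)
```

The same two steps would have to be expressed in normalized_oo2rml.xsl for the converter to produce them directly: an extra `xsl:when` for the Chinese names, and a template that writes the docinit node into the generated RML.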
Thanks...
mrshelly
2011/05/25
** Affects: openobject-server
Importance: Undecided
Status: New
** Tags: fontname mrshelly report rml sxw2rml
--
You received this bug notification because you are a member of C2C
OERPScenario, which is subscribed to the OpenERP Project Group.
https://bugs.launchpad.net/bugs/787908
Title:
sxw2rml cannot support for Simplified Chinese Version OpenOffice 1.0
document
Status in OpenERP Server:
New