dhis2-devs team mailing list archive

Re: [Bug 1438624] Re: XML dxf2 import: Use of ampersand (&) causes crash

 

Hi Calle,

There are only five predefined entities in XML:
"   &quot;
'   &apos;
<   &lt;
>   &gt;
&   &amp;
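
As a rough illustration of the five replacements listed above, here is a minimal Python sketch that escapes a comment before it is written into an XML attribute and then checks whether the result parses. The helper names (escape_xml, is_well_formed) and the dataValue/comment snippet are only assumptions for the example, not DHIS 2 code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() already handles &, < and >; the two quote characters
# are passed in explicitly so all five special characters are covered.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_xml(text):
    """Return text with the five special characters replaced by entities."""
    return escape(text, QUOTES)


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical comment value and element, for illustration only.
comment = "Mother & child, age <5"
raw = '<dataValue comment="' + comment + '"/>'               # unescaped
safe = '<dataValue comment="' + escape_xml(comment) + '"/>'  # escaped

print(is_well_formed(raw))   # False
print(is_well_formed(safe))  # True
```

Parsing the escaped version gives back the original comment text, so nothing is lost by escaping.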

Special characters (e.g. ï) should always be encoded as UTF-8; otherwise
there may be issues. Often these characters are not entered properly,
and then they cannot be rendered directly as an escaped UTF-8 sequence in XML.
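
To illustrate the encoding point, here is a small Python sketch, assuming a file that declares UTF-8 in its XML header. The same text is saved once as UTF-8 and once as Latin-1; the try_parse helper and the dataValue snippet are made up for the example.

```python
import xml.etree.ElementTree as ET

# "\u00ef" is the character ï. The document declares UTF-8 in its header.
doc = '<?xml version="1.0" encoding="UTF-8"?><dataValue comment="na\u00efve"/>'

good = doc.encode("utf-8")    # bytes match the declared encoding
bad = doc.encode("latin-1")   # same text, but saved in a different encoding


def try_parse(data):
    """Return the comment attribute, or None if the bytes do not parse."""
    try:
        return ET.fromstring(data).get("comment")
    except ET.ParseError:
        return None


print(try_parse(good) is not None)  # True: ï needs no escaping in UTF-8
print(try_parse(bad) is not None)   # False: bytes are not valid UTF-8
```

The character itself is fine in XML; the failure comes only from the mismatch between the declared encoding and the bytes actually written.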

Regards,
Jason

On Tue, Mar 31, 2015 at 1:55 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
wrote:

> Hi
>
> OK, noted - but the list of illegal characters should then be highlighted
> in the manual.
>
> I recall now that I've come across this before with other special French
> umlaut type characters, actually...
>
> Do you have a complete lookup table with the invalid character and the
> equivalent escape sequence available, that we can use to scan the exported
> text?
>
> Regards
> Calle
>
> On 31 March 2015 at 13:42, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>
>> Lars is right.  Any XML processor is going to choke on those
>> characters, and there is nothing one could do in DHIS2 code to prevent
>> it.  Basically it's just not valid XML.
>>
>> On 31 March 2015 at 12:05, Lars Helge Øverland <larshelge@xxxxxxxxx>
>> wrote:
>> > Hi Calle,
>> >
>> > this is not really a DHIS 2 bug.  The XML spec clearly defines that
>> > special characters such as < and & must be escaped as &lt; and &amp; .
>> > When you export such values from DHIS 2, such characters are properly
>> > escaped.
>> >
>> >
>> > regards,
>> >
>> > Lars
>> >
>> >
>> >
>> > ** Changed in: dhis2
>> >        Status: New => Invalid
>> >
>> > --
>> > You received this bug notification because you are a member of DHIS 2
>> > developers, which is subscribed to DHIS.
>> > https://bugs.launchpad.net/bugs/1438624
>> >
>> > Title:
>> >   XML dxf2 import: Use of ampersand (&) causes crash
>> >
>> > Status in DHIS 2:
>> >   Invalid
>> >
>> > Bug description:
>> >   When importing a dxf2 (XML) data file that includes one or more
>> >   ampersands in e.g. the comment field, it causes a crash. See the
>> >   tomcat log below - the import process is obviously unable to handle
>> >   the ampersand (code 32) character - and possibly a number of other
>> >   characters too.
>> >
>> >   None of this is mentioned in the manuals, though, where the "comment"
>> >   field is only referred to as a "free text" field with no further
>> >   restrictions.
>> >
>> >   There are two solutions here:
>> >   1. The import process is modified to handle "unexpected" characters
>> >      automatically
>> >   2. A list of "unexpected" characters is published as prohibited (and
>> >      they should be disallowed within DHIS2 also, then) with a more
>> >      user-friendly error message when they are encountered.
>> >
>> >   Regards
>> >   Calle
>> >
>> >   * ERROR 2015-03-31 12:00:04,003 org.amplecode.staxwax.XMLException: Failed to move to start element
>> >           at org.amplecode.staxwax.reader.DefaultXMLStreamReader.moveToStartElement(DefaultXMLStreamReader.java:130)
>> >           at org.hisp.dhis.dxf2.datavalueset.StreamingDataValueSet.hasNextDataValue(StreamingDataValueSet.java:136)
>> >           at org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSet(DefaultDataValueSetService.ja
>> >           at org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSet(DefaultDataValueSetService.ja
>> >           at org.hisp.dhis.importexport.action.util.ImportDataValueTask.run(ImportDataValueTask.java:91)
>> >           at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnabl
>> >           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> >           at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> >           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecut
>> >           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java
>> >           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >           at java.lang.Thread.run(Thread.java:745)
>> >   Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32) (missing name?)
>> >    at [row,col {unknown-source}]: [15,162]
>> >           at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:647)
>> >           at com.ctc.wstx.sr.StreamScanner.parseFullName(StreamScanner.java:1931)
>> >           at com.ctc.wstx.sr.StreamScanner.parseEntityName(StreamScanner.java:2057)
>> >           at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1525)
>> >           at com.ctc.wstx.sr.BasicStreamReader.parseAttrValue(BasicStreamReader.java:1938)
>> >           at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3065)
>> >           at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2963)
>> >           at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2839)
>> >           at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1073)
>> >           at org.amplecode.staxwax.reader.DefaultXMLStreamReader.moveToStartElement(DefaultXMLStreamReader.java:113)
>> >
>> > To manage notifications about this bug go to:
>> > https://bugs.launchpad.net/dhis2/+bug/1438624/+subscriptions
>> >
>> > _______________________________________________
>> > Mailing list: https://launchpad.net/~dhis2-devs
>> > Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>> > Unsubscribe : https://launchpad.net/~dhis2-devs
>> > More help   : https://help.launchpad.net/ListHelp
>>
>>
>
>
>
> --
>
> *******************************************
>
> Calle Hedberg
>
> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>
> Tel/fax (home): +27-21-685-6472
>
> Cell: +27-82-853-5352
>
> Iridium SatPhone: +8816-315-19274
>
> Email: calle.hedberg@xxxxxxxxx
>
> Skype: calle_hedberg
>
> *******************************************
>
>
>
>


-- 
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049
