
zorba-coders team mailing list archive

[Bug 1022762] Re: Multi-char escapes wrongly forbidden in character class

 

Removing the "fix" code results in the regex_err16.xq test failing. That
test is:

  fn:matches("a", "[\s-e]")

The charClassExpr is invalid because, in a character range, only a
SingleCharEsc is allowed as an end-point and \s is a MultiCharEsc. ICU
doesn't detect this, so instead of raising an error the test just
returns "false."

A proper fix for this would involve adding more state to the regex
parser so that it knows when it is within a character class *and*
within a character range, i.e.:

  if ( in_char_class && c == '-' && prev_c_was_an_esc && !prev_c_was_a_single_char_esc )
    throw an exception
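
To make the idea concrete, here is a minimal standalone sketch of that
check in C++. It is not the Zorba parser: the names is_multi_char_esc
and has_multi_char_esc_range are hypothetical, it only looks at the
start point of a range, and it ignores category escapes (\p{..}). It
also assumes that a '-' directly before ']' is a literal dash and that
a '-' directly before '[' begins a class subtraction, so neither is
treated as a range.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if c (the character after a backslash) names
// a MultiCharEsc such as \s, \d or \w.
static bool is_multi_char_esc(char c) {
  return std::string("sSiIcCdDwW").find(c) != std::string::npos;
}

// Hypothetical check: true if a MultiCharEsc is used as the start point
// of a character range inside a character class, e.g. "[\s-e]".
bool has_multi_char_esc_range(const std::string& re) {
  int class_depth = 0;          // > 0 while inside [...]
  bool prev_multi_esc = false;  // previous atom was a MultiCharEsc
  for (std::size_t i = 0; i < re.size(); ++i) {
    char c = re[i];
    if (c == '\\' && i + 1 < re.size()) {
      // Consume the escaped character and remember what kind it was.
      prev_multi_esc = is_multi_char_esc(re[++i]);
      continue;
    }
    if (c == '[') {
      ++class_depth;
    } else if (c == ']' && class_depth > 0) {
      --class_depth;
    } else if (c == '-' && class_depth > 0 && prev_multi_esc) {
      // Assumption: '-' before ']' is a literal dash and '-' before '['
      // starts a class subtraction; anything else forms a range.
      char next = i + 1 < re.size() ? re[i + 1] : ']';
      if (next != ']' && next != '[')
        return true;
    }
    prev_multi_esc = false;
  }
  return false;
}
```

With this sketch, "[\s-e]" is flagged, while "[a-z\s]", "[\n-\r]" and
"[\s-]" are not. A real fix would raise the error from inside the
parser rather than run a separate pass over the pattern.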

-- 
You received this bug notification because you are a member of Zorba
Coders, which is the registrant for Zorba.
https://bugs.launchpad.net/bugs/1022762

Title:
  Multi-char escapes wrongly forbidden in character class

Status in Zorba - The XQuery Processor:
  In Progress

Bug description:
  If you have a character range, e.g., A-Z, then the end-point chars in
  the range can be SingleCharEsc. A while ago, a "fix" was made for
  this, but the "fix" went too far and forbids MultiCharEsc within
  charClassExpr.


