zorba-coders team mailing list archive

[Bug 924063] Re: Sentence is incorrectly incremented when token characters end without sentence terminator, take 2

To: zorba-coders@xxxxxxxxxxxxxxxxxxx
From: "Paul J. Lucas" <924063@xxxxxxxxxxxxxxxxxx>
Date: Tue, 31 Jan 2012 15:30:10 -0000
Reply-to: Bug 924063 <924063@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

It turns out that the original bug fixes were correct. ICU uses more
than just sentence-terminating characters (like '.') to decide where a
sentence ends: the first letter of the first word after the '.' must
also be capitalized. Hence the tests themselves were wrong, e.g., the
one using "hello. world". Once that test was changed to "Hello. World",
it passed.
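
As a rough illustration of the distinction (a sketch only: the two
queries below are hypothetical and are not taken from the Zorba test
suite), the difference can be expressed with the same full-text
operators used in the bug description. If the tokenizer follows ICU as
described above, both expressions should evaluate to true.

```xquery
(: Hypothetical inputs, chosen only to contrast the two cases. :)
let $lower := <msg>hello. world</msg>
let $upper := <msg>Hello. World</msg>
return (
  (: Lowercase after '.': ICU is expected to report no sentence break,
     so both words fall in the same sentence. :)
  $lower contains text "hello" ftand "world" same sentence,
  (: Capitalized after '.': ICU is expected to report a sentence break,
     so the words fall in different sentences. :)
  $upper contains text "hello" ftand "world" different sentence
)
```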

** Branch linked: lp:~paul-lucas/zorba/bug-924063

-- 
You received this bug notification because you are a member of Zorba
Coders, which is the registrant for Zorba.
https://bugs.launchpad.net/bugs/924063

Title:
  Sentence is incorrectly incremented when token characters end without
  sentence terminator, take 2

Status in Zorba - The XQuery Processor:
  In Progress

Bug description:
  The original bug (bug #863320) was fixed, but the fix caused other
  tests to fail (bug #897800), so it was reverted to allow the release
  to go ahead. This new bug is to fix the original bug without causing
  any other tests to fail.

  The original bug was:

  The following query:

  let $x := <msg>hello world</msg>
  return $x contains text "hello" ftand "world" same sentence

  incorrectly returns "false" because the tokenizer increments the
  sentence number when it runs out of characters, even though it has
  not encountered a sentence-terminating character.

To manage notifications about this bug go to:
https://bugs.launchpad.net/zorba/+bug/924063/+subscriptions
