dx-packages team mailing list archive

[Bug 885600] Re: DeeTextAnalyzer feature checklist

 

** Changed in: unity (Ubuntu)
       Status: New => Confirmed

https://bugs.launchpad.net/bugs/885600

Title:
  DeeTextAnalyzer feature checklist

Status in Dee:
  Triaged
Status in Unity:
  Triaged
Status in “dee” package in Ubuntu:
  Triaged
Status in “unity” package in Ubuntu:
  Confirmed

Bug description:
  This is a tracker bug to help me remember which features I want in
  DeeTextAnalyzer:

   - Detect numeric subsequences, e.g. "Foo125" -> "foo", "125"
   - Split on case boundaries, e.g. "CamelCase" -> "camel", "case"
   - Detect and create CJK n-grams (and tokenize CJK subsequences when embedded in non-CJK text)
