
zeitgeist team mailing list archive

[Bug 843668] [NEW] Blowing Xapian max term length corrupts index

 

Public bug reported:

Xapian has a (not very well documented) maximum term length of 245 bytes;
see e.g. http://xapian.org/docs/omega/termprefixes.html. For some reason
Xapian does not always handle this gracefully, and exceeding the limit
may occasionally corrupt the index.

This is reproducible by indexing long URLs (at least 245 bytes long). We
already had a cap at 2000 characters, but that is far above the 245-byte
term limit, so it was not good enough.
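
As a rough illustration of a tighter cap (a minimal sketch, not the actual
fix): the limit is in bytes, not characters, and any term prefix counts
towards it, so the cap has to be applied to the UTF-8 encoding of the whole
term. The name cap_term and the "U" prefix in the usage note are
hypothetical, chosen only for this example.

```python
MAX_TERM_BYTES = 245  # limit quoted in this bug report


def cap_term(prefix: str, value: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Return prefix + value, cut so its UTF-8 encoding fits in max_bytes.

    Hypothetical helper for illustration; assumes the prefix itself is
    much shorter than max_bytes.
    """
    term = prefix + value
    encoded = term.encode("utf-8")
    if len(encoded) <= max_bytes:
        return term
    # Cut at the byte limit, then drop any partial multi-byte character
    # left dangling at the end of the slice.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

With this, cap_term("U", url) would give a term whose encoded form never
exceeds 245 bytes, whereas a 2000-character cap still lets terms through
that are several times too long.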

** Affects: zeitgeist-extensions
     Importance: High
     Assignee: Mikkel Kamstrup Erlandsen (kamstrup)
         Status: Triaged

** Changed in: zeitgeist-extensions
   Importance: Undecided => High

** Changed in: zeitgeist-extensions
       Status: New => Triaged

** Changed in: zeitgeist-extensions
     Assignee: (unassigned) => Mikkel Kamstrup Erlandsen (kamstrup)

-- 
You received this bug notification because you are a member of Zeitgeist
Extensions, which is the registrant for Zeitgeist Extensions.
https://bugs.launchpad.net/bugs/843668

Title:
  Blowing Xapian max term length corrupts index

Status in Zeitgeist Extensions:
  Triaged

Bug description:
  Xapian has a (not very well documented) maximum term length of 245
  bytes; see e.g. http://xapian.org/docs/omega/termprefixes.html. For
  some reason Xapian does not always handle this gracefully, and
  exceeding the limit may occasionally corrupt the index.

  This is reproducible by indexing long URLs (at least 245 bytes long).
  We already had a cap at 2000 characters, but that is far above the
  245-byte term limit, so it was not good enough.

To manage notifications about this bug go to:
https://bugs.launchpad.net/zeitgeist-extensions/+bug/843668/+subscriptions

