qpdfview team mailing list archive
-
qpdfview team
-
Mailing list archive
-
Message #00697
Re: Request for review of DjVu method "findText"
Hello again,
Am 24.01.2015 um 00:20 schrieb Adam Reichold:
> Hello Razi,
>
> Am 23.01.2015 um 23:37 schrieb Seyyed Razi Alavizadeh:
>> Hello Adam,
>>
>> Sorry for my late reply.
>
> No, thanks for looking into it!
>
>> I tested search with trunk 1834:
>> If I enable "Whole Words" and I search for "a space" then results
>> contains something like for example "a KB-space", I'm not sure if it is
>> expected or not (?) but it's because "wordBegins&&wordEnds" is true when
>> partial matching for "space".
>
> Yes, the thing is that after your last change, we began to accept
> non-letter (instead of whitespace) as word boundaries which I found
> pretty useful. Of course this does not match what DjVuLibre thinks are
> words and so we have two (DjVu) words "a" and "KB-space" on which the
> second one fits when using whole words matching because of the hyphen.
>
> Personally, I think that is actually useful and we can live with the
> slightly larger number of false positives. Otherwise we'd have to revert
> back to match our words with DjVuLibre's words.
>
>> Another issue is highlighting search results within tree-view when
>> searching "several words" with "Whole Words" disabled.
>> For example the search phrase is "a space" one of result is "a subspace"
>> that is not highlighted.
>
> Yes, that seems because "emphasizeText" does not split the text to
> search for into individual words before highlighting. I see a few
> options here:
>
> 1) Make the "emphasizeText" function exactly as complicated as DjVu's
> "findText" function with the possibility of breaking synchronicity with
> other plugins.
>
> 2) Make the "findText" function's matching less fuzzy only finding
> proper substrings (with consecutive whitespace being counted as one
> symbol) and stop matching something like "a space" with "a KB-space".
>
> 3) Not only query the surroundingText but also the text to be
> highlighted using "Page::text" with the actual result rectangle which
> can then be highlighted a proper substring.
>
> 4) Ignore the differences in highlighting the more outlying cases.
>
> Personally, I dislike option 2) and feel like 3) is the cleanest way of
> going about this (also w.r.t. to other plugins) but I am unsure about
> the performance implications. 1) may be a practical compromise. What do
> you think?
Attached, you also find what I would consider a clean approach to
separating the plug-in internal matching and the text highlighting for
you to have a look at and consider performance-wise.
Best regards, Adam.
>> Best Regards,
>> Razi.
>>
>
> Best regards, Adam.
>
>>
>> 2015-01-11 18:32 GMT+03:30 Adam Reichold <adam.reichold@xxxxxxxxxxx
>> <mailto:adam.reichold@xxxxxxxxxxx>>:
>>
>> Hello Razi,
>>
>> I am currently working on getting the whole-words search function of
>> Poppler exposed in the Qt frontends. I did already add the necessary
>> UI elements to qpdfview and started work on the DjVu backend.
>>
>> As you know, the DjVu search function, i.e. the "findText" method, has
>> always been pretty "hand waving". To make it fulfil the requirements
>> made by the UI at least to some degree, I extended it to respect the
>> whole words flag and I also tried to restore the matching of search
>> texts that span several words (which was gone after the last change to
>> make it match words within words at non-letters).
>>
>> It is now pretty complicated again (doing a word-by-word possibly with
>> words being substrings of DjVu words match) and I would be grateful if
>> you could review it as well when you find some spare time to do so.
>> Thanks!
>>
>> Best regards, Adam.
>>
>> --
>> Mailing list: https://launchpad.net/~qpdfview
>> Post to : qpdfview@xxxxxxxxxxxxxxxxxxx
>> <mailto:qpdfview@xxxxxxxxxxxxxxxxxxx>
>> Unsubscribe : https://launchpad.net/~qpdfview
>> More help : https://help.launchpad.net/ListHelp
>>
>>
>>
>>
>> --
>> Alavizadeh, Sayed Razi
>> My Blog: http://pozh.org <http://pozh.org/>
>> Saaghar (نرمافزار شعر): http://saaghar.pozh.org/
>> Saaghar Fan Page: http://www.facebook.com/saaghar.p
>> Saaghar Mailing List: http://groups.google.com/group/saaghar
>>
>
>
>
=== modified file 'sources/documentview.cpp'
--- sources/documentview.cpp 2015-01-11 13:17:34 +0000
+++ sources/documentview.cpp 2015-01-24 11:07:39 +0000
@@ -820,11 +820,11 @@
return m_searchTask->wholeWords();
}
-QString DocumentView::surroundingText(int page, const QRectF& rect) const
+QPair< QString, QString > DocumentView::surroundingText(int page, const QRectF& rect) const
{
if(page < 1 || page > m_pages.size() || rect.isEmpty())
{
- return QString();
+ return qMakePair(QString(), QString());
}
// Fetch at most half of a line as centered on the given rectangle as possible.
@@ -834,7 +834,10 @@
const QRectF surroundingRect(x, rect.top(), width, rect.height());
- return m_pages.at(page - 1)->text(surroundingRect).simplified();
+ const QString& matchedText = m_pages.at(page - 1)->text(rect).simplified();
+ const QString& surroundingText = m_pages.at(page - 1)->text(surroundingRect).simplified();
+
+ return qMakePair(matchedText, surroundingText);
}
void DocumentView::show()
=== modified file 'sources/documentview.h'
--- sources/documentview.h 2015-01-11 13:17:34 +0000
+++ sources/documentview.h 2015-01-24 11:05:05 +0000
@@ -134,7 +134,7 @@
bool searchMatchCase() const;
bool searchWholeWords() const;
- QString surroundingText(int page, const QRectF& rect) const;
+ QPair< QString, QString > surroundingText(int page, const QRectF& rect) const;
signals:
void documentChanged();
=== modified file 'sources/miscellaneous.cpp'
--- sources/miscellaneous.cpp 2015-01-17 14:08:33 +0000
+++ sources/miscellaneous.cpp 2015-01-24 11:18:38 +0000
@@ -56,7 +56,7 @@
return true;
}
-void emphasizeText(const QString& text, bool matchCase, bool wholeWords, const QString& surroundingText, QTextLayout& textLayout)
+void emphasizeText(const QString& matchedText, const QString& surroundingText, QTextLayout& textLayout)
{
QFont font = textLayout.font();
font.setWeight(QFont::Light);
@@ -65,30 +65,14 @@
QList< QTextLayout::FormatRange > additionalFormats;
- const Qt::CaseSensitivity sensitivity = matchCase ? Qt::CaseSensitive : Qt::CaseInsensitive;
- int index = 0;
-
- while((index = surroundingText.indexOf(text, index, sensitivity)) != -1)
+ for(int index = 0; (index = surroundingText.indexOf(matchedText, index)) != -1; index += matchedText.length())
{
- const int nextIndex = index + text.length();
-
- const bool wordBegins = index == 0 || !surroundingText.at(index - 1).isLetterOrNumber();
- const bool wordEnds = nextIndex == surroundingText.length() || !surroundingText.at(nextIndex).isLetterOrNumber();
-
- if(!wholeWords || (wordBegins && wordEnds))
- {
- QTextLayout::FormatRange formatRange;
- formatRange.start = index;
- formatRange.length = text.length();
- formatRange.format.setFontWeight(QFont::Bold);
-
- additionalFormats.append(formatRange);
- }
-
- if((index = nextIndex) >= surroundingText.length())
- {
- break;
- }
+ QTextLayout::FormatRange formatRange;
+ formatRange.start = index;
+ formatRange.length = matchedText.length();
+ formatRange.format.setFontWeight(QFont::Bold);
+
+ additionalFormats.append(formatRange);
}
textLayout.setAdditionalFormats(additionalFormats);
@@ -647,15 +631,12 @@
return;
}
- const QString text = index.data(SearchModel::TextRole).toString();
+ const QString matchedText = index.data(SearchModel::MatchedTextRole).toString();
const QString surroundingText = index.data(SearchModel::SurroundingTextRole).toString();
- if(!text.isEmpty() && !surroundingText.isEmpty())
+ if(!matchedText.isEmpty() && !surroundingText.isEmpty())
{
- const bool matchCase = index.data(SearchModel::MatchCaseRole).toBool();
- const bool wholeWords = index.data(SearchModel::WholeWordsRole).toBool();
-
- paintSurroundingText(painter, option, text, surroundingText, matchCase, wholeWords);
+ paintSurroundingText(painter, option, matchedText, surroundingText);
return;
}
}
@@ -675,7 +656,7 @@
}
void SearchItemDelegate::paintSurroundingText(QPainter* painter, const QStyleOptionViewItem& option,
- const QString& text, const QString& surroundingText, bool matchCase, bool wholeWords) const
+ const QString& matchedText, const QString& surroundingText) const
{
const int textMargin = QApplication::style()->pixelMetric(QStyle::PM_FocusFrameHMargin) + 1;
const QRect textRect = option.rect.adjusted(textMargin, 0, -textMargin, 0);
@@ -691,7 +672,7 @@
textLayout.setText(elidedText);
textLayout.setFont(option.font);
- emphasizeText(text, matchCase, wholeWords, surroundingText, textLayout);
+ emphasizeText(matchedText, surroundingText, textLayout);
textLayout.beginLayout();
=== modified file 'sources/miscellaneous.h'
--- sources/miscellaneous.h 2015-01-17 12:47:36 +0000
+++ sources/miscellaneous.h 2015-01-24 11:17:40 +0000
@@ -345,7 +345,7 @@
void paintProgress(QPainter* painter, const QStyleOptionViewItem& option,
int progress) const;
void paintSurroundingText(QPainter* painter, const QStyleOptionViewItem& option,
- const QString& text, const QString& surroundingText, bool matchCase, bool wholeWords) const;
+ const QString& matchedText, const QString& surroundingText) const;
};
=== modified file 'sources/searchmodel.cpp'
--- sources/searchmodel.cpp 2015-01-11 13:17:34 +0000
+++ sources/searchmodel.cpp 2015-01-24 11:19:51 +0000
@@ -174,6 +174,8 @@
return view->searchMatchCase();
case WholeWordsRole:
return view->searchWholeWords();
+ case MatchedTextRole:
+ return fetchMatchedText(view, result);
case SurroundingTextRole:
return fetchSurroundingText(view, result);
case Qt::ToolTipRole:
@@ -377,7 +379,9 @@
return;
}
- m_textCache.insert(job.key, job.object, job.object->length());
+ const int cost = job.object->first.length() + job.object->second.length();
+
+ m_textCache.insert(job.key, job.object, cost);
emit dataChanged(createIndex(0, 0, view), createIndex(results->count() - 1, 0, view));
}
@@ -424,19 +428,28 @@
return createIndex(row, 0);
}
+QString SearchModel::fetchMatchedText(DocumentView* view, const SearchModel::Result& result) const
+{
+ const TextCacheObject* object = fetchText(view, result);
+
+ return object != 0 ? object->first : QString();
+}
+
QString SearchModel::fetchSurroundingText(DocumentView* view, const Result& result) const
{
- if(view == 0)
- {
- return QString();
- }
-
+ const TextCacheObject* object = fetchText(view, result);
+
+ return object != 0 ? object->second : QString();
+}
+
+const SearchModel::TextCacheObject* SearchModel::fetchText(DocumentView* view, const SearchModel::Result& result) const
+{
const TextCacheKey key = textCacheKey(view, result);
const TextCacheObject* object = m_textCache.object(key);
if(object != 0)
{
- return *object;
+ return object;
}
if(m_textWatchers.size() < 20 && !m_textWatchers.contains(key))
@@ -449,7 +462,7 @@
watcher->setFuture(QtConcurrent::run(textJob, key, result));
}
- return QLatin1String("...");
+ return 0;
}
inline SearchModel::TextCacheKey SearchModel::textCacheKey(DocumentView* view, const Result& result)
@@ -465,9 +478,9 @@
SearchModel::TextJob SearchModel::textJob(const TextCacheKey& key, const Result& result)
{
- const QString surroundingText = key.first->surroundingText(result.first, result.second);
+ const QPair< QString, QString > text = key.first->surroundingText(result.first, result.second);
- return TextJob(key, new QString(surroundingText));
+ return TextJob(key, new TextCacheObject(text));
}
} // qpdfview
=== modified file 'sources/searchmodel.h'
--- sources/searchmodel.h 2015-01-11 13:17:34 +0000
+++ sources/searchmodel.h 2015-01-24 11:14:34 +0000
@@ -58,6 +58,7 @@
TextRole,
MatchCaseRole,
WholeWordsRole,
+ MatchedTextRole,
SurroundingTextRole
};
@@ -107,7 +108,7 @@
typedef QPair< DocumentView*, QByteArray > TextCacheKey;
- typedef QString TextCacheObject;
+ typedef QPair< QString, QString > TextCacheObject;
struct TextJob
{
@@ -124,8 +125,11 @@
mutable QCache< TextCacheKey, TextCacheObject > m_textCache;
mutable QHash< TextCacheKey, TextWatcher* > m_textWatchers;
+ QString fetchMatchedText(DocumentView* view, const Result& result) const;
QString fetchSurroundingText(DocumentView* view, const Result& result) const;
+ const TextCacheObject* fetchText(DocumentView* view, const Result& result) const;
+
static TextCacheKey textCacheKey(DocumentView* view, const Result& result);
static TextJob textJob(const TextCacheKey& key, const Result& result);
Attachment:
signature.asc
Description: OpenPGP digital signature
References