com.languagecomputer.api.text
Class TextComparator
java.lang.Object
com.languagecomputer.api.text.TextComparator
- All Implemented Interfaces:
- Comparator<Text>
public class TextComparator
- extends Object
- implements Comparator<Text>
A recommended sorting Comparator for Text objects that occur in the same Document. A strict ordering of Texts is often not possible because spans can overlap in numerous different ways. However, working with sorted spans can perform considerably better than working with an unsorted Collection, and some applications require sorting in order to present results to the user. As a compromise, the TextComparator sorts by:
- Start offset, from lowest to highest
- End offset, from highest to lowest
For example, given the sentence:
King Richard I of England was known as the Lionheart.
If only the following spans were given, they would be sorted in this order:
King Richard I
Richard I of England
Richard I
England
Lionheart
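The ordering above can be reproduced with a minimal sketch. It does not use the real Text or TextComparator classes: Span, find, and BY_START_THEN_LONGEST are hypothetical names introduced here for illustration, and offsets are assumed to be half-open character positions in the sentence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SpanOrderSketch {
    /** Hypothetical stand-in for Text: a half-open [start, end) character span. */
    static final class Span {
        final int start;
        final int end;
        final String text;

        Span(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }
    }

    /** Start offset ascending, then end offset descending (longest span first). */
    static final Comparator<Span> BY_START_THEN_LONGEST = (a, b) ->
            a.start != b.start
                    ? Integer.compare(a.start, b.start)
                    : Integer.compare(b.end, a.end);

    /** Builds a span from the first occurrence of text in the sentence. */
    static Span find(String sentence, String text) {
        int start = sentence.indexOf(text);
        return new Span(start, start + text.length(), text);
    }

    /** Sorts the example spans, deliberately supplied out of order. */
    static List<String> sortedExample() {
        String sentence = "King Richard I of England was known as the Lionheart.";
        List<Span> spans = new ArrayList<>(Arrays.asList(
                find(sentence, "Lionheart"),
                find(sentence, "Richard I"),
                find(sentence, "England"),
                find(sentence, "King Richard I"),
                find(sentence, "Richard I of England")));
        spans.sort(BY_START_THEN_LONGEST);

        List<String> texts = new ArrayList<>();
        for (Span span : spans) {
            texts.add(span.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(sortedExample());
    }
}
```

Sorting the end offset from highest to lowest places the longest span first among spans that share a start offset, which is why "Richard I of England" precedes "Richard I".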
This sort ordering was chosen as a priority sort for eliminating overlapping
spans. In other words, if a system cannot deal with overlapping spans, then
one way to determine which should be kept is by iterating along a list that
uses this sort, and choosing items that do not intersect with any of the
previously chosen items. From the example above, this would yield the
following items:
King Richard I
England
Lionheart
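The selection described above might be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: Span, intersects, and selectNonOverlapping are hypothetical names, and two spans are assumed to intersect when their half-open offset ranges share at least one character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class OverlapFilterSketch {
    /** Hypothetical stand-in for Text: a half-open [start, end) character span. */
    static final class Span {
        final int start;
        final int end;
        final String text;

        Span(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }

        /** True when the two ranges share at least one character. */
        boolean intersects(Span other) {
            return start < other.end && other.start < end;
        }
    }

    /** Start offset ascending, then end offset descending. */
    static final Comparator<Span> ORDER = (a, b) ->
            a.start != b.start
                    ? Integer.compare(a.start, b.start)
                    : Integer.compare(b.end, a.end);

    /** Walks the sorted list and keeps each span that clashes with no kept span. */
    static List<Span> selectNonOverlapping(List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(ORDER);
        List<Span> kept = new ArrayList<>();
        for (Span candidate : sorted) {
            boolean clash = false;
            for (Span chosen : kept) {
                if (candidate.intersects(chosen)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    /** Builds a span from the first occurrence of text in the sentence. */
    static Span find(String sentence, String text) {
        int start = sentence.indexOf(text);
        return new Span(start, start + text.length(), text);
    }

    /** Runs the selection on the spans from the example sentence. */
    static List<String> example() {
        String sentence = "King Richard I of England was known as the Lionheart.";
        List<Span> kept = selectNonOverlapping(Arrays.asList(
                find(sentence, "Richard I"),
                find(sentence, "Lionheart"),
                find(sentence, "Richard I of England"),
                find(sentence, "England"),
                find(sentence, "King Richard I")));
        List<String> texts = new ArrayList<>();
        for (Span span : kept) {
            texts.add(span.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

Because the first span kept is the earliest and longest one, "Richard I of England" is discarded even though it would otherwise cover "England"; this is the trade-off of a greedy selection.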
No single sort ordering or non-intersecting span selection algorithm fits all needs, so this class is provided merely as an aid, showing one way this may be done.
NOTE: Again, this Comparator assumes that all of the Texts to be compared exist within the same Document. Attempting to use the TextComparator with Texts from different Documents will result in an IllegalArgumentException being thrown. Specifically, the Text.getDocument() method must return the exact same object for all compared spans.
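A comparator enforcing this same-document rule could look roughly like the sketch below. How TextComparator performs the check internally is not documented here, so this is only an assumption: Doc, Span, and CHECKED are hypothetical names, and "the exact same object" is interpreted as a reference (==) comparison.

```java
import java.util.Comparator;

class SameDocumentSketch {
    /** Hypothetical stand-in for Document. */
    static final class Doc {
        final String name;

        Doc(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for Text: a span that remembers its document. */
    static final class Span {
        final Doc doc;
        final int start;
        final int end;

        Span(Doc doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }
    }

    /** Rejects spans from different document objects, then applies the ordering. */
    static final Comparator<Span> CHECKED = (a, b) -> {
        if (a.doc != b.doc) { // identity check, not equals()
            throw new IllegalArgumentException("Spans belong to different documents");
        }
        if (a.start != b.start) {
            return Integer.compare(a.start, b.start);
        }
        return Integer.compare(b.end, a.end);
    };

    /** Returns true if comparing spans from two documents is rejected. */
    static boolean rejectsMixedDocuments() {
        Doc first = new Doc("first");
        Doc second = new Doc("second");
        try {
            CHECKED.compare(new Span(first, 0, 4), new Span(second, 0, 4));
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    /** Compares two spans of one document; negative means the first sorts earlier. */
    static int compareWithinOneDocument() {
        Doc doc = new Doc("only");
        return CHECKED.compare(new Span(doc, 0, 14), new Span(doc, 5, 25));
    }

    public static void main(String[] args) {
        System.out.println(rejectsMixedDocuments());
        System.out.println(compareWithinOneDocument() < 0);
    }
}
```

Under this interpretation, two separately loaded copies of the same document would still count as different documents, so callers should take all spans from a single Document instance.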
- Since:
- 1.0
- Author:
- Kirk Roberts
Method Summary
int compare(Text text1, Text text2)
Compares the two Text objects according to the sorting algorithm described above.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
INSTANCE
public static final TextComparator INSTANCE
TextComparator
public TextComparator()
compare
public int compare(Text text1, Text text2)
- Compares the two Text objects according to the sorting algorithm described above.
- Specified by:
- compare in interface Comparator<Text>
- Parameters:
- text1 - The first Text to compare.
- text2 - The second Text to compare.
- Returns:
- A number less than 0 if text1 should be ordered before text2, greater than 0 if it should be ordered after text2, or 0 if they are equivalent and their relative order is not important.
Copyright © 2009. All Rights Reserved.