com.languagecomputer.api.text
Class TextComparator

java.lang.Object
  extended by com.languagecomputer.api.text.TextComparator
All Implemented Interfaces:
Comparator<Text>

public class TextComparator
extends Object
implements Comparator<Text>

A recommended sorting Comparator for Text objects that occur in the same Document. A strict ordering of Texts is often not possible, because spans can overlap in many different ways. However, working with a sorted Collection can perform much better than working with an unsorted one, and some applications require sorting in order to present results to the user. As a compromise, the TextComparator sorts by:

  1. Start offset, from lowest to highest
  2. End offset, from highest to lowest
For example, given the sentence:
   King Richard I of England was known as the Lionheart.
 
these spans, if they were the only ones given, would be sorted in the following order:
  1. King Richard I
  2. Richard I of England
  3. Richard I
  4. England
  5. Lionheart
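The ordering above can be reproduced with a small, self-contained sketch. It does not use the real Text class: Span, spanOf, and sortedTexts are hypothetical stand-ins introduced only for illustration, and spans are assumed to be half-open character ranges [start, end).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SpanOrderSketch {

    /** Hypothetical stand-in for Text: a half-open character span [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Start offset ascending; ties broken by end offset descending. */
    static final Comparator<Span> ORDER = new Comparator<Span>() {
        @Override
        public int compare(Span a, Span b) {
            if (a.start != b.start) {
                return Integer.compare(a.start, b.start);
            }
            return Integer.compare(b.end, a.end);
        }
    };

    /** Locates the first occurrence of phrase in sentence and wraps it as a Span. */
    static Span spanOf(String sentence, String phrase) {
        int start = sentence.indexOf(phrase);
        return new Span(start, start + phrase.length());
    }

    /** Sorts a copy of the spans and returns the text each one covers. */
    static List<String> sortedTexts(String sentence, List<Span> spans) {
        List<Span> sorted = new ArrayList<Span>(spans);
        Collections.sort(sorted, ORDER);
        List<String> texts = new ArrayList<String>();
        for (Span s : sorted) {
            texts.add(sentence.substring(s.start, s.end));
        }
        return texts;
    }

    public static void main(String[] args) {
        String sentence = "King Richard I of England was known as the Lionheart.";
        // Deliberately unsorted input.
        List<Span> spans = Arrays.asList(
                spanOf(sentence, "Lionheart"),
                spanOf(sentence, "Richard I"),
                spanOf(sentence, "England"),
                spanOf(sentence, "King Richard I"),
                spanOf(sentence, "Richard I of England"));
        for (String text : sortedTexts(sentence, spans)) {
            System.out.println(text);
        }
    }
}
```

With real Text objects, the equivalent call would presumably be a sort that passes TextComparator.INSTANCE as the comparator.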
This sort ordering was chosen as a priority sort for eliminating overlapping spans. In other words, if a system cannot deal with overlapping spans, one way to determine which spans to keep is to iterate over a list sorted with this ordering and choose each item that does not intersect any previously chosen item. From the example above, this would yield the following items:
  1. King Richard I
  2. England
  3. Lionheart
One sort ordering and one non-intersecting span selection algorithm clearly do not fit all needs, so this class is merely provided as an aid to the user, showing one way this may be done.
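The selection procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: Span, intersects, and select are hypothetical names, spans are treated as half-open ranges [start, end), and two spans are taken to intersect when they share at least one character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class NonOverlappingSelectionSketch {

    /** Hypothetical stand-in for Text: a half-open character span [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Start offset ascending; ties broken by end offset descending. */
    static final Comparator<Span> ORDER = new Comparator<Span>() {
        @Override
        public int compare(Span a, Span b) {
            if (a.start != b.start) {
                return Integer.compare(a.start, b.start);
            }
            return Integer.compare(b.end, a.end);
        }
    };

    /** Assumed definition: two spans intersect if they share a character. */
    static boolean intersects(Span a, Span b) {
        return a.start < b.end && b.start < a.end;
    }

    /** Sorts the phrases' spans, then keeps each span that hits no kept span. */
    static List<String> select(String sentence, List<String> phrases) {
        List<Span> spans = new ArrayList<Span>();
        for (String phrase : phrases) {
            int start = sentence.indexOf(phrase);
            spans.add(new Span(start, start + phrase.length()));
        }
        Collections.sort(spans, ORDER);

        List<Span> kept = new ArrayList<Span>();
        for (Span candidate : spans) {
            boolean clash = false;
            for (Span chosen : kept) {
                if (intersects(candidate, chosen)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                kept.add(candidate);
            }
        }

        List<String> texts = new ArrayList<String>();
        for (Span s : kept) {
            texts.add(sentence.substring(s.start, s.end));
        }
        return texts;
    }

    public static void main(String[] args) {
        String sentence = "King Richard I of England was known as the Lionheart.";
        List<String> phrases = Arrays.asList(
                "Lionheart", "Richard I", "England",
                "King Richard I", "Richard I of England");
        System.out.println(select(sentence, phrases));
    }
}
```

Because earlier-starting and then longer spans are visited first, this greedy pass favors them over later or nested spans. It makes no attempt to maximize the number of spans kept.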

NOTE: Again, this Comparator assumes that all of the Texts to be compared exist within the same Document. Attempting to use the TextComparator with Texts from different Documents will result in an IllegalArgumentException being thrown. Specifically, the Text.getDocument() method must return the exact same object for all compared spans.
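A minimal sketch of how such a same-document check might look is given below. Doc and Span are hypothetical stand-ins for Document and Text, and the use of reference identity (!=) is an assumption based on the "exact same object" wording above; the actual implementation may differ.

```java
import java.util.Comparator;

public class SameDocumentCheckSketch {

    /** Hypothetical stand-in for Document; only object identity matters here. */
    static final class Doc {
    }

    /** Hypothetical stand-in for Text: a span that remembers its document. */
    static final class Span {
        final Doc doc;
        final int start;
        final int end;

        Span(Doc doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }
    }

    /** Rejects cross-document comparisons, then applies the offset ordering. */
    static final Comparator<Span> ORDER = new Comparator<Span>() {
        @Override
        public int compare(Span a, Span b) {
            // Identity comparison: both spans must point at the same Doc object.
            if (a.doc != b.doc) {
                throw new IllegalArgumentException(
                        "Spans belong to different documents");
            }
            if (a.start != b.start) {
                return Integer.compare(a.start, b.start);
            }
            return Integer.compare(b.end, a.end);
        }
    };

    /** Returns true if comparing the two spans is rejected. */
    static boolean rejects(Span a, Span b) {
        try {
            ORDER.compare(a, b);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Doc first = new Doc();
        Doc second = new Doc();
        System.out.println(rejects(new Span(first, 0, 4), new Span(first, 5, 9)));
        System.out.println(rejects(new Span(first, 0, 4), new Span(second, 0, 4)));
    }
}
```

In practice this means a collection that mixes spans from several documents should be grouped by document before sorting.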

Since:
1.0
Author:
Kirk Roberts

Field Summary
static TextComparator INSTANCE
           
 
Constructor Summary
TextComparator()
           
 
Method Summary
 int compare(Text text1, Text text2)
          Compares the two Text objects according to the sorting algorithm described above.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.util.Comparator
equals
 

Field Detail

INSTANCE

public static final TextComparator INSTANCE
Constructor Detail

TextComparator

public TextComparator()
Method Detail

compare

public int compare(Text text1,
                   Text text2)
Compares the two Text objects according to the sorting algorithm described above.

Specified by:
compare in interface Comparator<Text>
Parameters:
text1 - The first Text to compare.
text2 - The second Text to compare.
Returns:
A number less than 0 if text1 should be ordered before text2, greater than 0 if it should be ordered after text2, or 0 if they are equivalent and their relative order is not important.
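
One practical consequence of a 0 result is worth illustrating. Sorted collections in java.util, such as TreeSet, treat two elements as duplicates whenever the comparator returns 0. The sketch below assumes that spans with identical start and end offsets compare as 0, which the ordering described above suggests but this page does not state outright; Span and distinctCount are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class ZeroResultSketch {

    /** Hypothetical stand-in for Text: a half-open character span [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Start offset ascending; ties broken by end offset descending. */
    static final Comparator<Span> ORDER = new Comparator<Span>() {
        @Override
        public int compare(Span a, Span b) {
            if (a.start != b.start) {
                return Integer.compare(a.start, b.start);
            }
            return Integer.compare(b.end, a.end);
        }
    };

    /** Counts how many spans survive insertion into a comparator-ordered set. */
    static int distinctCount(List<Span> spans) {
        TreeSet<Span> set = new TreeSet<Span>(ORDER);
        set.addAll(spans);
        return set.size();
    }

    public static void main(String[] args) {
        // Two distinct objects with the same offsets, plus one different span.
        List<Span> spans = Arrays.asList(
                new Span(5, 14), new Span(5, 14), new Span(0, 14));
        System.out.println(distinctCount(spans));
    }
}
```

If every span must be retained, sorting a List keeps spans that compare as equal, whereas a comparator-ordered set or map keeps only one of them.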


Copyright © 2009. All Rights Reserved.