com.languagecomputer.api.text
Class DefaultText

java.lang.Object
  extended by com.languagecomputer.api.text.DefaultText
All Implemented Interfaces:
Text
Direct Known Subclasses:
DefaultAnswer, DefaultAttribute, DefaultDocument, DefaultEntity, DefaultEvent, DefaultMentionChain, DefaultQuestionAnswerPair, DefaultSpatialSpan, DefaultTemporalSpan, GenericDefaultText

public class DefaultText
extends Object
implements Text

Default implementation of a Text. Alternatively, the Text interface may be implemented directly so that it is backed by a different mechanism (e.g., for speed and/or memory performance suited to the individual system).

Since:
1.0
Author:
Kirk Roberts

Constructor Summary
  DefaultText()
          Creates a new DefaultText.
protected DefaultText(AnnotationType annType)
          Subclass-only constructor to create a DefaultText with a specific AnnotationType.
 
Method Summary
 AnnotationType getAnnotationType()
          Returns the AnnotationType that describes this Text.
<T extends Text>
Collection<T>
getCongruentAnnotations(AnnotationType<T> type)
          Returns all the annotations congruent with this Text that correspond to the given AnnotationType.
 Document getDocument()
          Returns the Document in which this Text exists.
 String getDocumentID()
          Returns the Document ID that identifies this Text.
 int getEndCharOffset()
          Returns the (exclusive) end character offset for this Text object within the (processed) Document.
<T extends Text>
Collection<T>
getIntersectingAnnotations(AnnotationType<T> type)
          Returns all the annotations that intersect this Text and correspond to the given AnnotationType.
 String getRawString()
          Returns the raw String value that this Text spans.
 int getStartCharOffset()
          Returns the start character offset for this Text object within the (processed) Document.
<T extends Text>
Collection<T>
getSubAnnotations(AnnotationType<T> type)
          Returns all the annotations within this Text that correspond to the given AnnotationType.
<T extends Text>
Collection<T>
getSuperAnnotations(AnnotationType<T> type)
          Returns all the annotations that contain this Text as a subspan and correspond to the given AnnotationType.
 void setAnnotationType(AnnotationType annType)
          Sets the AnnotationType.
 void setDocument(Document document)
          Sets the Document without adding this DefaultText to it.
 void setEndCharOffset(int endCharOffset)
          Sets the end character offset.
 void setRawString(String rawString)
          Sets the raw string. Not necessary for non-Documents.
 void setStartCharOffset(int startCharOffset)
          Sets the start character offset.
 String toString()
          Returns a String representation of a DefaultText.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DefaultText

public DefaultText()
Creates a new DefaultText.


DefaultText

protected DefaultText(AnnotationType annType)
Subclass-only constructor to create a DefaultText with a specific AnnotationType. Use setAnnotationType(com.languagecomputer.api.text.AnnotationType) instead of this constructor for non-subclass constructor calls.

Parameters:
annType - AnnotationType to use for this subclass of DefaultText.
Method Detail

setAnnotationType

public void setAnnotationType(AnnotationType annType)
Sets the AnnotationType.

Parameters:
annType - AnnotationType to use for this DefaultText.
See Also:
Text.getAnnotationType(), AnnotationType

getAnnotationType

public AnnotationType getAnnotationType()
Returns the AnnotationType that describes this Text.

Specified by:
getAnnotationType in interface Text
Returns:
The AnnotationType of this Text.
See Also:
AnnotationType

getDocumentID

public String getDocumentID()
Returns the Document ID that identifies this Text.

Specified by:
getDocumentID in interface Text
Returns:
The ID for the Document that contains this Text.

setDocument

public void setDocument(Document document)
Sets the Document without adding this DefaultText to it. To attach this DefaultText to the Document, use DefaultDocument.addAnnotation(com.languagecomputer.api.text.Text) instead.

Parameters:
document - The Document to use for this DefaultText.
See Also:
Text.getDocument()

getDocument

public Document getDocument()
Returns the Document in which this Text exists.

Specified by:
getDocument in interface Text
Returns:
The Document that contains this Text.

setStartCharOffset

public void setStartCharOffset(int startCharOffset)
Sets the start character offset.

Parameters:
startCharOffset - The start character offset to use for this DefaultText.
See Also:
Text.getStartCharOffset()

getStartCharOffset

public int getStartCharOffset()
Returns the start character offset for this Text object within the (processed) Document. This offset does not necessarily line up with the start character offset from the unprocessed document/file.

Specified by:
getStartCharOffset in interface Text
Returns:
The inclusive start offset.

setEndCharOffset

public void setEndCharOffset(int endCharOffset)
Sets the end character offset.

Parameters:
endCharOffset - The end character offset to use for this DefaultText.
See Also:
Text.getEndCharOffset()

getEndCharOffset

public int getEndCharOffset()
Returns the (exclusive) end character offset for this Text object within the (processed) Document. This offset does not necessarily line up with the end character offset from the unprocessed document/file.

Specified by:
getEndCharOffset in interface Text
Returns:
The exclusive end offset.
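
The inclusive-start, exclusive-end convention described above is the same one used by java.lang.String.substring. The following is a minimal sketch of that convention only, assuming the processed document text is available as a plain String; the OffsetSketch class and its spanText method are hypothetical and are not part of this API.

```java
// Hypothetical illustration; not part of com.languagecomputer.api.text.
class OffsetSketch {

    /**
     * Returns the characters covered by the half-open range [start, end)
     * of the processed document text.
     */
    static String spanText(String processedText, int start, int end) {
        // String.substring follows the same convention:
        // start is inclusive, end is exclusive.
        return processedText.substring(start, end);
    }

    public static void main(String[] args) {
        String processedText = "Kirk wrote the API.";
        // The word "wrote" occupies indices 5 through 9,
        // so its exclusive end offset is 10.
        String span = spanText(processedText, 5, 10);
        System.out.println(span);
        // With a half-open range, the span length is simply end - start.
        System.out.println(span.length() == 10 - 5);
    }
}
```

Under this convention a span's length is end minus start, and two adjacent spans can share a boundary offset without overlapping.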

setRawString

public void setRawString(String rawString)
Sets the raw string. Not necessary for non-Documents. The start and end character offsets can be used to determine the raw string for a DefaultText. May be overridden for system flexibility.

Parameters:
rawString - The raw String to use for this DefaultText.
See Also:
Text.getRawString()

getRawString

public String getRawString()
Returns the raw String value that this Text spans. Includes whitespace (spaces, tabs, newlines, etc.) but not the original markup (HTML tags, PDF and MS Word markup, etc.). Might contain additional newlines along paragraph boundaries for some large documents.

Specified by:
getRawString in interface Text
Returns:
The raw string with no mark-up.

getCongruentAnnotations

public <T extends Text> Collection<T> getCongruentAnnotations(AnnotationType<T> type)
Returns all the annotations congruent with this Text that correspond to the given AnnotationType.

Specified by:
getCongruentAnnotations in interface Text
Type Parameters:
T - The Text sub-class corresponding to the given AnnotationType.
Parameters:
type - The AnnotationType that all returned items will match.
Returns:
A Collection of objects matching the given AnnotationType. The annotations will be in a semi-sorted order: returned objects that do not intersect one another will be sorted by their order in the document. No guarantee is made about the relative order of objects that intersect one another.
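
One ordering that satisfies the semi-sorted guarantee is a sort by start offset: any two non-empty spans that do not intersect come out in document order, while spans that intersect may appear in either relative order. The sketch below illustrates that idea only. The SemiSortedSketch class and its Span type are hypothetical stand-ins for Text, and nothing here describes how the library actually orders its results.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of com.languagecomputer.api.text.
class SemiSortedSketch {

    /** Minimal stand-in for a Text span with half-open offsets. */
    static final class Span {
        final int start; // inclusive
        final int end;   // exclusive

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Returns a copy of the given spans ordered by start offset. */
    static List<Span> byStartOffset(List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        // Non-intersecting, non-empty spans always have distinct start
        // offsets, so this comparator places them in document order.
        sorted.sort(Comparator.comparingInt((Span s) -> s.start));
        return sorted;
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(20, 25));
        spans.add(new Span(0, 4));
        spans.add(new Span(5, 12));
        System.out.println(byStartOffset(spans));
    }
}
```

Callers that need a total order over intersecting annotations should apply their own comparator rather than rely on the returned order.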

getSubAnnotations

public <T extends Text> Collection<T> getSubAnnotations(AnnotationType<T> type)
Returns all the annotations within this Text that correspond to the given AnnotationType.

Specified by:
getSubAnnotations in interface Text
Type Parameters:
T - The Text sub-class corresponding to the given AnnotationType.
Parameters:
type - The AnnotationType that all returned items will match.
Returns:
A Collection of objects matching the given AnnotationType. The annotations will be in a semi-sorted order: returned objects that do not intersect one another will be sorted by their order in the document. No guarantee is made about the relative order of objects that intersect one another.

getSuperAnnotations

public <T extends Text> Collection<T> getSuperAnnotations(AnnotationType<T> type)
Returns all the annotations that contain this Text as a subspan and correspond to the given AnnotationType.

Specified by:
getSuperAnnotations in interface Text
Type Parameters:
T - The Text sub-class corresponding to the given AnnotationType.
Parameters:
type - The AnnotationType that all returned items will match.
Returns:
A Collection of objects matching the given AnnotationType. The annotations will be in a semi-sorted order: returned objects that do not intersect one another will be sorted by their order in the document. No guarantee is made about the relative order of objects that intersect one another.

getIntersectingAnnotations

public <T extends Text> Collection<T> getIntersectingAnnotations(AnnotationType<T> type)
Returns all the annotations that intersect this Text and correspond to the given AnnotationType.

Specified by:
getIntersectingAnnotations in interface Text
Type Parameters:
T - The Text sub-class corresponding to the given AnnotationType.
Parameters:
type - The AnnotationType that all returned items will match.
Returns:
A Collection of objects matching the given AnnotationType. The annotations will be in a semi-sorted order: returned objects that do not intersect one another will be sorted by their order in the document. No guarantee is made about the relative order of objects that intersect one another.
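
The four span queries (congruent, sub, super, and intersecting) can be read as predicates over half-open offset ranges. The sketch below is one plausible interpretation, not the library's implementation. It assumes that "congruent" means identical start and end offsets, that a span counts as containing itself, and that two spans intersect when they share at least one character. The SpanRelations class and its methods are hypothetical.

```java
// Hypothetical illustration; not part of com.languagecomputer.api.text.
// All ranges are half-open: start inclusive, end exclusive.
class SpanRelations {

    /** Assumed meaning of "congruent": both spans cover exactly the same range. */
    static boolean congruent(int aStart, int aEnd, int bStart, int bEnd) {
        return aStart == bStart && aEnd == bEnd;
    }

    /**
     * True when [bStart, bEnd) lies entirely within [aStart, aEnd).
     * In that case b is a sub-span of a, and a is a super-span of b.
     */
    static boolean contains(int aStart, int aEnd, int bStart, int bEnd) {
        return aStart <= bStart && bEnd <= aEnd;
    }

    /** True when the two ranges share at least one character position. */
    static boolean intersects(int aStart, int aEnd, int bStart, int bEnd) {
        return aStart < bEnd && bStart < aEnd;
    }

    public static void main(String[] args) {
        // A sentence spanning [0, 20) and a token spanning [5, 10).
        System.out.println(contains(0, 20, 5, 10));   // sentence contains token
        System.out.println(contains(5, 10, 0, 20));   // token does not contain sentence
        System.out.println(intersects(0, 6, 5, 10));  // ranges share position 5
        System.out.println(intersects(0, 5, 5, 10));  // adjacent ranges do not overlap
    }
}
```

Because the end offset is exclusive, two spans that merely touch at a boundary (one ending where the other starts) do not intersect under this reading.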

toString

public String toString()
Returns a String representation of a DefaultText.

Overrides:
toString in class Object


Copyright © 2009. All Rights Reserved.