GATE
Version 3.1-2270

gate
Interface Document

All Superinterfaces:
Comparable, FeatureBearer, LanguageResource, NameBearer, Resource, Serializable, SimpleDocument
All Known Subinterfaces:
TextualDocument
All Known Implementing Classes:
DatabaseDocumentImpl, DocumentImpl

public interface Document
extends SimpleDocument

Represents the commonalities between all sorts of documents.


Field Summary
static String DOCUMENT_ENCODING_PARAMETER_NAME
           
static String DOCUMENT_END_OFFSET_PARAMETER_NAME
           
static String DOCUMENT_MARKUP_AWARE_PARAMETER_NAME
          The parameter name that determines whether or not a document is markup aware
static String DOCUMENT_PRESERVE_CONTENT_PARAMETER_NAME
           
static String DOCUMENT_REPOSITIONING_PARAMETER_NAME
           
static String DOCUMENT_START_OFFSET_PARAMETER_NAME
           
static String DOCUMENT_STRING_CONTENT_PARAMETER_NAME
           
 
Fields inherited from interface gate.SimpleDocument
DOCUMENT_URL_PARAMETER_NAME
 
Method Summary
 void addDocumentListener(DocumentListener l)
          Adds a DocumentListener to this document.
 void edit(Long start, Long end, DocumentContent replacement)
          Make changes to the content.
 Boolean getCollectRepositioningInfo()
          Get whether repositioning information is collected and preserved for the Document.
 Boolean getMarkupAware()
          Get the markup awareness status of the Document.
 Map getNamedAnnotationSets()
          Returns a map with the named annotation sets
 Boolean getPreserveOriginalContent()
          Get the preserving of content status of the Document.
 Long getSourceUrlEndOffset()
          Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document.
 Long[] getSourceUrlOffsets()
          Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document.
 Long getSourceUrlStartOffset()
          Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document.
 void removeDocumentListener(DocumentListener l)
          Removes one of the previously registered document listeners.
 void setCollectRepositioningInfo(Boolean b)
          Allow/disallow collecting of repositioning information.
 void setMarkupAware(Boolean b)
          Make the document markup-aware.
 void setPreserveOriginalContent(Boolean b)
          Allow/disallow preserving of the original document content.
 void setSourceUrlEndOffset(Long sourceUrlEndOffset)
          Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document.
 void setSourceUrlStartOffset(Long sourceUrlStartOffset)
          Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document.
 String toXml()
          Returns a GateXml document.
 String toXml(Set aSourceAnnotationSet)
          Equivalent to toXml(aSourceAnnotationSet, true).
 String toXml(Set aSourceAnnotationSet, boolean includeFeatures)
          Returns an XML document aiming to preserve the original markup (the original markup will be in the same place and format as it was before processing the document) and to include (if possible) the annotations specified in aSourceAnnotationSet.
 
Methods inherited from interface gate.SimpleDocument
getAnnotations, getAnnotations, getAnnotationSetNames, getContent, getSourceUrl, removeAnnotationSet, setContent, setSourceUrl
 
Methods inherited from interface gate.LanguageResource
getDataStore, getLRPersistenceId, getParent, isModified, setDataStore, setLRPersistenceId, setParent, sync
 
Methods inherited from interface gate.Resource
cleanup, getParameterValue, init, setParameterValue, setParameterValues
 
Methods inherited from interface gate.util.FeatureBearer
getFeatures, setFeatures
 
Methods inherited from interface gate.util.NameBearer
getName, setName
 
Methods inherited from interface java.lang.Comparable
compareTo
 

Field Detail

DOCUMENT_MARKUP_AWARE_PARAMETER_NAME

static final String DOCUMENT_MARKUP_AWARE_PARAMETER_NAME
The parameter name that determines whether or not a document is markup aware

See Also:
Constant Field Values

DOCUMENT_ENCODING_PARAMETER_NAME

static final String DOCUMENT_ENCODING_PARAMETER_NAME
See Also:
Constant Field Values

DOCUMENT_PRESERVE_CONTENT_PARAMETER_NAME

static final String DOCUMENT_PRESERVE_CONTENT_PARAMETER_NAME
See Also:
Constant Field Values

DOCUMENT_STRING_CONTENT_PARAMETER_NAME

static final String DOCUMENT_STRING_CONTENT_PARAMETER_NAME
See Also:
Constant Field Values

DOCUMENT_REPOSITIONING_PARAMETER_NAME

static final String DOCUMENT_REPOSITIONING_PARAMETER_NAME
See Also:
Constant Field Values

DOCUMENT_START_OFFSET_PARAMETER_NAME

static final String DOCUMENT_START_OFFSET_PARAMETER_NAME
See Also:
Constant Field Values

DOCUMENT_END_OFFSET_PARAMETER_NAME

static final String DOCUMENT_END_OFFSET_PARAMETER_NAME
See Also:
Constant Field Values
Method Detail

getSourceUrlOffsets

Long[] getSourceUrlOffsets()
Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document.


getSourceUrlStartOffset

Long getSourceUrlStartOffset()
Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document. This method gets the start offset.


getSourceUrlEndOffset

Long getSourceUrlEndOffset()
Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document. This method gets the end offset.
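
As a rough illustration of how such an offset pair might be used, the sketch below cuts one document out of packed content. PackedSlice and slice are hypothetical names, not part of GATE, and the sketch assumes the offsets are character positions with an exclusive end, which this page does not state.

```java
// Hypothetical stand-in for illustration only; not part of the GATE API.
final class PackedSlice {

    /**
     * Returns the part of packedContent between the two offsets.
     * Assumption: offsets count characters and the end offset is exclusive.
     * Because the offsets are optional, a null offset means "use everything".
     */
    static String slice(String packedContent, Long startOffset, Long endOffset) {
        if (startOffset == null || endOffset == null) {
            return packedContent;
        }
        int start = Math.toIntExact(startOffset);
        int end = Math.toIntExact(endOffset);
        if (start < 0 || end < start || end > packedContent.length()) {
            throw new IllegalArgumentException("offsets out of range: " + start + ".." + end);
        }
        return packedContent.substring(start, end);
    }
}
```

With real GATE documents the two values would come from getSourceUrlStartOffset() and getSourceUrlEndOffset(), or together from getSourceUrlOffsets().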


getNamedAnnotationSets

Map getNamedAnnotationSets()
Returns a map with the named annotation sets


setMarkupAware

void setMarkupAware(Boolean b)
Make the document markup-aware. This will trigger the creation of a DocumentFormat object at Document initialisation time; the DocumentFormat object will unpack the markup in the Document and add it as annotations. Documents are not markup-aware by default.

Parameters:
b - markup awareness status.

getMarkupAware

Boolean getMarkupAware()
Get the markup awareness status of the Document.

Returns:
whether the Document is markup aware.

setPreserveOriginalContent

void setPreserveOriginalContent(Boolean b)
Allow/disallow preserving of the original document content. If true, the original content will be retrieved from the DocumentContent object and preserved as a document feature.


getPreserveOriginalContent

Boolean getPreserveOriginalContent()
Get the preserving of content status of the Document.

Returns:
whether the Document should preserve its original content.

setCollectRepositioningInfo

void setCollectRepositioningInfo(Boolean b)
Allow/disallow collecting of repositioning information. If true, the information will be retrieved and preserved as a document feature.
Preserved repositioning information makes it possible to convert coordinates between the original document content and the text extracted from the document.
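
To make the idea of coordinate conversion concrete, here is a minimal sketch that strips simple tags from a string and remembers where each kept character came from. RepositioningSketch and its methods are hypothetical names; GATE's own repositioning data structures are not described on this page and may work quite differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not a GATE class.
final class RepositioningSketch {
    private final StringBuilder extracted = new StringBuilder();
    // originalIndex.get(i) is the position in the original content
    // of the i-th character of the extracted text.
    private final List<Integer> originalIndex = new ArrayList<>();

    /** Strips <...> tags from original, recording the origin of every kept character. */
    RepositioningSketch(String original) {
        boolean insideTag = false;
        for (int i = 0; i < original.length(); i++) {
            char c = original.charAt(i);
            if (c == '<') {
                insideTag = true;
            } else if (c == '>' && insideTag) {
                insideTag = false;
            } else if (!insideTag) {
                extracted.append(c);
                originalIndex.add(i);
            }
        }
    }

    String extractedText() {
        return extracted.toString();
    }

    /** Converts an offset in the extracted text to the matching offset in the original content. */
    long toOriginalOffset(long extractedOffset) {
        return originalIndex.get(Math.toIntExact(extractedOffset));
    }
}
```

Storing one index per character keeps the sketch short; a real implementation would more likely store a compact list of shifted regions.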


getCollectRepositioningInfo

Boolean getCollectRepositioningInfo()
Get whether repositioning information is collected and preserved for the Document.
Preserved repositioning information makes it possible to convert coordinates between the original document content and the text extracted from the document.

Returns:
whether the Document should collect and preserve repositioning information.

toXml

String toXml()
Returns a GateXml document. This document is actually a serialization of a Gate Document in XML.

Returns:
a string representing a Gate Xml document

toXml

String toXml(Set aSourceAnnotationSet,
             boolean includeFeatures)
Returns an XML document aiming to preserve the original markup (the original markup will be in the same place and format as it was before processing the document) and to include (if possible) the annotations specified in aSourceAnnotationSet. Warning: Annotations from aSourceAnnotationSet will be lost if they would cause a crossed-over situation.

Parameters:
aSourceAnnotationSet - an annotation set containing all the annotations that will be combined with the original markup set.
includeFeatures - determines whether features and GATE IDs of the annotations should be included as attributes on the tags. If false, only the annotation types are exported as tags, with no attributes.
Returns:
a string representing an XML document containing the original markup plus the dumped annotations from aSourceAnnotationSet
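
This page does not define "crossed over" precisely. One common reading is that two spans cross when they partially overlap, so that neither can be nested inside the other as an XML element. The sketch below tests for that condition; Span and crosses are hypothetical names and this is not GATE's actual check.

```java
// Hypothetical stand-in for an annotation's start and end offsets; not a GATE class.
final class Span {
    final long start;
    final long end; // exclusive

    Span(long start, long end) {
        this.start = start;
        this.end = end;
    }

    /** True when the spans partially overlap: they share text, but neither contains the other. */
    boolean crosses(Span other) {
        boolean overlap = start < other.end && other.start < end;
        boolean thisContainsOther = start <= other.start && other.end <= end;
        boolean otherContainsThis = other.start <= start && end <= other.end;
        return overlap && !thisContainsOther && !otherContainsThis;
    }
}
```

Under this reading, an annotation over offsets 5 to 15 crosses original markup over offsets 0 to 10, while an annotation over offsets 2 to 5 nests inside it and could be kept.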

toXml

String toXml(Set aSourceAnnotationSet)
Equivalent to toXml(aSourceAnnotationSet, true).


edit

void edit(Long start,
          Long end,
          DocumentContent replacement)
          throws InvalidOffsetException
Make changes to the content.

Throws:
InvalidOffsetException
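
The page gives no further detail on how an edit is applied. As a sketch only, the following shows one plausible meaning of replacing the range between start and end, with a range check standing in for the InvalidOffsetException case. EditSketch and BadOffsetException are hypothetical names; the stand-in exception is unchecked purely to keep the example short, and treating a null replacement as a deletion is an assumption of this sketch.

```java
// Hypothetical stand-ins; gate.Document and InvalidOffsetException are not used here.
final class EditSketch {

    /** Stand-in for an invalid-offset error (unchecked only for brevity). */
    static final class BadOffsetException extends RuntimeException {
        BadOffsetException(String message) {
            super(message);
        }
    }

    /**
     * Replaces the characters of content from start (inclusive) to end (exclusive).
     * Assumption: a null replacement deletes the range.
     */
    static String edit(String content, Long start, Long end, String replacement) {
        if (start == null || end == null
                || start < 0 || end < start || end > content.length()) {
            throw new BadOffsetException("invalid offsets: " + start + ", " + end);
        }
        String middle = (replacement == null) ? "" : replacement;
        return content.substring(0, start.intValue()) + middle + content.substring(end.intValue());
    }
}
```

A real implementation would also have to adjust the offsets of existing annotations after the change, which this sketch ignores.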

addDocumentListener

void addDocumentListener(DocumentListener l)
Adds a DocumentListener to this document. All registered listeners will be notified of changes that occur to the document.


removeDocumentListener

void removeDocumentListener(DocumentListener l)
Removes one of the previously registered document listeners.
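
The add/remove pair follows the usual listener pattern. The sketch below shows that pattern with stand-in types: ChangeListener, ObservableText, and the contentEdited callback are hypothetical, and the callbacks actually declared by GATE's DocumentListener are not listed on this page.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for a listener interface; not gate.event.DocumentListener.
interface ChangeListener {
    void contentEdited(String newContent);
}

// Hypothetical stand-in for a document that notifies registered listeners.
final class ObservableText {
    // CopyOnWriteArrayList lets a listener remove itself during notification.
    private final List<ChangeListener> listeners = new CopyOnWriteArrayList<>();
    private String content = "";

    void addListener(ChangeListener l) {
        listeners.add(l);
    }

    void removeListener(ChangeListener l) {
        listeners.remove(l);
    }

    String getContent() {
        return content;
    }

    /** Stores the new content, then notifies every registered listener. */
    void setContent(String newContent) {
        content = newContent;
        for (ChangeListener l : listeners) {
            l.contentEdited(newContent);
        }
    }
}
```

A listener that has been removed receives no further notifications, which is the behaviour the two methods above describe.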


setSourceUrlEndOffset

void setSourceUrlEndOffset(Long sourceUrlEndOffset)
Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document. This method sets the end offset.


setSourceUrlStartOffset

void setSourceUrlStartOffset(Long sourceUrlStartOffset)
Documents may be packed within files; in this case an optional pair of offsets refers to the location of the document. This method sets the start offset.

