GATE
Version 3.1-2270

gate.corpora
Class MSWordDocumentFormat

java.lang.Object
  extended by gate.util.AbstractFeatureBearer
      extended by gate.creole.AbstractResource
          extended by gate.creole.AbstractLanguageResource
              extended by gate.DocumentFormat
                  extended by gate.corpora.MSWordDocumentFormat
All Implemented Interfaces:
LanguageResource, Resource, FeatureBearer, NameBearer, Serializable

public class MSWordDocumentFormat
extends DocumentFormat

See Also:
Serialized Form

Field Summary
 
Fields inherited from class gate.DocumentFormat
element2StringMap, isGateXmlDocument, magic2mimeTypeMap, markupElementsMap, mimeString2ClassHandlerMap, mimeString2mimeTypeMap, suffixes2mimeTypeMap
 
Fields inherited from class gate.creole.AbstractLanguageResource
dataStore, lrPersistentId
 
Fields inherited from class gate.creole.AbstractResource
name
 
Constructor Summary
MSWordDocumentFormat()
           
 
Method Summary
 Resource init()
          Initialise this resource, and return it.
 Boolean supportsRepositioning()
          The MSWord Document Format does not support repositioning info.
 void unpackMarkup(Document doc)
          Unpack the markup in the document.
 void unpackMarkup(Document doc, RepositioningInfo repInfo, RepositioningInfo ampCodingInfo)
           
 
Methods inherited from class gate.DocumentFormat
addStatusListener, areEqual, decideBetweenThreeMimeTypes, decideBetweenTwoMimeTypes, fireStatusChanged, getDocumentFormat, getDocumentFormat, getDocumentFormat, getElement2StringMap, getFeatures, getMarkupElementsMap, getMimeType, getShouldCollectRepositioning, guessTypeUsingMagicNumbers, removeStatusListener, runMagicNumbers, setElement2StringMap, setFeatures, setMarkupElementsMap, setMimeType, setShouldCollectRepositioning, unpackMarkup
 
Methods inherited from class gate.creole.AbstractLanguageResource
cleanup, getDataStore, getLRPersistenceId, getParent, isModified, setDataStore, setLRPersistenceId, setParent, sync
 
Methods inherited from class gate.creole.AbstractResource
checkParameterValues, getBeanInfo, getName, getParameterValue, getParameterValue, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gate.LanguageResource
getDataStore, getLRPersistenceId, getParent, isModified, setDataStore, setLRPersistenceId, setParent, sync
 
Methods inherited from interface gate.Resource
cleanup, getParameterValue, setParameterValue, setParameterValues
 
Methods inherited from interface gate.util.NameBearer
getName, setName
 

Constructor Detail

MSWordDocumentFormat

public MSWordDocumentFormat()
Method Detail

init

public Resource init()
              throws ResourceInstantiationException
Initialise this resource, and return it. Registers this format unpacker with the system.

Specified by:
init in interface Resource
Overrides:
init in class AbstractResource
Throws:
ResourceInstantiationException

supportsRepositioning

public Boolean supportsRepositioning()
The MSWord Document Format does not support repositioning info.

Overrides:
supportsRepositioning in class DocumentFormat
Returns:
false.

unpackMarkup

public void unpackMarkup(Document doc)
                  throws DocumentFormatException
Unpack the markup in the document. This converts markup from the native format (e.g. XML, RTF) into annotations in GATE format. Uses the markupElementsMap to determine which elements to convert, and what annotation type names to use.

Specified by:
unpackMarkup in class DocumentFormat
Throws:
DocumentFormatException

unpackMarkup

public void unpackMarkup(Document doc,
                         RepositioningInfo repInfo,
                         RepositioningInfo ampCodingInfo)
                  throws DocumentFormatException
Specified by:
unpackMarkup in class DocumentFormat
Throws:
DocumentFormatException

GATE
Version 3.1-2270