GATE
Version 3.1-2270

gate.creole.gazetteer
Class FlexibleGazetteer

java.lang.Object
  extended by gate.util.AbstractFeatureBearer
      extended by gate.creole.AbstractResource
          extended by gate.creole.AbstractProcessingResource
              extended by gate.creole.AbstractLanguageAnalyser
                  extended by gate.creole.gazetteer.FlexibleGazetteer
All Implemented Interfaces:
ANNIEConstants, Executable, LanguageAnalyser, ProcessingResource, Resource, FeatureBearer, NameBearer, Serializable

public class FlexibleGazetteer
extends AbstractLanguageAnalyser
implements ProcessingResource

Title: Flexible Gazetteer

The Flexible Gazetteer provides users with the flexibility to choose their own customized input and an external Gazetteer. For example, the user might want to replace words in the text with their base forms (which is an output of the Morphological Analyser) or to segment a Chinese text (using the Chinese Tokeniser) before running the Gazetteer on the Chinese text.

The Flexible Gazetteer performs lookup over a document based on the values of an arbitrary feature of an arbitrary annotation type, by using an externally provided gazetteer. It is important to use an external gazetteer as this allows the use of any type of gazetteer (e.g. an Ontological gazetteer).
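
The idea described above can be illustrated with a small, self-contained sketch. It is not the GATE implementation and uses no GATE classes: FlexibleLookupSketch, Token, Lookup and the helper token(...) are hypothetical names introduced for illustration, and the gazetteer is reduced to a plain list of strings. The sketch replaces each token with the value of a chosen feature, searches the replaced text, and maps every hit back to offsets in the original text. It assumes tokens are sorted and non-overlapping, and it keeps only hits that align with token boundaries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model: none of these types are GATE classes.
final class FlexibleLookupSketch {

    /** A token span in the original text, with features such as "string" or "root". */
    static final class Token {
        final int start, end;
        final Map<String, String> features;
        Token(int start, int end, Map<String, String> features) {
            this.start = start; this.end = end; this.features = features;
        }
    }

    /** A gazetteer hit, expressed in offsets of the original text. */
    static final class Lookup {
        final int start, end;
        final String entry;
        Lookup(int start, int end, String entry) {
            this.start = start; this.end = end; this.entry = entry;
        }
    }

    /** Convenience factory for a token that carries only a "root" feature. */
    static Token token(int start, int end, String root) {
        Map<String, String> features = new HashMap<>();
        features.put("root", root);
        return new Token(start, end, features);
    }

    /**
     * Replaces each token with the value of the given feature, searches the
     * replaced text for the entries, and maps hits back to original offsets.
     * Tokens are assumed to be sorted by start offset and non-overlapping.
     */
    static List<Lookup> lookup(String text, List<Token> tokens,
                               String feature, List<String> entries) {
        StringBuilder replaced = new StringBuilder();
        int[] newStart = new int[tokens.size()];
        int[] newEnd = new int[tokens.size()];
        int cursor = 0;
        for (int i = 0; i < tokens.size(); i++) {
            Token t = tokens.get(i);
            replaced.append(text, cursor, t.start);   // copy text between tokens
            String value = t.features.get(feature);
            if (value == null) {
                value = text.substring(t.start, t.end); // fall back to the surface string
            }
            newStart[i] = replaced.length();
            replaced.append(value);
            newEnd[i] = replaced.length();
            cursor = t.end;
        }
        replaced.append(text, cursor, text.length());

        List<Lookup> hits = new ArrayList<>();
        for (String entry : entries) {
            if (entry.isEmpty()) {
                continue; // an empty entry would match everywhere
            }
            for (int at = replaced.indexOf(entry); at >= 0;
                 at = replaced.indexOf(entry, at + 1)) {
                int first = indexOf(newStart, at);
                int last = indexOf(newEnd, at + entry.length());
                // Keep only matches that begin and end on token boundaries.
                if (first >= 0 && last >= first) {
                    hits.add(new Lookup(tokens.get(first).start,
                                        tokens.get(last).end, entry));
                }
            }
        }
        return hits;
    }

    private static int indexOf(int[] values, int wanted) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == wanted) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "The dogs were barking";
        List<Token> tokens = Arrays.asList(
            token(0, 3, "the"), token(4, 8, "dog"),
            token(9, 13, "be"), token(14, 21, "bark"));
        for (Lookup hit : lookup(text, tokens, "root", Arrays.asList("dog", "bark"))) {
            System.out.println(hit.entry + " -> " + text.substring(hit.start, hit.end));
        }
    }
}
```

In this sketch, looking up "dog" against the "root" feature marks the original span "dogs", whereas the same lookup against the surface strings finds nothing, because "dog" does not end on a token boundary there.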

Version:
1.0
Author:
Niraj Aswani
See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class gate.creole.AbstractProcessingResource
AbstractProcessingResource.InternalStatusListener, AbstractProcessingResource.IntervalProgressListener
 
Field Summary
 
Fields inherited from class gate.creole.AbstractLanguageAnalyser
corpus
 
Fields inherited from class gate.creole.AbstractProcessingResource
interrupted
 
Fields inherited from class gate.creole.AbstractResource
name
 
Fields inherited from class gate.util.AbstractFeatureBearer
features
 
Fields inherited from interface gate.creole.ANNIEConstants
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DATE_POSTED_ANNOTATION_TYPE, DOCUMENT_COREF_FEATURE_NAME, JOB_ID_ANNOTATION_TYPE, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PR_NAMES, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME
 
Constructor Summary
FlexibleGazetteer()
          Constructor
 
Method Summary
 void execute()
          This method runs the gazetteer.
 Document getDocument()
          Returns the document that the user has set for processing
 Gazetteer getGazetteerInst()
           
 String getInputAnnotationSetName()
          Returns the inputAnnotationSetName
 List getInputFeatureNames()
          Returns the feature names supplied by the user; the values of these features are used in place of the corresponding strings in the document
 String getOutputAnnotationSetName()
          Returns the outputAnnotationSetName
 Iterator getTokenIterator(Document doc, String annotationSetName)
          Takes the document and the annotationSetName and creates an iterator over the annotations available in the document under the provided annotationSetName
 Resource init()
          Does the actual loading and parsing of the lists.
 void setDocument(Document doc)
          Sets the document to work on
 void setGazetteerInst(Gazetteer gazetteerInst)
           
 void setInputAnnotationSetName(String annName)
          Sets the inputAnnotationSetName
 void setInputFeatureNames(List inputs)
          Feature names, for example Token.string, Token.root, etc.
 void setOutputAnnotationSetName(String annName)
          Sets the outputAnnotationSetName
 
Methods inherited from class gate.creole.AbstractLanguageAnalyser
getCorpus, setCorpus
 
Methods inherited from class gate.creole.AbstractProcessingResource
addProgressListener, addStatusListener, cleanup, fireProcessFinished, fireProgressChanged, fireStatusChanged, interrupt, isInterrupted, reInit, removeProgressListener, removeStatusListener
 
Methods inherited from class gate.creole.AbstractResource
checkParameterValues, getBeanInfo, getName, getParameterValue, getParameterValue, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners
 
Methods inherited from class gate.util.AbstractFeatureBearer
getFeatures, setFeatures
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gate.ProcessingResource
reInit
 
Methods inherited from interface gate.Resource
cleanup, getParameterValue, setParameterValue, setParameterValues
 
Methods inherited from interface gate.util.FeatureBearer
getFeatures, setFeatures
 
Methods inherited from interface gate.util.NameBearer
getName, setName
 
Methods inherited from interface gate.Executable
interrupt, isInterrupted
 

Constructor Detail

FlexibleGazetteer

public FlexibleGazetteer()
Constructor

Method Detail

init

public Resource init()
              throws ResourceInstantiationException
Does the actual loading and parsing of the lists. This method must be called before the gazetteer can be used.

Specified by:
init in interface Resource
Overrides:
init in class AbstractProcessingResource
Throws:
ResourceInstantiationException

execute

public void execute()
             throws ExecutionException
This method runs the gazetteer. It assumes that all the required parameters are set. If they are not, an ExecutionException is thrown.

Specified by:
execute in interface Executable
Overrides:
execute in class AbstractProcessingResource
Throws:
ExecutionException
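
The fail-fast behaviour described for execute() can be sketched as follows. This is an illustration only: ExecuteCheckSketch and SketchExecutionException are hypothetical stand-ins, not GATE classes, and which parameters the real class treats as required is not stated on this page. The sketch assumes that a document, a gazetteer and a list of input feature names must all be present.

```java
import java.util.List;

// Hypothetical stand-in; it only mirrors the "check parameters, then run" pattern.
final class ExecuteCheckSketch {

    /** Stand-in for a checked execution failure; not GATE's ExecutionException. */
    static final class SketchExecutionException extends Exception {
        SketchExecutionException(String message) {
            super(message);
        }
    }

    Object document;                 // stand-in for the document parameter
    Object gazetteer;                // stand-in for the external gazetteer
    List<String> inputFeatureNames;  // e.g. "Token.root"

    /** Validates the assumed required parameters before doing any work. */
    void execute() throws SketchExecutionException {
        if (document == null) {
            throw new SketchExecutionException("No document to process");
        }
        if (gazetteer == null) {
            throw new SketchExecutionException("No gazetteer provided");
        }
        if (inputFeatureNames == null || inputFeatureNames.isEmpty()) {
            throw new SketchExecutionException("No input feature names provided");
        }
        // The actual lookup would run here.
    }
}
```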

setDocument

public void setDocument(Document doc)
Sets the document to work on

Specified by:
setDocument in interface LanguageAnalyser
Overrides:
setDocument in class AbstractLanguageAnalyser
Parameters:
doc - the document to work on

getDocument

public Document getDocument()
Returns the document that the user has set for processing

Specified by:
getDocument in interface LanguageAnalyser
Overrides:
getDocument in class AbstractLanguageAnalyser
Returns:
a Document

setOutputAnnotationSetName

public void setOutputAnnotationSetName(String annName)
Sets the outputAnnotationSetName

Parameters:
annName - the name of the output annotation set

getOutputAnnotationSetName

public String getOutputAnnotationSetName()
Returns the outputAnnotationSetName

Returns:
a String value.

setInputAnnotationSetName

public void setInputAnnotationSetName(String annName)
Sets the inputAnnotationSetName

Parameters:
annName - the name of the input annotation set

getInputAnnotationSetName

public String getInputAnnotationSetName()
Returns the inputAnnotationSetName

Returns:
a String value.

setInputFeatureNames

public void setInputFeatureNames(List inputs)
Sets the names of the features, for example Token.string or Token.root, whose values are used in place of the actual strings of the corresponding annotations in the document.

Parameters:
inputs - the feature names, for example Token.string or Token.root
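
The examples Token.string and Token.root suggest names of the form AnnotationType.featureName. A minimal sketch of splitting such a name is shown below. FeatureNameSketch is a hypothetical helper, not part of GATE, and splitting at the first dot is an assumption about the format rather than documented behaviour.

```java
// Hypothetical helper; the "Type.feature" format is inferred from the examples above.
final class FeatureNameSketch {

    /** Splits a name such as "Token.root" into {"Token", "root"}. */
    static String[] split(String qualifiedName) {
        int dot = qualifiedName.indexOf('.');
        if (dot <= 0 || dot == qualifiedName.length() - 1) {
            throw new IllegalArgumentException(
                "Expected AnnotationType.featureName but got: " + qualifiedName);
        }
        return new String[] {
            qualifiedName.substring(0, dot),   // annotation type, e.g. "Token"
            qualifiedName.substring(dot + 1)   // feature name, e.g. "root"
        };
    }
}
```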

getInputFeatureNames

public List getInputFeatureNames()
Returns the feature names supplied by the user; the values of these features are used in place of the corresponding strings in the document

Returns:
a List value.

getGazetteerInst

public Gazetteer getGazetteerInst()

setGazetteerInst

public void setGazetteerInst(Gazetteer gazetteerInst)

getTokenIterator

public Iterator getTokenIterator(Document doc,
                                 String annotationSetName)
Takes the document and the annotationSetName and creates an iterator over the annotations available in the document under the provided annotationSetName

Parameters:
doc - the document whose annotations are iterated
annotationSetName - the name of the annotation set to read from
Returns:
an Iterator
