GATE
Version 3.1-2270

gate.creole.ml.maxent
Class MaxentWrapper

java.lang.Object
  extended by gate.creole.ml.maxent.MaxentWrapper
All Implemented Interfaces:
AdvancedMLEngine, MLEngine, ActionsPublisher

public class MaxentWrapper
extends Object
implements AdvancedMLEngine, ActionsPublisher

Wrapper class for the Maxent machine learning algorithm.

See Also:
Maxent homepage

Nested Class Summary
protected  class MaxentWrapper.LoadModelAction
          This reloads a file that was previously saved using the SaveModelAction class.
protected  class MaxentWrapper.SaveModelAction
          This allows the model, including its parameters, to be saved to a file.
 
Field Summary
protected  List actionsList
           
protected  double confidenceThreshold
           
protected  int cutoff
          The following members are set by the options part of the config file, and control the parameters used for training the model and for classifying instances.
protected  boolean datasetChanged
          Marks whether the dataset was changed since the last time the classifier was built.
protected  DatasetDefintion datasetDefinition
           
protected  int iterations
           
protected  opennlp.maxent.MaxentModel maxentClassifier
          The Maxent classifier used by this wrapper
protected  org.jdom.Element optionsElement
          The JDom element containing the options for this wrapper.
protected  ProcessingResource owner
           
protected  StatusListener sListener
           
protected  boolean smoothing
           
protected  double smoothingObservation
           
protected  List trainingData
          This List stores all the data that has been collected.
protected  boolean verbose
           
 
Constructor Summary
MaxentWrapper()
          This constructor sets up the action list so that these actions (loading and saving models and data) will be available from a context menu in the GUI.
 
Method Summary
 void addTrainingInstance(List attributeValues)
          This is called to add a new training instance to the data set collected in this wrapper object.
 List batchClassifyInstances(List instances)
          Some wrappers allow batch classification, but this one does not, so any call to this method informs the user by throwing an exception.
 Object classifyInstance(List attributeValues)
          Decide on the outcome for the instance, based on the values of all the maxent features.
 void cleanUp()
          No clean-up is needed for this wrapper; the method exists only because the interface requires it.
 List getActions()
          Gets the list of actions that can be performed on this resource.
 DatasetDefintion getDatasetDefinition()
           
 void init()
          Initialises the classifier and prepares for running.
 void load(InputStream is)
          Loads the state of this engine from previously saved data.
 void save(OutputStream os)
          Saves the state of the engine for reuse at a later time.
 void setDatasetDefinition(DatasetDefintion definition)
          Sets the data set definition for this classifier.
 void setOptions(org.jdom.Element optionsElem)
          Takes a representation of the part of the XML configuration file that holds the options, and stores it.
 void setOwnerPR(ProcessingResource pr)
          Registers the PR using the engine with the engine itself.
 boolean supportsBatchMode()
          Returns true if the engine supports BatchMode, returns false otherwise.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

datasetDefinition

protected DatasetDefintion datasetDefinition

maxentClassifier

protected opennlp.maxent.MaxentModel maxentClassifier
The Maxent classifier used by this wrapper


trainingData

protected List trainingData
This List stores all the data that has been collected. Each item is a List of Strings, each of which is an attribute value. In maxent terms, these are the features and the outcome; the position of the outcome can be found by referring to the datasetDefinition object.
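
The row layout described above can be pictured with a small sketch. It does not use GATE classes: `TrainingRowSketch` and its methods are hypothetical helpers, and the outcome position is passed in as a plain index, whereas the wrapper obtains it from the dataset definition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in, not part of GATE: splits one training row into
// maxent-style features and an outcome.
class TrainingRowSketch {

  // Returns every attribute value except the one at outcomeIndex.
  static List<String> features(List<String> row, int outcomeIndex) {
    List<String> result = new ArrayList<String>(row);
    result.remove(outcomeIndex); // int argument, so this removes by position
    return result;
  }

  // Returns the attribute value stored at outcomeIndex.
  static String outcome(List<String> row, int outcomeIndex) {
    return row.get(outcomeIndex);
  }

  public static void main(String[] args) {
    // Hypothetical row: two feature values followed by the outcome.
    List<String> row = Arrays.asList("Token", "true", "Person");
    int outcomeIndex = 2; // assumed here; GATE takes it from the dataset definition
    System.out.println(features(row, outcomeIndex) + " -> " + outcome(row, outcomeIndex));
  }
}
```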


optionsElement

protected org.jdom.Element optionsElement
The JDom element containing the options for this wrapper.


datasetChanged

protected boolean datasetChanged
Marks whether the dataset was changed since the last time the classifier was built.


actionsList

protected List actionsList

owner

protected ProcessingResource owner

sListener

protected StatusListener sListener

cutoff

protected int cutoff
The following members are set by the options part of the config file, and control the parameters used for training the model and for classifying instances. They are initialised with their default values, but may be changed when setOptions is called.
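
The "defaults that setOptions may override" pattern might look roughly like the sketch below. It is not GATE code: `OptionsSketch` is hypothetical, a `Map` stands in for the JDom element, the option names are assumed, and the default values are placeholders rather than the wrapper's real defaults.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in, not GATE code: fields start at defaults and are
// overridden only by options that are actually present.
class OptionsSketch {
  // Placeholder defaults chosen for this sketch; not the wrapper's real values.
  int cutoff = 0;
  int iterations = 10;
  boolean verbose = false;

  // Hypothetical counterpart of setOptions, reading from a Map instead of JDom.
  void applyOptions(Map<String, String> options) {
    String value = options.get("cutoff");
    if (value != null) {
      cutoff = Integer.parseInt(value);
    }
    value = options.get("iterations");
    if (value != null) {
      iterations = Integer.parseInt(value);
    }
    value = options.get("verbose");
    if (value != null) {
      verbose = Boolean.parseBoolean(value);
    }
  }

  public static void main(String[] args) {
    Map<String, String> options = new HashMap<String, String>();
    options.put("iterations", "50"); // only this option is supplied
    OptionsSketch sketch = new OptionsSketch();
    sketch.applyOptions(options);
    System.out.println(sketch.cutoff + " " + sketch.iterations + " " + sketch.verbose);
  }
}
```

Options that are absent leave the corresponding field at its initial value, which matches the behaviour described for these members.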


confidenceThreshold

protected double confidenceThreshold

iterations

protected int iterations

verbose

protected boolean verbose

smoothing

protected boolean smoothing

smoothingObservation

protected double smoothingObservation

Constructor Detail

MaxentWrapper

public MaxentWrapper()
This constructor sets up the action list so that these actions (loading and saving models and data) will be available from a context menu in the GUI. There is no option to load or save data sets, as maxent does not support this. If data sets need to be saved, this can be done using weka.wrapper instead.

Method Detail

cleanUp

public void cleanUp()
No clean-up is needed for this wrapper; the method exists only because the interface requires it.

Specified by:
cleanUp in interface MLEngine

batchClassifyInstances

public List batchClassifyInstances(List instances)
                            throws ExecutionException
Some wrappers allow batch classification, but this one does not, so any call to this method informs the user by throwing an exception.

Specified by:
batchClassifyInstances in interface MLEngine
Parameters:
instances - This parameter is not used.
Returns:
Nothing is ever returned - an exception is always thrown.
Throws:
ExecutionException
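
Because this method always throws, calling code would typically check supportsBatchMode() first and fall back to classifying one instance at a time. The sketch below illustrates that guard with hypothetical stand-ins (`EngineSketch`, `NonBatchEngineSketch`, `BatchFallbackSketch`); it is not the GATE interface, and it uses an unchecked exception where the real method throws ExecutionException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for the engine interface; not the GATE API.
interface EngineSketch {
  boolean supportsBatchMode();
  List<Object> batchClassifyInstances(List<List<String>> instances);
  Object classifyInstance(List<String> attributeValues);
}

// A stand-in engine without batch support, mirroring the behaviour described above.
class NonBatchEngineSketch implements EngineSketch {
  public boolean supportsBatchMode() {
    return false;
  }

  public List<Object> batchClassifyInstances(List<List<String>> instances) {
    throw new UnsupportedOperationException("batch classification is not supported");
  }

  public Object classifyInstance(List<String> attributeValues) {
    // Placeholder result; a real engine would consult its model here.
    return "outcome-for-" + attributeValues.get(0);
  }
}

class BatchFallbackSketch {
  // Uses batch mode when available, otherwise classifies instance by instance.
  static List<Object> classifyAll(EngineSketch engine, List<List<String>> instances) {
    if (engine.supportsBatchMode()) {
      return engine.batchClassifyInstances(instances);
    }
    List<Object> results = new ArrayList<Object>();
    for (List<String> instance : instances) {
      results.add(engine.classifyInstance(instance));
    }
    return results;
  }

  public static void main(String[] args) {
    List<List<String>> instances = new ArrayList<List<String>>();
    instances.add(Arrays.asList("a", "x"));
    instances.add(Arrays.asList("b", "y"));
    System.out.println(classifyAll(new NonBatchEngineSketch(), instances));
  }
}
```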

setOptions

public void setOptions(org.jdom.Element optionsElem)
Takes a representation of the part of the XML configuration file that holds the options, and stores it.

Specified by:
setOptions in interface MLEngine
Parameters:
optionsElem - the JDom element containing the options from the configuration.
Throws:
GateException

addTrainingInstance

public void addTrainingInstance(List attributeValues)
This is called to add a new training instance to the data set collected in this wrapper object.

Specified by:
addTrainingInstance in interface MLEngine
Parameters:
attributeValues - A list of String objects, each of which corresponds to an attribute value. For boolean attributes the values will be true or false.

setDatasetDefinition

public void setDatasetDefinition(DatasetDefintion definition)
Sets the data set definition for this classifier.

Specified by:
setDatasetDefinition in interface MLEngine
Parameters:
definition - A specification of the types and allowable values of all the attributes, as specified in the corresponding part of the configuration file.

classifyInstance

public Object classifyInstance(List attributeValues)
                        throws ExecutionException
Decide on the outcome for the instance, based on the values of all the maxent features. N.B. Unless this function was previously called, and there has been no new data added since, the model will be trained when it is called. This could result in calls to this function taking a long time to execute.

Specified by:
classifyInstance in interface MLEngine
Parameters:
attributeValues - A list of all the attribute values, including the one that corresponds to the maxent outcome. The value supplied for the outcome is arbitrary.
Returns:
A String giving the nominal value of the outcome or, if the outcome is boolean, a Java String with value "true" or "false".
Throws:
ExecutionException
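
The train-on-demand behaviour noted above can be sketched as follows. `LazyTrainingSketch` is a hypothetical stand-in, not the wrapper's code: "training" only increments a counter and classification returns a placeholder, so the sketch shows when retraining is triggered, not how maxent is trained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in, not GATE code: retrains only when no model exists yet
// or when training data has been added since the last training run.
class LazyTrainingSketch {
  private final List<List<String>> trainingData = new ArrayList<List<String>>();
  private boolean datasetChanged = false;
  private boolean trained = false;
  int trainCount = 0; // exposed so the effect of lazy training is observable

  void addTrainingInstance(List<String> attributeValues) {
    trainingData.add(attributeValues);
    datasetChanged = true; // the current model is now stale
  }

  String classifyInstance(List<String> attributeValues) {
    if (!trained || datasetChanged) {
      train(); // potentially slow in a real engine
    }
    return "placeholder-outcome"; // a real engine would evaluate its model here
  }

  private void train() {
    trainCount++;
    trained = true;
    datasetChanged = false;
  }

  public static void main(String[] args) {
    LazyTrainingSketch engine = new LazyTrainingSketch();
    engine.addTrainingInstance(Arrays.asList("a", "yes"));
    engine.classifyInstance(Arrays.asList("a", "?")); // trains
    engine.classifyInstance(Arrays.asList("b", "?")); // reuses the model
    System.out.println("training runs: " + engine.trainCount);
  }
}
```

In this pattern, the cost of training is paid on the first classification call after any change to the data, which is why such calls can take much longer than the rest.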

init

public void init()
          throws GateException
Initialises the classifier and prepares for running. Before calling this method, the datasetDefinition and optionsElement fields should have been set using calls to the appropriate methods.

Specified by:
init in interface MLEngine
Throws:
GateException - If it is not possible to initialise the classifier for any reason.
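
The required call order (definition and options first, then init) is illustrated below with a hypothetical stand-in, `InitOrderSketch`. Whether the real init() performs such a check is not stated here; the sketch assumes one, and throws an unchecked exception where the real method declares GateException.

```java
// Illustrative stand-in, not GATE code: init() refuses to run until both
// prerequisites have been supplied.
class InitOrderSketch {
  private Object datasetDefinition; // stands in for the dataset definition
  private Object optionsElement;    // stands in for the JDom options element
  boolean initialised = false;

  void setDatasetDefinition(Object definition) {
    this.datasetDefinition = definition;
  }

  void setOptions(Object optionsElem) {
    this.optionsElement = optionsElem;
  }

  void init() {
    if (datasetDefinition == null || optionsElement == null) {
      throw new IllegalStateException(
          "set the dataset definition and the options before calling init()");
    }
    initialised = true;
  }

  public static void main(String[] args) {
    InitOrderSketch engine = new InitOrderSketch();
    engine.setDatasetDefinition("definition placeholder");
    engine.setOptions("options placeholder");
    engine.init(); // succeeds because both setters were called first
    System.out.println("initialised: " + engine.initialised);
  }
}
```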

load

public void load(InputStream is)
          throws IOException
Loads the state of this engine from previously saved data.

Parameters:
is - An open InputStream from which the model will be loaded.
Throws:
IOException

save

public void save(OutputStream os)
          throws IOException
Saves the state of the engine for reuse at a later time.

Parameters:
os - An open output stream to which the model will be saved.
Throws:
IOException
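
A save/load round trip over caller-supplied streams could be exercised as in the sketch below. `PersistableSketch` is hypothetical and persists a single int; the wrapper's actual on-disk format is not described here and is not implied by this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Illustrative stand-in, not GATE code: the "state" is a single int.
class PersistableSketch {
  int iterations;

  void save(OutputStream os) throws IOException {
    DataOutputStream out = new DataOutputStream(os);
    out.writeInt(iterations);
    out.flush(); // the caller owns the stream, so flush but do not close it
  }

  void load(InputStream is) throws IOException {
    iterations = new DataInputStream(is).readInt();
  }

  // Saves to an in-memory buffer and loads the bytes into a fresh object.
  static PersistableSketch roundTrip(PersistableSketch source) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      source.save(buffer);
      PersistableSketch copy = new PersistableSketch();
      copy.load(new ByteArrayInputStream(buffer.toByteArray()));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    PersistableSketch original = new PersistableSketch();
    original.iterations = 7;
    System.out.println("restored iterations: " + roundTrip(original).iterations);
  }
}
```

With files instead of in-memory buffers, the caller would open the stream, pass it in, and close it afterwards, since both methods expect an already open stream.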

getActions

public List getActions()
Gets the list of actions that can be performed on this resource.

Specified by:
getActions in interface ActionsPublisher
Returns:
a List of Action objects (or null values)

setOwnerPR

public void setOwnerPR(ProcessingResource pr)
Registers the PR using the engine with the engine itself.

Specified by:
setOwnerPR in interface MLEngine
Parameters:
pr - the processing resource that owns this engine.

getDatasetDefinition

public DatasetDefintion getDatasetDefinition()

supportsBatchMode

public boolean supportsBatchMode()
Description copied from interface: AdvancedMLEngine
Returns true if the engine supports BatchMode, returns false otherwise.

Specified by:
supportsBatchMode in interface AdvancedMLEngine
