GATE

Reporting bugs

If you think you have found a bug in GATE, please:

Once you are sure you have found a bug, please let the appropriate person know about it. Bugs in the core of GATE or in the plugins bundled with the GATE distribution should be reported using the SourceForge bug tracker. Bugs in third-party plugins should generally be reported to the plugin maintainer in the first instance, so the maintainer can determine whether the bug is in their code or in the GATE core.

Historical Information

The rest of this page contains old bugs from before the bug tracker was available, and is of historical interest only.

Known Bugs

Number Subsystem (core, jape, creole or gui) Reported by Adopted by Reported on (date/OS/JDK and GATE versions) Fixed on (date/GATE version) Description Test method Supporting materials
67 gui hamish valy 8/Mar/2002, NT 4, JDK1.4, rc2 ___fixed_on ___date ___Gversion When creating a document, the URL entry field in the parameters dialogue is sometimes blanked out. If you enter the URL, then leave the cursor in that field and click OK, the field is blanked and an error message about the missing URL is shown. Second and subsequent tries work. Workaround: hit return instead of clicking OK. - -
0006 core Valentin ___adopted_by 16/5/1 NT4 1.3.1 a2build499 ___fixed_on ___date ___Gversion AnnotationSetImpl:
Various get() methods return the actual set stored in the indexes internal to AnnotationSetImpl. If such a set obtained from a get() operation is then modified, the modification will be reflected by the results of all the further calls to the same get() method although the base AnnotationSet is not actually modified, hence invalid results.
Probably the desired behaviour is for the returned annotation sets to be backed by the base annotation set so that modifications on the returned sets will reflect in the base set. That could cause a lot of ConcurrentModificationExceptions in the client code but that's their problem. {@see List.subList(start, end).clear(); }
- -
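The difference between the two behaviours discussed in bug 0006 can be illustrated with plain java.util collections. This is a minimal sketch, not GATE code: the class and method names are hypothetical, and a List stands in for the annotation set. It shows a view that writes through to its backing collection (the List.subList behaviour the report points to) next to a defensive copy that leaves the base untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; a List stands in for an annotation set.
class ViewVersusCopyDemo {

    // View semantics: subList is backed by 'base', so clearing the view
    // removes the same range from 'base'.
    static List<String> clearThroughView(List<String> base, int start, int end) {
        base.subList(start, end).clear();
        return base;
    }

    // Copy semantics: the caller gets an independent list, so later
    // modifications never reach 'base'.
    static List<String> copyOf(List<String> base) {
        return new ArrayList<>(base);
    }

    public static void main(String[] args) {
        List<String> base = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));

        clearThroughView(base, 1, 3);
        System.out.println(base); // [a, d]

        List<String> copy = copyOf(base);
        copy.clear();
        System.out.println(base); // still [a, d]
    }
}
```

With an ArrayList, structurally modifying the base list while a subList view is still in use makes later access through the view throw ConcurrentModificationException, which is the client-side cost the report anticipates.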
00043 gui - feature request Kalina beta1 ___fixed_on ___date ___Gversion We need separate panels for the key and response annotations in the adiff display, with their own horizontal scrolling, so that we can get both on the same screen when the features lists are long ___test_method ___supporting_materials
00044 gui - feature request Marin ___adopted_by beta1 ___fixed_on ___date ___Gversion Have a different icon for persistent LRs (i.e. documents) that have changed, to show the user they need saving. ___test_method ___supporting_materials
00045 3) gui - feature requests Kalina ___adopted_by beta1 ___fixed_on ___date ___Gversion 3. Make the action menu for resources a separate class that gets constructed from actions exposed on the resources themselves. Valy says that's a really good idea and not difficult to implement. The nice thing about it is that it'll reduce the overhead of adding new types of LRs like Wordnet to the GUI, with no programming effort on behalf of the user. Technical difficulty: medium (Valy knows exactly how but does not have enough time before the beta; suggests doing it in January, before the final release). ___test_method ___supporting_materials
00046 core - feature request Kalina ___adopted_by beta1 ___fixed_on ___date ___Gversion The profiler shows that obtaining beanInfo on each class is an expensive operation. I have reduced that cost to an extent by stopping it from being read for every parameter. Hopefully now it's read once parameters need to be set. However, even this is not needed, because the beanInfo does not change once the class is loaded. So how about a static on each Resource: when the resource gets constructed, it checks the static and, if it is null, creates the BeanInfo; if not, it does nothing. That might speed things up even more. Probably not worth trying till January, but something to keep in mind. Valentin's note: how often do we set parameters on resources? Is it worth the while? ___test_method ___supporting_materials
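The caching idea in 00046 could look roughly like the following. This is a sketch under assumptions, not the GATE implementation: the class name BeanInfoCache is hypothetical, and it uses one shared map keyed by class rather than a static field on each Resource. Only java.beans.Introspector.getBeanInfo is a real JDK call here.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: looks up BeanInfo once per class and reuses it.
class BeanInfoCache {

    private static final Map<Class<?>, BeanInfo> CACHE = new ConcurrentHashMap<>();

    static BeanInfo get(Class<?> type) {
        BeanInfo info = CACHE.get(type);
        if (info == null) {
            try {
                info = Introspector.getBeanInfo(type);
            } catch (IntrospectionException e) {
                throw new IllegalStateException("Cannot introspect " + type, e);
            }
            // If another thread stored a value first, keep that one.
            BeanInfo earlier = CACHE.putIfAbsent(type, info);
            if (earlier != null) {
                info = earlier;
            }
        }
        return info;
    }

    public static void main(String[] args) {
        BeanInfo first = get(StringBuilder.class);
        BeanInfo second = get(StringBuilder.class);
        System.out.println(first == second); // true: the second call hits the cache
    }
}
```

Whether this pays off depends on the question in Valentin's note: if parameters are set rarely, the saving is small.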
00049 core- I/O -optimisation Kalina ___adopted_by beta1 ___fixed_on ___date ___Gversion
Document.toXML()
needs optimising so it allocates less memory.
___test_method ___supporting_materials
00050 creole Kalina ___adopted_by beta1 ___fixed_on ___date ___Gversion Priority of rule firing in the OrthoMatcher ___test_method ___supporting_materials
00057 Oracle DB / Postgres DB Marin Marin beta1 NOT FIXED resources are not locked during sync (concurrent access errors may occur) ___test_method ___supporting_materials
00065 Oracle DB / Postgres DB Marin Marin beta1 NOT FIXED passwords are kept in plain text in the database ___test_method ___supporting_materials
00067 core - persistence Kalina beta1 ___fixed_on ___date ___Gversion a more intelligent saving logic for when corpora get processed, docs opened, processed, and unloaded. corpus.unloadDocument() should behave smarter than always saving the doc. ___test_method ___supporting_materials
00071 security Marin Marin beta1 NOT FIXED when a new user/group is created with the DBAdmin GUI the user/group is shown in the login dialog for opening an Oracle DS, but an attempt to log in as such will fail - the GATE application must be restarted in order to use the new user. Most probably the security factory is getting desynchronized ___test_method ___supporting_materials
00072 security Marin Marin beta1 NOT FIXED security permissions for documents are not checked - any user can read/modify any document ___test_method ___supporting_materials
00073 core Java HTML parsing Angel Angel 13/march/2002
WinNT4
JDK 1.3/1.4
13/march/2002
Should be fixed in JDK, not in GATE
:-)
In javax.swing.text.html.parser.Parser there is one wrong IF statement
on line 1653 for JDK 1.3 and on line 1712 for JDK 1.4.

This statement is: if (elemName.equals("center")) {

The wrong IF statement should be removed but the body of the statement should stay.
The code above it already has a less restrictive IF, and in some cases
the "font" tag will be processed incorrectly.

You could get the Parser.java file from src.zip (in JDK directory)

You should put the fixed Parser.class file in the JDK_dir\jre\lib\rt.jar
If you parse the HTML file:

<HTML>
<FONT SIZE = 2>
<P>
Text with wrong position.
</FONT>
The wrong position start.
</HTML>

You will have an error in the callback handler of HTML parser:
"end.missingfont??"
The two pieces of text will be joined into one and the reported position
of the text will be wrong: it will be the position of the second piece of text.
___supporting_materials
00074 Oracle DB Marin Marin gate 2.0 build 834, 14/03/ 2002 NOT FIXED missing foreign key constraint in the DDL scripts for Oracle (createTable - T_LANGUAGE_RESOURCE). The FK is present in the model, which implies that something went wrong when the DDL was autogenerated from the model ___test_method ___supporting_materials
00078 Oracle DB / Postgres DB Marin Marin v2.0 build 843 NOT FIXED setSecurityInfo() not implemented for DB DataStores ___test_method ___supporting_materials
00079 Diana gate 2.0 build 846, 25 March 2002 ___fixed_on ___date ___Gversion loading a faulty JAPE grammar (that contains no final "}" on the RHS of a Java rule) causes GATE to never finish loading the grammar (all you can do is kill GATE completely) ___test_method ___supporting_materials
00080 Oracle DB / Postgres DB Marin Marin v2.0 build 862 NOT FIXED Float feature values cannot be stored in the DB - only Double values are properly stored ___test_method ___supporting_materials
00081 Oracle DB / Postgres DB (?) Marin Marin v2.0 build 862 NOT FIXED Unicode support is buggy. If you store "\u65e5\u672c\u8a9e\u6587\u5b57\u5217" (a Japanese string) then you read back something else ___test_method ___supporting_materials
00082 om - feature request Diana The orthomatcher to produce a feature on the annotation which gives info about the orthomatcher rule which has fired (for debugging purposes) ___test_method ___supporting_materials
00084 WordNet UI Marin ? 28 Oct 2002, v2.0 build 928 NOT FIXED The UI is not properly showing hypernym branches - only the first hypernym is traversed upwards (see "cup", sense 1 - the branch for "container" is not shown) ___test_method The underlying GATE WN API seems to work correctly since the unit tests cover such case successfully, i.e. the problem is most probably in the UI
00094 Persistence - Database Marin Marin 23 Jun 2003 Gate 2.1 build 1284 NOT FIXED Removing an annotation from a database set does not trigger removing the underlying nodes if they're not referenced by any other annotation. On the other hand identifying that a node is not needed anymore and deleting it is not easy and will slow down annotation removal - -
00098 WordNet Marin Marin 20 Aug 2003 Gate 2.2 build 1372 NOT FIXED The WordNet implementation does not perform any morphological processing (i.e. searches for "running" will return an empty result set while the original WordNet GUI will return results as for "run"). MorphologicalProcessor is actually available in JWNL but is not properly used - -
00099 MachineLearningPR Mike 23 June 2004 GATE 3.0-alpha, build 1666 NOT FIXED The machine learning processing resource will not work properly if there is an attribute with a negative position, but no attribute with a positive position. This is probably related to there being a backward cache but no forward one in such a situation. The bug can be seen by examining the datasets collected. - -

Fixed Bugs

Number Subsystem (core, jape, creole or gui) Reported by Adopted by Reported on (date/OS/JDK and GATE versions) Fixed on (date/GATE version) Description Test method Supporting materials
0001 jape Horacio Valentin April-01 NT4 JDK 1.3 alpha1 30th May 2001
Alpha3 build 512
Infinite loop in jape processing
Jape now detects infinite loops and recovers from them by forcing an advance of the input. This might lose some possible matches but it's better than waiting forever.
A warning message is displayed on the Err output.
directory bugs/0001
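The recovery strategy described for bug 0001 (force the input forward when a match makes no progress) can be sketched generically. This is not JAPE's actual matching code: the class, the scan method and the matcher function are all hypothetical, and integer positions stand in for nodes in the annotation graph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical demo: a scan loop that cannot spin forever.
class ForcedAdvanceDemo {

    // 'matchEnd' returns the position reached by a match attempted at 'pos'.
    // A result <= pos means the match consumed no input.
    static List<Integer> scan(int inputLength, IntUnaryOperator matchEnd) {
        List<Integer> attempted = new ArrayList<>();
        int pos = 0;
        while (pos < inputLength) {
            attempted.add(pos);
            int next = matchEnd.applyAsInt(pos);
            if (next <= pos) {
                // No progress: warn on Err and step forward by one position.
                System.err.println("Warning: no progress at " + pos + ", forcing advance");
                next = pos + 1;
            }
            pos = next;
        }
        return attempted;
    }

    public static void main(String[] args) {
        // A matcher that never consumes input would loop forever without the guard.
        System.out.println(scan(3, p -> p));     // [0, 1, 2]
        // A well-behaved matcher that consumes two positions per match.
        System.out.println(scan(5, p -> p + 2)); // [0, 2, 4]
    }
}
```

Stepping forward by one position is what may lose matches: anything that could only have started from the stalled state is skipped.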
0002 gui Hamish Valentin 16/5/1 linux6.2/solaris7 1.3.1 a2-build499 30th May 2001
Alpha3 build 512
When a datastore is opened, and double-clicked on, before any other resources are created, the resources tree resizes to near zero size - -
0003 gui Hamish Cristi 16/5/1 NT4 1.3.1 a2build499 Fixed on 18 May 2001 from alpha2 Build 501 when using larger-than-normal fonts, the annotation diff results at the bottom of the screen sometimes disappear outside the window - -
0004 gui Hamish Valentin 16/5/1 NT4 1.3.1 a2build499 16/Oct/2001/alpha 3 the applications edit component needs some work - it resizes in a strange fashion e.g. when you select a component to add. in general components shouldn't resize when you're clicking in them methinks
The component has been fully redesigned
- -
0005 gui Hamish Valentin 16/5/1 NT4 1.3.1 a2build499 30th May 2001
Alpha3 build 512
it would be nice if double click on a document in the datastore display tree loaded the document, in line with double clicking in the resource tree - -
0007 core Valentin Valentin 16/5/1 NT4 1.3.1 a2build499 1st June 2001
Alpha3 build 517
when it looks for creole.xml in the file system it downcases the whole path (because it comes from a URL?) - -
0008 core Valentin Valentin 16/5/1 NT4 1.3.1 a2build499 26/Feb/2002 beta1 gate.initCreoleRegister is looking in resources/gau for creole.xml when gate.Main starts up - it shouldn't, as this is only for testing - -
0009 creole Kalina Oana 18/May OS:NT Gate a2build499 1st June 2001
Alpha3 build 517
Move the list loading from run to the init method, because they are meant to be read only when the NM is initialised, not every time it's run. - -
0010 creole/namematcher Kalina Oana 18/May OS:NT Gate a2build499 1st June 2001
Alpha3 build 517
I added BT Cellnet and BT Wireless as spurious matches in the spur_match.lst file. However, if you run the NM on the file which I created for testing, \resources\gate\texts\namematcher-test.txt, then the NM still matches them to BT, when it should not. Just run in GUI, after the NE recogniser, on file namematcher-test from \resources\gate\texts -
0011 jape Valentin Valentin 12 June 01 ALL:ALL Alpha3 27 Feb 2002 beta1 Jape grammars don't have direct access to the input annotation set. They only receive the list of matched annotations. This makes it impossible to reliably remove some of the matched annotations from the input AS. All the grammars that do this are assuming that the input AS is the same as the output one (which is not guaranteed). ___test_method ___supporting_materials
0012 core Valentin Valentin 13 July 01 ALL:ALL Alpha3 1st June 2001
Alpha3 ALL builds
when it looks for creole.xml for an external Gate resource in the file system it downcases the whole path (because it comes from a URL?) - -
00013 gui Diana Valentin 27 July 01 Linux 1.3.1 Alpha3 19 Feb 2002 beta1 annotation types which are empty do not get automatically removed from the GUI, nor is it possible to remove them manually (with the Delete key)
0014 gui Diana Valentin 02 Aug 01 Linux 1.3.1 Alpha3 Oct/2001/ beta1 When adding new annotations, the popup window which should appear to allow you to select an annotation type etc doesn't always appear. Nothing happens at all, except an error message in the command window of the type "Exception occurred during event dispatching:" ___test_method ___supporting_materials
0015 gui Diana ___adopted_by 06 Aug 01 Linux 1.3.1 Alpha3 no longer happens in RC1 If you manage to create 2 different processing errors/warnings simultaneously (of the type that create pop-up messages), neither pop-up window lets you close it, and nothing can be done until the messages are closed. ___test_method ___supporting_materials
0016 gui Oana Valentin Oct 2001 Solaris 1.3.X Alpha 3 Nov 28 2001 Beta1 The file open dialog on solaris opens with a very large size and displays nothing.
Try to open a file on solaris ___supporting_materials
00017 gui Valentin Valentin 01/12/2001 All platforms All JDKs Beta1 05 Mar 2002 beta1

The GUI doesn't block while operations are being executed. This allows multiple threads to run at the same time, which could lead to dangerous situations as the GATE API is not thread safe. Although running things that do not affect each other is often safe, it is sometimes difficult to tell whether two operations interfere with each other, so the GUI should lock itself and not allow the user to start a new operation before the current one has finished.

Kalina's note: I think that such operations should not be performed in a separate thread allowing the user to control the GUI. Instead they should pop up a dialog telling the user they have to wait. The problem otherwise: I do not know when the save is finished, so I close the document while it's still being saved. What happens is cleanup() gets called, which removes the data before it's saved.

Workaround: Allow each operation to finish before starting a new one. This is particularly important for operations involving datastores and executions of applications.

___test_method ___supporting_materials
0018 Oracle DB - Marin - Dec 05 2001 Beta1 When documents with changed annotations were synchronised then the annotations were added to the database with reference to the LanguageResource of the document, not the document itself
- -
0019 Oracle DB - Marin - Dec 04 2001 Beta1 Deleting a document was not working for Oracle DataStore - misplaced delete statements caused foreign key violation
- -
0020 Oracle DB - Marin - Dec 04 2001 Beta1 Synchronising a document sometimes duplicated an annotation
- -
0021 Oracle DB - Marin - Dec 04 2001 Beta1 adding a document node in the database would use wrong node_id - IDs were not initialised properly when the document was read from DB
- -
00022 Oracle DB - Marin Jan 21 2002 Beta1 build760 Mar 06 2002 rc1

the DatabaseCorpus implementation loads the documents from the corpus when the latter is read. Since the DatabaseDocuments do not contain much data (everything is loaded on demand) there is no overhead, but the documents appear in the GUI, which is not the case with the serial implementation

- -
00023 Oracle DB - Marin - Jan 21 2002 Beta1 build760

when read from the database, DatabaseDocuments and DatabaseCorpuses were created directly, not via the Factory

- -
00024 core Valentin Valentin 17/Dec/2001 All platforms All JDKs Beta1 19 Feb 2002 beta1 Annotations and coref data are represented independently but they depend upon each other. Create a doc and run ANNIE on it; you should get a bunch of annotations some of which will corefer.
Delete all the annotations and try to show the coref data. It explodes because it tries to find those annotations that were matching but, guess what, they're not there...
___supporting_materials
00025 core Diana Valentin 17/Dec/2001 Beta1 20 Feb 2002 Beta1 Restoring an application from a file creates an endless loop if one of the components is missing (e.g. you've removed a document from where it was originally stored).
00026 gui Kalina Valentin alpha3 27 Feb 2002 Beta1 there is a problem with the OKCancelDialog and the tabbed panel it puts inside. Basically, because the tabbed panel grows to fit the size of the editors, this creates a problem when these editors exceed the screen. The dialog does not fit on it any more and the tabs are gone, you cannot see anything! All of this is due to the token sets of annotations that you add on the sentence annotations. If easier, maybe change those to annotation Ids instead and avoid tackling the issue, because it only occurs when the annotation has an excessively long feature. Whatever. Run Nerc on a document of choice, then double click any Sentence. At that point, the treeviewer gets constructed on the first pane and Cristi's annotation viewer gets constructed in the next. However, his viewer does not control its Size and Preferred size, nor does it have a scroller. So the tab panel decides to make itself as big as his viewer and nothing shows on screen, because the treeviewer is only small and is not visible, just like the tabs themselves. ___supporting_materials
00027 gui Valentin Valentin beta1 19 Feb 2002 Beta1 DocViewer sizes weirdly when only the text is displayed. ___test_method ___supporting_materials
00028 gui Steve Wohlever Valentin 21/11/2001 Sparc Solaris 7 JDK 1.3.0 and 1.3.1_01 and 2.0b1build691 26/Feb/2002 beta1
When I attempt to load a GATE document language resource via the GATE GUI (i.e., click on the "open folder" icon for the sourceUrl property), my screen is filled with a simple blue background (and this background spans multiple "virtual" desktops in my X-Windows session, using the FVWM window manager). No Java Swing widgets are visible on this background (making it impossible to do anything, much less load a document). I have also tried this with the Sun CDE window manager. In that case, a gray window fills just my current desktop, again with no Java widgets (and thus no way to do anything). This problem does not arise on Windows 2000 (JDK 1.3.1_01) (which is not my preferred platform), and I did not have this problem on Solaris when using GATE 2.0 a3build516. Just run build/gate.sh
00029 gui Valentin Valentin Beta1 19 Feb 2002 beta1 Main view title doesn't change on resource rename. ___test_method ___supporting_materials
00032 gui Diana Valentin Beta1 4 march 2002; beta1 I tried to change the colour of one of the annotations (by selecting red instead of pale yellow as the background) and to my surprise, it made all those annotations disappear completely! And I couldn't even change the colour back to something else, as it wouldn't let me..... Bizarre, as it used to work fine. I tested it several times and got the same thing every time. NOTE: Linux window manager problem, because it happens only for Diana and nobody else. ___test_method ___supporting_materials
00033 core-serialisation Hamish Damyan/Kalina beta1 11-jan-2002 / Beta1 SDS is losing document features after save/restore
create doc from http URL
note that source URL is a feature
open existing SDS
"save to" the doc
quit gate
restart, open SDS, load doc
source URL feature no longer present
___supporting_materials
00034 core Marin Damyan/Kalina beta1 11-jan-2002 / Beta1 It looks like the FeatureMap is not working properly. The following code:
-------------------------------
String strPronoun = (String)currPronoun.getFeatures().get(TOKEN_STRING);
System.out.println("key=["+TOKEN_STRING+"]");
System.out.println("value ["+currPronoun.getFeatures().get(TOKEN_STRING)+"]");
System.out.println("features=["+currPronoun.getFeatures()+"]");
-------------------------------
produces:
============================
key=[string]
value [null]
features=[{category=PRP$, kind=word, string=its, length=3, orth=lowercase}]
============================
i.e. getting the value for the key "string" fails, although it seems that this key/value pair exists in the feature map. The code in FeatureMap for toString() and get() is totally different, which is why one works and the other does not.
More weird details:
- the code works in the debugger, and when used standalone (from TestCoref, not as a PR from the GATE UI), which implies synchronization problems
- on the other hand I can't see a scenario where these four consecutive lines of execution are interfered with by another thread
___supporting_materials
00035 core-serialisation Hamish Kalina beta1 27 February 2002; beta1
create a corpus with the attached .txt
save to SDS
close gate and restart
open SDS
load corpus
try to load document - system dies, evidently doing lots of swapping... (file is 33kb; system started with 200Mb RAM)
I could not re-create this bug with this file on an NT or Win2K machine. Assume it's been fixed or due to a problem with the machine/jdk where it ran.
___test_method 17160411.txt
00036 core/jape Hamish Diana beta1 not there; fixed in January or February 2002; beta1 Running on the OB set with default resources, I get the exception below. Not sure if the preceding exceptions are relevant; the last one happens at line 22 of
1 // FinalYearOnlyFinalActionClass143
2 package japeactionclasses;
3 import java.io.*;
4 import java.util.*;
5 import gate.*;
6 import gate.jape.*;
7 import gate.annotation.*;
8 import gate.util.*;
9
10 public class FinalYearOnlyFinalActionClass143
11 implements java.io.Serializable, RhsAction {
12 public void doit(Document doc, AnnotationSet annotations, java.util.Map bindings) {
13 //removes TempDate annotation, gets the rule feature and adds a new Date annotation
14 gate.AnnotationSet date = (gate.AnnotationSet)bindings.get("date");
15 gate.Annotation dateAnn = (gate.Annotation)date.iterator().next();
16 gate.FeatureMap features = Factory.newFeatureMap();
17 features.put("rule1", dateAnn.getFeatures().get("rule"));
18 features.put("rule2", "YearOnlyFinal");
19 features.put("kind", "date");
20 annotations.add(date.firstNode(), date.lastNode(), "Date",
21 features);
22 annotations.removeAll(date);
23
24 }
25 }
Diana's note (unlikely though): The problem is badly written Java code on the RHS of a rule (my fault!). It was trying to get a feature and value ("rule" in this case) that didn't exist
17 features.put("rule1", dateAnn.getFeatures().get("rule"));
because of a previous rule in another grammar where the feature should have been added but hadn't been. Of course, I should have ensured that it only looks for features if they exist :-)
___test_method trace
00037 core/gui Diana/Marin Valy beta1 beta1 When you select all the loaded documents and then do Close All, and then exit Gate, it doesn't appear to have removed the documents properly, because it saves them before it exits, and then reloads them when you start Gate again.
same story here and it happens with PRs (sometimes), applications (almost always) and with corpora (almost always, but only an empty corpus is loaded, i.e. containing no documents). I guess it's related to when GATE tries to persist the settings (and that somehow sometimes it fails to do so)
___test_method ___supporting_materials
00038 gui Hamish Valentin beta1 19 Feb 2002 beta1 check out GateExamples, build, and register GateExamples/build as a creole directory then create a doc the new "MyDocViewer" thing comes up as it should; the bug is that it comes up 2 or 3 times ___test_method ___supporting_materials
00039 core - I/O Hamish Kalina beta1 21 Feb 2002 beta1 the "don't give me all the GATE features" facility doesn't seem to work: sometimes you get features, sometimes not check out GateExamples and run the default configuration in JBuilder - results attached. so the "" tag is ok (an extra space added though), but the " output dump
00040 core Marin ___adopted_by beta1 Could not be repeated. Tested on a text file with FBI and the full name. Both got recognised. I was testing something and I noticed that Federal Bureau of Investigation is recognized as organization, while FBI is not, although it is available in the government.lst any ideas? also isn't it possible to automatically populate the organizations lists presented in memory with their acronyms?
00041 core Valentin Hamish beta1 beta1 Closing Gate on solaris raised this:
Failed to save config data:
gate.util.GateException: problem writing user gate.xml: java.io.FileNotFoundException: Z:\/.gate.xml (No such file or directory)
at gate.Gate.writeUserConfig(Gate.java:597)
at gate.gui.MainFrame$28.run(MainFrame.java:1747)
at java.lang.Thread.run(Thread.java:484)
Failed to save session data:
java.io.FileNotFoundException: Z:\/.gate.session (No such file or directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
at java.io.FileOutputStream.<init>(FileOutputStream.java:62)
at java.io.FileOutputStream.<init>(FileOutputStream.java:132)
at gate.util.persistence.PersistenceManager.saveObjectToFile(PersistenceManager.java:358)
at gate.gui.MainFrame$28.run(MainFrame.java:1783)
at java.lang.Thread.run(Thread.java:484)
Anyone know why it should look for Z:\ on solaris? ___test_method ___supporting_materials
00042 gui - feature request Kalina Valentin beta1 10 Feb 2002 beta1 When doing corpus annotation with GATE, i.e. correcting ANNIE's wrong annotations, what you often do is want to delete a wrong annotation and then mark it up correctly. The second part was made easy by Cristi. The first part: deleting an annotation is difficult. you have to have the annotations table on, then click on the annotation text, then choose the annotation type, get positioned in the table (which is slow and confusing on long documents) just to press delete. The problem is that linguists don't care about all the internal features produced by GATE most of the time. Suggestion: add a menu of actions for each annotation type (delete and select for now). It'll take longer to explain in writing but Valy agrees it's a good idea to bypass the complex delete procedure. It will not be computationally more expensive and will only take 10 mins to implement (as a pessimist I say 30). ___test_method ___supporting_materials
00045 1) & 2) gui - feature requests Kalina ___adopted_by beta1 ___fixed_on ___date ___Gversion 1. Search - allows you to find a string in the DocumentViewer/Editor. Vitally needed when marking up text. For example, you want to mark up all occurrences of 'he'. The only way to do that now is to read the entire text. Maybe it should be easier than that. technical difficulty: very low (content.indexOf (given string) + do highlight)
2. Corpus/save preserving format - same as the existing save as XML functionality, just slightly different method call. Just new entry on the menu. technical difficulty: low
___test_method ___supporting_materials
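The search in item 1 of 00045 (repeated indexOf over the document content) might be sketched as follows. The class and method names are hypothetical and highlighting is left out; only the offsets a viewer would need are collected. Note that this is a plain substring search, so 'he' is also found inside 'she'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the start offsets of every occurrence.
class FindOccurrences {

    static List<Integer> findAll(String content, String needle) {
        List<Integer> offsets = new ArrayList<>();
        if (needle.isEmpty()) {
            return offsets; // avoid an endless loop on the empty string
        }
        int from = 0;
        while (true) {
            int hit = content.indexOf(needle, from);
            if (hit < 0) {
                break;
            }
            offsets.add(hit);
            from = hit + needle.length(); // continue after this occurrence
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(findAll("he and she", "he")); // [0, 8]
    }
}
```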
00047 creole/AST Kalina Kalina beta1 21 February 2002 beta1 The idea is to have a file in the corpus directory which is either empty, then process the entire doc or contains the string of the annotation name which should be the BODY annotation in annotation set transfer. The problem is that without AST, precision and recall are lower by about 5% from the real figures. Should fix as soon as Oana has fixed the entire corpus. Diana's note: .....would be very happy later on if it were allowed to process just the main body of the documents and not the whole thing, by allowing you to specify the tag to be used for the annotation set transfer.
00048 core+gui Kalina Valentin alpha3 22 Feb 2002 beta1 WILL MAKE RUNNING THE EVAL TOOLS FROM GUI WORK IN NOVEMBER ___test_method ___supporting_materials
00051 core - I/O - feature request Hamish Kalina beta1 27 February 2002; beta1 We need to enable selective dump of annot features in the format-preserving thingy ___test_method ___supporting_materials
00052 core - I/O Kalina Kalina beta1 21 February 2002, beta1 Preserve format XML export should check for gateId ___test_method ___supporting_materials
00053 creole - orthomatcher Kalina Kalina beta1 19 feb 2002, beta 1 John Smith matches Peter Smith ___test_method ___supporting_materials
00054 core - feature request Kalina Valentin beta1 27 Feb 2002 beta1 Add a ProcessingResourceVR interface for later use. ___test_method ___supporting_materials
00055 gui - feature request Kalina Valentin beta1 20 Feb 2002 beta1 "Restore App from file" as a menu item on right click on Applications ___test_method ___supporting_materials
00056 core Valentin Valentin beta1 beta1 saving an LR to file (as a parameter for a PR, for instance) also saves the LR's features, which it shouldn't, as they will be saved by the LR's persistence mechanism. ___test_method ___supporting_materials
00058 creole Hamish Kalina beta1 27 February 2002; beta1 "gate" should be a parameter in DumpPR:
boolean useSuffixForDumpFiles
String suffixForDumpFiles
___test_method ___supporting_materials
00059 core Hamish Valentin beta1 20 Feb 2002 beta1 Corpus Pipeline: if there is an error, just write on Err and go on to the next doc ___test_method ___supporting_materials
00060 GUI Hamish Valentin beta1 20 Feb 2002 beta1 When loading ANNIE without defaults, it should wait for the previous module to load before offering to configure the next. ___test_method ___supporting_materials
00061 GUI Hamish Valentin beta1 20 Feb 2002 beta1 ANNIE non-default load doesn't select the PRs in the controller ___test_method ___supporting_materials
00062 core Hamish Kalina beta1 21 February 2002 beta1 The warning message re. crossover annotations in the XML dump should print the offending annotations. ___test_method ___supporting_materials
00063 core Hamish Oana beta1 27 February 2002 beta1 local newlines in dumped files; a+x on scripts after creation. ___test_method ___supporting_materials
00066 gui - feature request Kalina Kalina beta1 05 march 2002; beta1 AnnotDiff to display the strings of the key and response annotations rather than their type which is already shown in the list above.
00068 core - serialization Marin Kalina beta1 06 march 2002; beta1
1. open serialized corpus (no matter what storage)
2. create new doc
3. add to corpus
4. close new doc
5. sync corpus
you end up with:
=========
Error reading document inside a serialised corpus.
Exception occurred during event dispatching:
gate.util.GateRuntimeException: No instance id for class gate.corpora.DocumentImpl
at gate.corpora.SerialCorpusImpl.get(SerialCorpusImpl.java:595)
at gate.persist.SerialDataStore.sync(SerialDataStore.java:370)
___test_method ___supporting_materials
00069 Oracle DB Marin Marin beta1 06 March 2002 if a document is removed from persistent corpus, the sync() won't properly detach the doc from the corpus in the database i.e. if the corpus is loaded again the document will still be part of it ___test_method ___supporting_materials
00070 creole Diana/Partha Marin beta1 11 March 2002 the test whether a pronoun was contained in a sentence was incorrect for a special boundary case ___test_method ___supporting_materials
00075 Oracle DB Marin Marin gate 2.0 build 834, 14/03/ 2002 ???? the primary key on T_CORPUS need not be composite ___test_method ___supporting_materials
00076 creole Partha Marin v2.0 build 826 18 March 2002, v2.0 build 840 resolution of I/Me/etc throws OutOfBounds exception - the closest index in the quotes array was incorrectly set to -1 in some cases.
java.lang.ArrayIndexOutOfBoundsException
at
gate.creole.coref.PronominalCoref._resolve$I$ME$MY$MYSELF$(PronominalCoref.java:540)
___test_method ___supporting_materials
00077 Oracle DB Marin Marin v2.0 build 842 21 March 2002, v2.0 build 843 login() contained incorrect password check (i.e. sometimes users may log in even with incorrect password) ___test_method ___supporting_materials
00083 core/jape Valentin Valentin 21/08/2002 Win(NT/2K/XP) 1.4 Gate 2.1alpha(895) 30 May 2002 2.1 beta Feature strings (e.g. Token.string) are destroyed on their way to JAPE. It looks like they are converted to byte[] and back to String using the platform default encoding. This destroys all Unicode values.
The problem was actually in the LogArea not in Jape. It's fixed now. The attached application works fine now.
run the attached application i18nBug.zip
00085 coreferencer Diana Marin 29 November, 2002, v2.1 build 1061 2 Dec 2002, build 1085 The annotation "Quoted Text" produced by the pronominal coreferencer should be all one word, because otherwise it can't be used as an Input Annotation type in JAPE (creates an error because it is 2 words) "Quoted Text" changed to "QuotedText"
00086 Postgres DB Valy Marin Dec 2002 08 Jan 2003, build 1121 Documents cannot be deleted from Postgres datastore part of the persist_delete_document() function was missing somehow
00087 core Thomas Karopka Angel 03 Dec 2002 Gate 2.1beta build 958 9 Jan 2003 Gate 2.1-beta1 build 1123 When the document is created from string it is not parsed even when it is in HTML or XML format. The document format could be detected from the document content. Some of detection methods are changed to cover this case. Create GATE document from string with HTML or XML content. See if the result is parsed and only text is extracted or content remains unchanged. If the document is parsed also the Original markup should be created. <HTML> Test <B>parsing</B> of this string. </HTML>
00088 Postgres DB Valy Valy 03 Mar 2003 Gate 2.1 build 1190 03 Mar 2003 Gate 2.1 build 1190 detaching a document from persistent corpus throws an exception when working with Postgres SQL - -
00089 Persistence - Database Valy Marin - 07 Mar 2003 Gate 2.1 build 1191 DatabaseCorpus was not properly handling Unload events, so documents could not be unloaded when processed by PRs - -
00090 Persistence - Database Valy Marin - 07 Mar 2003 Gate 2.1 build 1191 The database datastore did not properly unload transient documents that were part of a transient corpus when the latter was adopted in the database datastore. The result was that each document in the corpus was duplicated in the resource pane (one transient and one persistent copy), which is confusing. Now transient documents get properly unloaded when they're adopted and a DatabaseDocument is created instead. - -
00091 Persistence - Database Marin Marin - 20 May 2003 Gate 2.1 build 1250 Documents that were part of a database corpus were not properly unloaded. Instead of nullifying the respective position in the list of corpus documents, the list element was actually removed, so the document was effectively removed from the in-memory corpus structures. As a result the corpus would report that it contained fewer documents than the correct count, and any PR that tried to process the corpus (loading, processing and unloading documents one by one) would effectively process only half of the documents (before the in-memory list was emptied). - -
00092 Persistence - Database Marin Marin - 20 May 2003 Gate 2.1 build 1250 Documents with no content were successfully stored in the database but it was impossible to read them back (In the Postgres case documents were assigned incorrect content type) - -
00093 Persistence - Database Marin Marin - 23 Jun 2003 Gate 2.1 build 1284 clear() was not implemented for Database annotation sets. remove() using an iterator returned from a database annotation set was not working properly (relying on the super class' remove()). Both are fixed now, though the current implementation of clear() (by default using the iterator's remove()) is inefficient for an RDBMS; a bulk delete should be performed instead. TODO - -
00095 Persistence - Database Marin Marin - 05 Jul 2003 Gate 2.1 build 1288 Database corpus sync() performance was improved by removing an unnecessary call (for creating a link between a document and a corpus in the database). This call was made for every loaded document from the corpus (even for those for which such a link had already been created). - -
00096 Persistence - Database Marin Marin 07 Jul 2003 Gate 2.1 build 1288 08 Jul 2003 Gate 2.1 build 1289 Documents were not properly unloaded from a database corpus because the super class intercepts the unload event (caused by improper listener registration). The actual problem was in the CorpusImpl equals() implementation (which is inherited by DatabaseCorpusImpl) and which makes the two documents (transient and database) look the same just because they contain the same document list. As a result the registry refused to register the database corpus as a listener for document unload events. After the fix the performance of a database corpus sync() is greatly improved. - -
00097 Persistence - Database Angel Marin 03 Jul 2003 Gate 2.1 build 128? 04 Nov 2003 Gate 2.2 build 1484 If you have a Document created from a String rather than from a URL, there is a persistence exception. Exception is: gate.persist.PersistenceException: can't create document [step 4] in DB: [ORA-01400: cannot insert NULL into ("GATEADMIN"."T_DOCUMENT"."DOC_URL") ORA-06512: at "GATEADMIN.PERSIST", line 310 ORA-06512: at line 1 ] at gate.persist.OracleDataStore.createDoc(OracleDataStore.java:530) at gate.persist.JDBCDataStore.createDocument(JDBCDataStore.java:1302) at gate.persist.JDBCDataStore.createDocument(JDBCDataStore.java:1258)trans failed ...rollback at gate.persist.JDBCDataStore._adopt(JDBCDataStore.java:467) at gate.persist.JDBCDataStore.adopt(JDBCDataStore.java:397) - -
00099 Persistence - Database - Oracle Damyan Marin 03 Nov 2003 Gate 2.2 build 1483 03 Nov 2003 Gate 2.2 build 1483 updating of document content was not working properly sometimes (content ID passed instead of the expected document ID). Almost always the document ID and the content ID are the same (as a result of the document creation process) but under specific conditions this is not true - -
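The "only half of the documents" symptom described for bug 00091 can be reproduced with a plain java.util.List. The following is a minimal sketch and not GATE code: the class UnloadSketch, the method processAll and the string "documents" are hypothetical stand-ins, and it assumes the processing loop walks the list by index while each unload either removes the element (the old behaviour) or sets its slot to null (the behaviour after the fix).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnloadSketch {

    // Hypothetical stand-in for a PR walking a corpus's in-memory document list.
    // Each document is "processed" and then "unloaded" in one of two ways.
    static int processAll(List<String> docs, boolean removeOnUnload) {
        int processed = 0;
        for (int i = 0; i < docs.size(); i++) {
            processed++;               // process the document at position i
            if (removeOnUnload) {
                docs.remove(i);        // old behaviour: later elements shift left, list shrinks
            } else {
                docs.set(i, null);     // fixed behaviour: slot is kept, size is unchanged
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> buggy = new ArrayList<>(
                Arrays.asList("d0", "d1", "d2", "d3", "d4", "d5"));
        List<String> fixed = new ArrayList<>(buggy);

        int buggyCount = processAll(buggy, true);
        int fixedCount = processAll(fixed, false);

        System.out.println("remove:  processed=" + buggyCount + " size=" + buggy.size());
        System.out.println("nullify: processed=" + fixedCount + " size=" + fixed.size());
    }
}
```

With six entries, the removing variant processes three of them and leaves a list of size three, because every removal shifts the next unprocessed element into a position the loop has already passed. The nullifying variant visits all six and the list keeps its original size.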