germanet
Class GermaNet

java.lang.Object
  extended by germanet.GermaNet

public class GermaNet
extends java.lang.Object

Provides high-level look-up access to GermaNet data. Intended as a read-only resource - no public methods are provided for changing or adding data.

GermaNet is a collection of German lexical units (LexUnits) organized into sets of synonyms (Synsets).
A Synset has a WordClass (adj, nomen, verben) and consists of Lists of LexUnits, Frames, paraphrases (Strings), and Examples. The List of LexUnits is never empty, but any of the others may be.
A LexUnit consists of one or more orthForms (represented as a List of Strings), and has the following attributes: markedStyle (boolean), sense (int), orthVar (boolean), artificial (boolean), properName (boolean), status (String).
A Frame is simply a container for frame data (String).
An Example consists of text (String) and 0 or more Frames.

To construct a GermaNet object, provide the location of the GermaNet data. This can be done with a String representing the path to the directory containing the data, or with a File object:

GermaNet gnet = new GermaNet("/home/myName/germanet/V51_UTF/");
or
File gnetDir = new File("/home/myName/germanet/V51_UTF");
GermaNet gnet = new GermaNet(gnetDir);
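
Both constructors are declared to throw java.io.FileNotFoundException, so it can be convenient to check the data directory before loading. The following is a minimal sketch of such a pre-flight check using only java.io; the class GermaNetDirCheck and the method requireDataDir are hypothetical helpers for illustration and are not part of the germanet package.

```java
import java.io.File;
import java.io.FileNotFoundException;

// Hypothetical helper, not part of the germanet package.
class GermaNetDirCheck {
    /**
     * Returns the given path as a File if it names an existing directory;
     * otherwise throws FileNotFoundException.
     */
    static File requireDataDir(String dirName) throws FileNotFoundException {
        File dir = new File(dirName);
        if (!dir.isDirectory()) {
            throw new FileNotFoundException("Not a directory: " + dir.getAbsolutePath());
        }
        return dir;
    }
}
```

The returned File could then be passed to the File constructor, for example new GermaNet(GermaNetDirCheck.requireDataDir("/home/myName/germanet/V51_UTF")). The constructor may still throw javax.xml.stream.XMLStreamException, which this check does not cover.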

The GermaNet class has methods that return Lists of Synsets or LexUnits, given an orthForm or a WordClass. For example,

List<LexUnit> lexList = gnet.getLexUnits("Bank");
List<LexUnit> verbenLU = gnet.getLexUnits(WordClass.verben);
List<Synset> synList = gnet.getSynsets("gehen");
List<Synset> adjSynsets = gnet.getSynsets(WordClass.adj);

Unless otherwise stated, methods will return an empty List rather than null to indicate that no objects exist for the given request.
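
Because list-returning methods yield an empty List instead of null, callers can use the result directly without a null check. The sketch below illustrates this with a hypothetical helper (EmptyListConvention.countMatches, not part of the germanet package) that accepts any String-to-List lookup; it assumes Java 8 or later for java.util.function.Function.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of the germanet package.
class EmptyListConvention {
    /**
     * Sums the sizes of the result lists returned for each word.
     * Relies on the lookup returning an empty List (never null) on a miss.
     */
    static int countMatches(Function<String, ? extends List<?>> lookup, List<String> words) {
        int total = 0;
        for (String word : words) {
            total += lookup.apply(word).size(); // an empty List contributes 0
        }
        return total;
    }
}
```

With a loaded GermaNet object, the lookup could be supplied as a lambda such as word -> gnet.getLexUnits(word).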

Important Note:
Loading GermaNet requires more memory than the JVM allocates by default. Any application that loads GermaNet will most likely need to be run with JVM options that increase the memory allocated, like this:

java -Xms128m -Xmx128m MyApplication

Depending on the memory needs of the application itself, both values of 128 may need to be raised to 256 or higher.
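
To see which heap limit is actually in effect, an application can query the JVM before loading the data. This is a minimal sketch using the standard Runtime.maxMemory() call; the class name HeapCheck and the 100 MiB threshold are assumptions chosen for illustration. The threshold sits below 128 because the reported value can be somewhat lower than the -Xmx setting.

```java
// Hypothetical helper, not part of the germanet package.
class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in MiB (governed by -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long threshold = 100; // assumption: illustrative value, slightly below 128
        long available = maxHeapMiB();
        if (available < threshold) {
            System.err.println("Heap limit is " + available
                    + " MiB; consider running with a larger -Xmx value.");
        }
    }
}
```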

Version:
1.0
Author:
Marie Hinrichs (meh at sfs.uni-tuebingen.de)

Constructor Summary
GermaNet(java.io.File dir)
          Constructs a new GermaNet object by loading the data files in the specified directory File.
GermaNet(java.lang.String dirName)
          Constructs a new GermaNet object by loading the data files in the specified directory path name.
 
Method Summary
 java.lang.String getDir()
          Get the absolute path name of the directory where the GermaNet data files are stored.
 LexUnit getLexUnitByID(java.lang.String id)
          Return the LexUnit with id, or null if it is not found.
CAUTION: LexUnit ids are not stable between data releases.
 java.util.List<LexUnit> getLexUnits()
          Return a list of all LexUnits.
 java.util.List<LexUnit> getLexUnits(java.lang.String orthForm)
          Return a List of all LexUnits in which orthForm occurs.
 java.util.List<LexUnit> getLexUnits(WordClass wordClass)
          Return a List of all LexUnits in the specified wordClass.
 int getNumLexUnits()
          Return the number of LexUnits contained in GermaNet.
 int getNumSynsets()
          Return the number of Synsets contained in GermaNet.
 Synset getSynsetByID(java.lang.String id)
          Return the Synset with id, or null if it is not found.
CAUTION: Synset ids are not stable between data releases.
 java.util.List<Synset> getSynsets()
          Return a list of all Synsets.
 java.util.List<Synset> getSynsets(java.lang.String orthForm)
          Return a List of all Synsets in which orthForm occurs in one of its LexUnits.
 java.util.List<Synset> getSynsets(WordClass wordClass)
          Return a List of all Synsets in the specified wordClass.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GermaNet

public GermaNet(java.lang.String dirName)
         throws java.io.FileNotFoundException,
                javax.xml.stream.XMLStreamException
Constructs a new GermaNet object by loading the data files in the specified directory path name.

Parameters:
dirName - the directory where the GermaNet data files are located.
Throws:
java.io.FileNotFoundException
javax.xml.stream.XMLStreamException

GermaNet

public GermaNet(java.io.File dir)
         throws java.io.FileNotFoundException,
                javax.xml.stream.XMLStreamException
Constructs a new GermaNet object by loading the data files in the specified directory File.

Parameters:
dir - location of the GermaNet data files
Throws:
java.io.FileNotFoundException
javax.xml.stream.XMLStreamException
Method Detail

getDir

public java.lang.String getDir()
Get the absolute path name of the directory where the GermaNet data files are stored.

Returns:
the absolute pathname of the location of the GermaNet data files.

getSynsets

public java.util.List<Synset> getSynsets()
Return a list of all Synsets.

Returns:
a list of all Synsets.

getSynsets

public java.util.List<Synset> getSynsets(java.lang.String orthForm)
Return a List of all Synsets in which orthForm occurs in one of its LexUnits.

Parameters:
orthForm - the orthForm to search for.
Returns:
a List of all Synsets containing orthForm. If no Synsets were found, this is a List containing no Synsets.

getSynsets

public java.util.List<Synset> getSynsets(WordClass wordClass)
Return a List of all Synsets in the specified wordClass.

Parameters:
wordClass - the wordClass, for example WordClass.nomen
Returns:
a List of all Synsets in the specified wordClass. If no Synsets were found, this is a List containing no Synsets.

getSynsetByID

public Synset getSynsetByID(java.lang.String id)
Return the Synset with id, or null if it is not found.
CAUTION: Synset ids are not stable between data releases.

Parameters:
id - the ID of the Synset to be found.
Returns:
the Synset with id, or null if it is not found.

getLexUnitByID

public LexUnit getLexUnitByID(java.lang.String id)
Return the LexUnit with id, or null if it is not found.
CAUTION: LexUnit ids are not stable between data releases.

Parameters:
id - the ID of the LexUnit to be found.
Returns:
the LexUnit with id, or null if it is not found.
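
Unlike the list-returning methods, getSynsetByID and getLexUnitByID return null when nothing is found, so callers need an explicit check. One possible way to make that check harder to forget is to wrap the call in java.util.Optional. The helper below (ByIdLookup.find) is hypothetical and not part of the germanet package; it assumes Java 8 or later.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not part of the germanet package.
class ByIdLookup {
    /** Wraps a by-ID lookup that may return null in an Optional. */
    static <T> Optional<T> find(Function<String, T> byId, String id) {
        return Optional.ofNullable(byId.apply(id));
    }
}
```

With a loaded GermaNet object this could be called as ByIdLookup.find(gnet::getSynsetByID, id), where id is a String obtained from the same data release, since ids are not stable between releases.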

getNumSynsets

public int getNumSynsets()
Return the number of Synsets contained in GermaNet.

Returns:
the number of Synsets contained in GermaNet.

getNumLexUnits

public int getNumLexUnits()
Return the number of LexUnits contained in GermaNet.

Returns:
the number of LexUnits contained in GermaNet.

getLexUnits

public java.util.List<LexUnit> getLexUnits(java.lang.String orthForm)
Return a List of all LexUnits in which orthForm occurs.

Parameters:
orthForm - the orthForm to search for.
Returns:
a List of all LexUnits containing orthForm. If no LexUnits were found, this is a List containing no LexUnits.

getLexUnits

public java.util.List<LexUnit> getLexUnits(WordClass wordClass)
Return a List of all LexUnits in the specified wordClass.

Parameters:
wordClass - the WordClass, for example WordClass.verben
Returns:
a List of all LexUnits in the specified wordClass. If no LexUnits were found, this is a List containing no LexUnits.

getLexUnits

public java.util.List<LexUnit> getLexUnits()
Return a list of all LexUnits.

Returns:
a list of all LexUnits.