

5.4 Wordnet

WordNet is a lexical database for the English language, created and maintained at the Cognitive Science Laboratory of Princeton University (3). It groups English words into sets of synonyms called synsets, provides short, general definitions, and records the various semantic relations between these synonym sets.

Dico provides a wordnet module for reading WordNet lexical database files. The module relies on libWN, the support library distributed with the WordNet database.

One point deserves attention if you plan to use the WordNet library. Normally, libWN is compiled as a static library with position-dependent code, which makes it difficult (or impossible, on 64-bit architectures) to use from dynamically loaded libraries, such as dicod modules. Therefore, you will first need to rebuild WordNet so that it contains position-independent code. To do so, change to the WordNet source directory and reconfigure it as follows:

  ./configure CFLAGS=-fPIC [other_options]

where other_options stands for any other options you might wish to pass to configure.

If you are going to run this command in a source directory that has been previously configured, it is advisable to run ‘make distclean’ beforehand.

Debian-based systems provide a package ‘wordnet-dev’, which contains a properly built shared library. However, this library is named ‘libwordnet.so’ instead of the expected ‘libWN.so’. On such systems you will have to pass the --with-libWN option to the Dico configure script to inform it about the different name:

  ./configure --with-libWN=wordnet

The argument to this option is the new base name of the libWN library, without the file suffix. The ‘lib’ prefix may optionally be included.

The wordnet module is compiled automatically if the configure script was able to find the library and its header file wn.h. If it was not, use the --with-wordnet configure option to specify the location where these files can be found. For example, if WordNet was installed using the default procedure, then the following option will do the job:

  ./configure --with-wordnet=/usr/local/WordNet-3.0

This command tells Dico to look for WordNet library files in /usr/local/WordNet-3.0/lib and for include files in /usr/local/WordNet-3.0/include.

A compiled module is loaded using the following statement:

load-module wordnet {
    command "wordnet [parameters]";
}

Optional parameters are:

wordnet module parameter: wnhome dir

Base directory for WordNet files. This is the directory where WordNet was installed. For the wordnet module to work, it must contain the dict subdirectory with WordNet dictionary files.

If you installed WordNet to /usr/local/WordNet-3.0, so that running ls on that directory shows you:

$ ls /usr/local/WordNet-3.0/
bin/  dict/  doc/  include/  lib/  man/

then you would use

load-module wordnet {
    command "wordnet wnhome=/usr/local/WordNet-3.0";
}

wordnet module parameter: wnsearchdir dir

Directory in which the WordNet database has been installed.

Normally, these values are set at compile time and you won’t need to override them. The use of these parameters may, however, be necessary if the database was moved or installed in a non-standard location.
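
As a minimal sketch, suppose the dictionary files were moved to /srv/wordnet/dict (a hypothetical path). Assuming that wnsearchdir takes the same name=value form as wnhome in the example above, the override could look like this:

```conf
# Hypothetical location: substitute the directory that actually
# holds the WordNet dictionary files on your system.
load-module wordnet {
    command "wordnet wnsearchdir=/srv/wordnet/dict";
}
```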

One or more WordNet database instances can be defined. All of them share the same underlying WordNet database. The reason for having several instances is that each may have different output options. For example, you may configure one database to return word definitions and another one to act as a thesaurus.

Dico version 2.11.90 defines the following database parameters:

wordnet database parameter: pos value

Select part of speech to be displayed by this database. By default, all parts of speech are displayed. Valid values are:

all

Display all parts of speech. This is the default.

noun

Display only nouns.

verb

Display only verbs.

adj
adjective

Display only adjectives.

adv
adverb

Display only adverbs.

satellite
adjsat

Display only adjective satellites.

wordnet database parameter: merge-defs

When specified, this parameter instructs the WordNet database to merge all definitions with the same part of speech into a single definition, which will be returned in the usual dictionary fashion, e.g.:

sail
n. 1. a large piece of fabric (usually canvas fabric) by
means of which wind is used to propel a sailing vessel 
Synonyms: {canvas}, {canvass}, {sheet}
2. an ocean trip taken for pleasure
Synonyms: {cruise}
3. any structure that resembles a sail
v. 1. traverse or travel on (a body of water); "We sailed
the Atlantic"; "He sailed the Pacific all alone" 
2. move with sweeping, effortless, gliding motions

By default, each definition is returned as a separate entry.

As an example, the following is the database definition the author uses on his server:

database {
    name "WordNet";
    handler "wordnet merge-defs";
    languages-from "en";
    languages-to "en";
    description "WordNet dictionary, version 3.0";
}
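
Several instances can also be declared side by side, as described earlier in this section. The sketch below pairs a merged database covering all parts of speech with a verbs-only one. The pos=verb spelling is an assumption, made by analogy with the name=value form used for wnhome. The database names and descriptions are hypothetical:

```conf
# General-purpose instance: all parts of speech, merged definitions.
database {
    name "wn";
    handler "wordnet merge-defs";
    languages-from "en";
    languages-to "en";
    description "WordNet 3.0, merged definitions";
}

# Second instance, restricted to verbs (assumed syntax: pos=VALUE).
# Without merge-defs, each definition is returned as a separate entry.
database {
    name "wn-verbs";
    handler "wordnet pos=verb";
    languages-from "en";
    languages-to "en";
    description "WordNet 3.0, verbs only";
}
```

Both instances read the same WordNet files; only the selection and presentation of results differ.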

Footnotes

(3)

See http://wordnet.princeton.edu/wordnet/ for detailed information, including download links.

