dicodock

1 Overview

Dicodock is a containerized implementation of the GNU Dico dictionary server with a web interface. The interface can be configured to use either the built-in DICT server or an external one.

The package doesn't include any dictionary databases, but provides a way to install them automatically at startup. Otherwise, the needed set of dictionaries can be installed manually.

This document describes how to set up dicodock services in various configurations.

2 Introduction

The system consists of the following services:

  • dicoweb

    Runs dicoweb server. By default not accessible from outside.

  • memcached

    Auxiliary memcached service for dicoweb. Not accessible from outside.

  • HTTP server

    Serves static resources and acts as a proxy to dicoweb. By default, it is accessible at 127.0.0.1:8080. Set the DICOWEB_PORT variable to override this default.

    Two HTTP services are provided: one uses nginx and the other lighttpd. You need only one of them. Use the --profile option of docker compose to select the one you want.

  • dicod

    Runs dicod server. By default not accessible from outside.

3 Prerequisites

To set up dicodock, you will need the following:

  1. a working Internet connection,
  2. git,
  3. a reasonably new docker,
  4. docker compose.

The presence of GNU make is highly advisable, although not strictly necessary.

4 For the impatient

First, clone the repository:

git clone https://git.gnu.org.ua/dico/dicodock.git

Then, change to the dicodock directory and run

make build

This will pull in the submodules (if necessary) and build service images.

Once done, run

make up

This will bring the system up. Upon startup, the system will download and install freedict dictionaries, the GNU Collaborative International Dictionary of English, and WordNet dictionary. These dictionaries will be installed in subdirectories of the dicod_db directory that will be created in the current working directory (see the description of the DICOD_DB_VOL variable below). Another directory that will be created in CWD is dicod_include (see the description of the DICOD_INCLUDE_VOL variable).

After startup the dictionary server can be accessed at http://localhost:8080. If port 8080 is already in use, you can specify another one in the DICOWEB_PORT variable, e.g.:

make DICOWEB_PORT=8089 up

To see the status of the dicodock services, run

make ps

or

make status

To bring services down, use

make down

See the section entitled Make Commands for a detailed discussion of these and other commands.

5 Quick start

This section discusses a more elaborate setup, in which the dicoweb interface is made available to the outside world. You will need a domain name and a proxy server installed on the host server. For the purpose of this example, we will assume the domain name dicoweb.example.com.

Change to the top-level dicodock directory. If it is a fresh clone of the repository, run

make build

Then, create the file .env with the following content:

DICOD_INITDB=all
DICOWEB_NAME=dicoweb.example.com
DICOWEB_ADMIN="Dictionary administrator <root@example.com>"
DICOWEB_PORT=8080
PROFILE=nginx

The first line instructs the system to install all dictionaries at startup. This is done only once. The second line tells the system the domain name under which it should be available. The third line provides a contact email address for reports. The fourth line sets the port number (or the IP address and port number) on which the server will be available. If port 8080 is already in use on your system, use another number. Finally, the PROFILE setting in the fifth line selects the built-in HTTP server used for serving static content: nginx or lighttpd.

Once done, save the file and run

make up

To make the dictionary accessible from the Internet, you will need to configure a proxy or HTTP server on the host, which will reverse proxy all requests coming to dicoweb.example.com to 127.0.0.1:8080. The simplest configuration for the Apache server is:

<VirtualHost *:80>
    ServerName    dicoweb.example.com
    ProxyPreserveHost On
    ProxyPass     / http://127.0.0.1:8080/
</VirtualHost>

See section 10.1 (Environment file) for a detailed discussion of the .env file and available variables.

6 Dictionaries

Unless the database directory is already populated with some dictionaries, dicodock will try to download and install the ones listed in the DICOD_INITDB variable. Its value is a whitespace-delimited list of dictionary specifications. Available dictionary specifications are:

  • freedict [=pat-list]

    Freedict dictionaries. The optional pat-list is a comma-separated list of glob(7) patterns for dictionary names. If supplied, only those dictionaries that match one of the patterns will be installed. The naming scheme for Freedict dictionaries is L1-L2, where L1 and L2 are three-letter language name abbreviations. For example, eng-fra stands for English-French (Français) and deu-eng stands for German (Deutsch)-English. Thus, to install only dictionaries for translating to English, use

    freedict=*-eng
    

    To install all English dictionaries (translating both ways), use

    freedict=*-eng,eng-*
    
  • gcide [=version]

    The GNU Collaborative International Dictionary of English. Optional version supplies the version number of the dictionary to download. It defaults to the latest available version.

  • wordnet

    WordNet lexical database.

  • all

    A shortcut for freedict gcide wordnet.
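
As an informal illustration of how a pat-list selects dictionaries, the following shell sketch tests a few names against a comma-separated list of glob patterns. The function name matches_patlist is hypothetical and is not part of dicodock; the startup scripts do the actual matching, and their implementation may differ.

```shell
#!/bin/sh
# Illustration only: shows which Freedict names a pat-list would select.

# matches_patlist NAME PATLIST
# Succeeds if NAME matches any comma-separated glob pattern in PATLIST.
matches_patlist() {
    name=$1
    old_ifs=$IFS
    IFS=,
    set -f          # do not expand the patterns against file names
    for pat in $2; do
        # An unquoted variable in a case label is used as a glob pattern.
        case $name in
            $pat) set +f; IFS=$old_ifs; return 0 ;;
        esac
    done
    set +f
    IFS=$old_ifs
    return 1
}

for dict in deu-eng eng-fra bre-fra; do
    if matches_patlist "$dict" '*-eng,eng-*'; then
        echo "$dict: selected"
    else
        echo "$dict: skipped"
    fi
done
```

With the pattern list *-eng,eng-* the loop reports deu-eng and eng-fra as selected and bre-fra as skipped.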

6.1 Manual database installation

The preferred way to install the needed dictionaries is using the DICOD_INITDB variable. If, however, you prefer to install them manually, follow the instructions below:

  • Freedict dictionaries

    These are free dictionaries available for download from https://download.freedict.org/dictionaries. Each dictionary is a tar archive named freedict-L1-L2-V.dictd.tar.xz, where L1 and L2 are language abbreviations and V is its version number.

    Download the needed dictionaries and untar them to the subdirectory freedict under the DICOD_DB_VOL directory. For example, supposing that you use the default DICOD_DB_VOL directory and need to install the Breton-French dictionary:

    mkdir -p dicod_db/freedict
    wget https://download.freedict.org/dictionaries/bre-fra/0.8.3/freedict-bre-fra-0.8.3.dictd.tar.xz
    tar -C dicod_db/freedict -xf freedict-bre-fra-0.8.3.dictd.tar.xz
    
  • GCIDE

    The GNU Collaborative International Dictionary of English is available for download from https://ftp.gnu.org/gnu/gcide. Dicod will look for it in the subdirectory gcide of the database directory. So, to install it, do

    mkdir -p dicod_db/gcide
    wget https://ftp.gnu.org/gnu/gcide/gcide-0.53.tar.xz
    tar -C dicod_db/gcide --strip-components=1 -xf gcide-0.53.tar.xz
    
  • WordNet

    WordNet is a large lexical database of English, available from https://wordnet.princeton.edu/. Dicod will look for the WordNet dictionary database in the subdirectory wordnet of the database directory. You will need only database files, not the entire package. Download them from https://wordnet.princeton.edu/download/current-version and extract to the directory wordnet, e.g.:

    mkdir -p dicod_db/wordnet
    wget https://wordnetcode.princeton.edu/wn3.1.dict.tar.gz
    tar -C dicod_db/wordnet --strip-components=1 -xf wn3.1.dict.tar.gz
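
The Freedict steps above can be wrapped in a small helper. This is only a sketch: the function names freedict_url and install_freedict are hypothetical, the URL layout is inferred from the single Breton-French example above, and the version number of each dictionary has to be looked up on the download site.

```shell
#!/bin/sh
# freedict_url PAIR VERSION
# Prints the download URL, assuming the layout seen in the bre-fra example.
freedict_url() {
    printf 'https://download.freedict.org/dictionaries/%s/%s/freedict-%s-%s.dictd.tar.xz\n' \
        "$1" "$2" "$1" "$2"
}

# install_freedict PAIR VERSION [DBDIR]
# Downloads one dictionary and unpacks it under DBDIR/freedict.
# DBDIR defaults to dicod_db, the default DICOD_DB_VOL directory.
install_freedict() {
    dbdir=${3:-dicod_db}/freedict
    archive=freedict-$1-$2.dictd.tar.xz
    mkdir -p "$dbdir" &&
    wget -O "$archive" "$(freedict_url "$1" "$2")" &&
    tar -C "$dbdir" -xf "$archive"
}

# Usage:
#   install_freedict bre-fra 0.8.3
```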
    

If you plan to use other databases, write a configuration file that loads the necessary modules and declares your databases, and place it in the custom configuration directory indicated by the DICOD_INCLUDE_VOL variable. Make sure the file name ends with .conf.

7 Make Commands

The provided makefile is designed to facilitate the use of the system. The following commands are available:

  • make bootstrap

    Bootstrap the system. This is normally done once, after cloning dicodock from the repository.

  • make config

    Create a docker compose configuration file and dump it on the standard output.

  • make build

    Build all service images.

  • make up

    Bring the system up. If the file .env exists in the top-level directory, it will be used. If not, the command acts as if the following file were used:

    DICOD_INITDB=all
    DICOWEB_NAME=localhost
    PROFILE=nginx
    
  • make down

    Bring the system down. This stops the services and removes the containers.

  • make ps

    Show running services.

  • make restart

    Restart the system.

These command verbs map to the corresponding docker compose commands. You can supply additional arguments and/or options for each particular command using the verb_FLAGS variable, where the word verb stands for the command verb. For example, to supply options to the build command, use the build_FLAGS variable. Note that the up_FLAGS variable is initialized with -d by default.

Additionally, the value of the F variable is passed to all commands. It is convenient to use as a shortcut from the command line. For example, to get logs in follow mode, use

make logs F=-f

Similarly, to follow logs of the dicod service, do:

make logs F='-f dicod'

The following commands, where SRV stands for the service name, apply the command to that particular service:

  • start-SRV

    Start the service SRV.

  • stop-SRV

    Stop the service.

  • restart-SRV

    Restart the service.

  • ps-SRV

    Show ps output for this service.

  • logs-SRV

    Show logs for service SRV.

Everything said above about the verb_FLAGS and F variables applies to these commands too, e.g.:

make logs-dicod F=-f

8 Make customization

If the file config.mk is present in the current working directory, it will be included at the beginning of the main makefile. It is intended to provide site-specific settings, such as command-specific flags (see the description of verb_FLAGS, above), terminal settings (see the ANSI variable) and the like.
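
For instance, a config.mk could be created as sketched below. The particular values are assumptions chosen for illustration: --pull is a standard docker compose build option, and never is one of the values accepted by the --ansi option. The sketch also assumes that the main makefile does not unconditionally reassign these variables after including config.mk.

```shell
#!/bin/sh
# Create a site-specific config.mk in the current directory.
cat > config.mk <<'EOF'
# Always try to pull newer base images when building.
build_FLAGS = --pull
# Disable ANSI control characters in docker compose output.
ANSI = never
EOF
```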

9 Using without GNU Make

The following instructions apply if you don't have GNU Make installed.

After cloning the repository, run

git submodule update --init --recursive

in order to pull all dependencies.

Once done, create the environment file .env with the necessary settings. To start the system, do

docker compose --profile=nginx up -d

Use --profile=nginx to deploy nginx as the HTTP server, or --profile=lighttpd to deploy lighttpd. If you choose to have the HTTP server on the host proxy to dicoweb directly (see section 11.4), omit the --profile option altogether.

Further on, use docker compose subcommands to manage the system. For example, to list running services:

docker compose --profile=nginx ps

To show logs of the dicoweb service in follow mode:

docker compose logs -f dicoweb

To bring all services down:

docker compose --profile=nginx down

and so on.
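
The manual steps above might be collected into a short script such as the following sketch. The helper write_env is hypothetical; the variable names and sample values are the ones used in the Quick start section, and PROFILE is left out on the assumption that the profile is given on the docker compose command line instead.

```shell
#!/bin/sh
# write_env FILE NAME ADMIN [PORT]
# Writes a minimal environment file; PORT defaults to 8080.
write_env() {
    cat > "$1" <<EOF
DICOD_INITDB=all
DICOWEB_NAME=$2
DICOWEB_ADMIN="$3"
DICOWEB_PORT=${4:-8080}
EOF
}

# Usage, from the top-level dicodock directory:
#   git submodule update --init --recursive
#   write_env .env dicoweb.example.com 'Dictionary administrator <root@example.com>'
#   docker compose --profile=nginx up -d
```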

10 Configuration

The main docker-compose.yml file was designed in such a way as to allow for a wide variety of possible configurations. Most configuration settings are supplied via environment variables in the .env file. To modify service configurations in a more essential way, use the docker-compose.override.yml file. The discussion below addresses both methods.

10.1 Environment file

The file .env in docker directory contains settings that affect basic functionality. These are:

  • ANSI

    When set, the value of this variable is passed to docker compose with the --ansi option. In addition, if it is set to never, the DOCKER_BUILDKIT environment variable is set to 0. Use this when debugging your configuration.

  • DICOD_INITDB

    A list of dictionary databases to download and install at startup. This variable is consulted only if no dictionaries are installed in the database directory. Its value is a whitespace-delimited list of dictionary classes or the word all standing for all available dictionaries. For the list of valid dictionary classes, see the Dictionaries section, above.

  • DICOWEB_HOST

    Canonical hostname of the dicoweb server. This variable must be defined.

  • DICOWEB_PORT

    Port on which the dicoweb HTTP server will be available on the host machine. Defaults to 127.0.0.1:8080.

  • DICOWEB_DEBUG

    Set this variable to any non-empty value to enable debugging of the dicoweb application. Don't use it on production servers.

  • DICOWEB_ADMIN

    Email address of the server administrator. Allowed formats:

    Ty Coon <coon@example.org>
    coon@example.org
    

    Default is: root@localhost. You are advised to always define this variable to an existing email address.

  • DICOWEB_SECRET_KEY

    Secret key for the dicoweb application. Default: automatically generated.

  • DICT_SERVER

    Name (and, optionally, port number) of the dict server to use. This variable is used by the default settings_docker.py configuration file. It defaults to the internal hostname of the service running dicod. Use this variable if: (1) you use the default settings_docker.py (or a customized version thereof, which retains the default DICT_SERVERS setting), and (2) you want to use an external dictionary server.

  • DICOD_INCLUDE_VOL

    Name of the host directory or docker volume to mount to /etc/dicod/include directory in the dicod container. This directory is scanned for dico custom configuration files (dictionary database definitions and the like).

    Default is ./dicod_include.

  • DICOD_DB_VOL

    Name of the host directory or docker volume to mount to /var/dicodb directory in the dicod container. This directory is supposed to keep dictionary databases.

    Default is ./dicod_db.

10.2 Overrides

The file docker/docker-compose.override.yml extends the default docker-compose.yml and thus provides a mechanism for tuning all aspects of the system, including mounting host directories to containers.

11 Common configurations

This section discusses some commonly used configurations and provides working recipes for them.

11.1 Store dictionaries in a docker volume

Assume the volume is named dicod_db.

Create a docker-compose.override.yml file with the following content:

volumes:
  dicod_db:

Add this definition to your .env file:

DICOD_DB_VOL=dicod_db

To keep the dicod include directory in a volume, follow the steps above, but use the DICOD_INCLUDE_VOL variable instead.

11.2 Make dicod server world-visible

Let's assume you want dicod to be accessible on the standard port 2628. To do so, create the file docker-compose.override.yml with the following content:

services:
  dicod:
    ports:
      - 2628:2628
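
Once the port is published, the server can be checked with any DICT client. The sketch below builds a minimal DICT (RFC 2229) session by hand. The helper name dict_request is hypothetical, and the usage line assumes that netcat (nc) is installed and that dicod answers on localhost:2628.

```shell
#!/bin/sh
# dict_request WORD [DATABASE]
# Prints a minimal DICT session: one DEFINE command followed by QUIT.
# The database name "*" asks the server to search all databases.
dict_request() {
    printf 'DEFINE %s "%s"\r\nQUIT\r\n' "${2:-*}" "$1"
}

# Usage:
#   dict_request hello | nc localhost 2628
```

A reply that begins with a 220 banner line shows that the server is reachable. The definitions returned, if any, depend on the installed dictionaries.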

11.3 Log to syslog

By default, all diagnostics go to container logs, which is not well suited for a production environment. To direct them to syslog instead, set the environment variable PIES_SYSLOG_SERVER to the IP address and UDP port number of your syslog server, e.g. in your .env file:

PIES_SYSLOG_SERVER=172.31.255.252:514

Note that the port number is given explicitly because (at least at the time of this writing) the GNU pies container lacks the /etc/services file, which pies uses to deduce port numbers.

The above will make all daemons in the containers (dicod, uwsgi, lighttpd, etc.) send their diagnostics to this syslog server over UDP.
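
Before starting the services, it may be worth checking from the host that the syslog server accepts UDP messages at that address. The following sketch assumes the logger utility from util-linux; the helper names are hypothetical.

```shell
#!/bin/sh
# Split a HOST:PORT pair, as used in PIES_SYSLOG_SERVER, into its parts.
syslog_host() { printf '%s\n' "${1%:*}"; }
syslog_port() { printf '%s\n' "${1##*:}"; }

# send_syslog_test HOST:PORT
# Sends one test message over UDP using logger(1).
send_syslog_test() {
    logger --udp --server "$(syslog_host "$1")" --port "$(syslog_port "$1")" \
        --tag dicodock-test 'syslog test message'
}

# Usage:
#   send_syslog_test 172.31.255.252:514
```

If the message shows up in the syslog server's log, the address is usable as the value of PIES_SYSLOG_SERVER.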

To additionally direct all service logging output to syslog, use the following technique. Create the docker-compose.override.yml file with the following:

x-dicoweb-logging:
  &dicoweb-logging
  driver: "syslog"
  options:
    syslog-address: udp://$PIES_SYSLOG_SERVER
    syslog-facility: local0
    tag: '{{if (index .ContainerLabels "com.docker.compose.project")}}{{index .ContainerLabels "com.docker.compose.project"}}/{{end}}{{if (index .ContainerLabels "com.docker.compose.service")}}{{index .ContainerLabels "com.docker.compose.service"}}/{{end}}{{.Name}}/{{.ID}}'

services:
  dicoweb:
    logging: *dicoweb-logging

Add the logging: *dicoweb-logging stanza to each service definition in the services: section. This ensures that everything written to standard output or standard error in each container is sent to syslog with a tag composed of the project name, service name, container name, and container ID, delimited by slashes. If you run rsyslog, you can use that tag to route messages to the right place.

Please note that at the time of this writing the nginx and memcached services do not support syslog logging directly, and the above technique should be used to log their diagnostics to syslog.

11.4 Proxy to dicoweb directly

If your host runs an HTTP server (as opposed to a proxy), you might wish it to handle incoming dicoweb requests and get rid of the HTTP service. To implement such a setup, your HTTP server should serve static files itself and forward the rest of the requests to the dicoweb container.

Dicoweb static assets reside in the docker volumes dicoweb_static and dicoweb_run. To make sure your server is able to access them, you need to mount these volumes to suitable places in your file system hierarchy (since the default permissions of /var/lib/docker/volumes, where they are mounted by default, don't allow access to them). You will need the local-persist docker volume plugin to do so.

First, create two directories for mapping to dicoweb_static and dicoweb_run volumes. Let's assume they are /mnt/dicoweb_static and /mnt/dicoweb_run:

mkdir /mnt/dicoweb_static /mnt/dicoweb_run

Then, create the file docker-compose.override.yml with the following content:

services:
  dicoweb:
    ports:
      - "127.0.0.1:${DICOWEB_PORT:-8080}:80"
volumes:
  dicoweb_static:
    driver: local-persist
    driver_opts:
      mountpoint: "/mnt/dicoweb_static"
  dicoweb_run:
    driver: local-persist
    driver_opts:
      mountpoint: "/mnt/dicoweb_run"

If using GNU Make, ensure that your .env file does not define the PROFILE variable and start up the system. Otherwise, if using docker compose directly, start up the system as follows:

docker compose up -d

If you run Apache web server, modify your virtual host configuration as follows:

<VirtualHost *:80>
    ServerName    dicoweb.example.com
    ProxyPreserveHost On

    <Location />
	ProxyPass       http://127.0.0.1:8080/
    </Location>

    <Location /static>
	ProxyPass !
    </Location>
    Alias /static /mnt/dicoweb_static

    <Location /favicon.ico>
	ProxyPass !
    </Location>
    <Location /robots.txt>
	ProxyPass !
    </Location>
    Alias /favicon.ico /mnt/dicoweb_static/images/gnu-head-mini.png
    Alias /robots.txt /mnt/dicoweb_run/static/robots.txt

    <Directory /mnt/dicoweb_static>
	AllowOverride All
	Options None
	Require all granted
    </Directory>
    <Directory /mnt/dicoweb_run>
	AllowOverride All
	Options None
	Require all granted
    </Directory>
</VirtualHost>

11.5 Use external dict server

To use an external dict server you need to supply its address to dicoweb and exclude the dicod container from the startup sequence.

The first part is achieved by setting the variable DICT_SERVER in your .env file, e.g.:

DICT_SERVER=dico.gnu.org.ua

The second part is achieved by modifying the dicod service in your docker-compose.override.yml as shown below:

services:
  dicod:
    profiles:
      - disabled

The actual profile name doesn't matter. What's important is that you are never going to use it in the --profile option when running docker compose.
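
To confirm that the override took effect, you can list the services that docker compose considers active. The sketch below relies on the assumption that docker compose config --services prints only the services enabled by the selected profiles, one name per line; the helper service_listed is hypothetical.

```shell
#!/bin/sh
# service_listed NAME
# Reads service names from standard input, one per line, and succeeds
# if NAME is among them (whole-line, literal match).
service_listed() {
    grep -qxF -e "$1"
}

# Usage (requires docker compose and the project files):
#   if docker compose --profile=nginx config --services | service_listed dicod; then
#       echo "dicod is still enabled"
#   else
#       echo "dicod is disabled"
#   fi
```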

12 Copyright

Copyright (C) 2023-2024 Sergey Poznyakoff

Permission is granted to anyone to make or distribute verbatim copies of this document as received, in any medium, provided that the copyright notice and this permission notice are preserved, thus giving the recipient permission to redistribute in turn.

Permission is granted to distribute modified versions of this document, or of portions of it, under the above conditions, provided also that they carry prominent notices stating who last changed them.

Author: Sergey Poznyakoff

Created: 2024-10-19 Sat 16:45
