dicodock
1 Overview
Dicodock is a containerized implementation of the GNU dico dictionary server with a web interface. The interface can be configured to use either the built-in DICT server or an external one.
The package doesn't include any dictionary databases, but provides a way to install them automatically at startup. Otherwise, the needed set of dictionaries can be installed manually.
This document describes how to set up dicodock services in various configurations.
2 Introduction
The system consists of the following services:
dicoweb
Runs the dicoweb server. By default not accessible from outside.
memcached
Auxiliary memcached service for dicoweb. Not accessible from outside.
HTTP server
Serves static resources and acts as a proxy to dicoweb. By default, accessible at 127.0.0.1:8080. Use environment variables to override this default. Two http services are provided: one uses nginx and the other lighttpd. You need only one of them. Use the --profile option to docker compose to select the one you chose.
dicod
Runs the dicod server. By default not accessible from outside.
3 Prerequisites
To set up dicodock, you will need the following:
- a working internet connection,
- git,
- a reasonably new docker,
- docker compose.
The presence of GNU make is highly advisable, although not strictly necessary.
4 For the impatient
First, clone the repository:
git clone https://git.gnu.org.ua/dico/dicodock.git
Then, change to the dicodock directory and run
make build
This will pull in the submodules (if necessary) and build service images.
Once done, run
make up
This will bring the system up. Upon startup, the system will download
and install the freedict dictionaries, the GNU Collaborative
International Dictionary of English, and the WordNet dictionary. These
dictionaries will be installed in subdirectories of the dicod_db
directory, which will be created in the current working directory (see
the description of the DICOD_DB_VOL variable below). Another directory
that will be created in the current working directory is dicod_include
(see the description of the DICOD_INCLUDE_VOL variable).
After startup the dictionary web interface can be accessed at
http://localhost:8080. If port 8080 is already in use, you can specify
another one in the DICOWEB_PORT variable, e.g.:
make DICOWEB_PORT=8089 up
To see the status of the dicodock services, run
make ps
or
make status
To bring services down, use
make down
See the section entitled Make Commands for a detailed discussion of these and other commands.
5 Quick start
This section discusses a more elaborate setup, in which the dicoweb
interface is made available to the outside world. You will need a
domain name and a proxy server installed on the host server. For the
purpose of this example, we will assume the domain name
dicoweb.example.com.
Change to the top-level dicodock
directory. If it is a fresh clone of
the repository, run
make build
Then, create the file .env
with the following content:
DICOD_INITDB=all
DICOWEB_NAME=dicoweb.example.com
DICOWEB_ADMIN="Dictionary administrator <root@example.com>"
DICOWEB_PORT=8080
PROFILE=nginx
The first line instructs the system to install all dictionaries at
startup. This will be done only once. The second line informs the
system about the domain name it should make itself available at. Line
three provides a contact email for reports. Line four sets the port
number (or the IP address and port number) on which the server will be
available. If port 8080 is already in use on your system, use another
number. Finally, the PROFILE setting in line five specifies the
built-in http server to use for serving static content: nginx or
lighttpd.
Once done, save the file and run
make up
To make the dictionary accessible from the Internet, you will need to
configure a proxy or http server on the host server, which will reverse
proxy all requests coming to dicoweb.example.com
to 127.0.0.1:8080.
The simplest configuration for Apache server will be:
<VirtualHost *:80>
    ServerName dicoweb.example.com
    ProxyPreserveHost On
    ProxyPass / http://127.0.0.1:8080/
</VirtualHost>
See section 10.1, Environment file, for a detailed discussion of the
.env file and available variables.
6 Dictionaries
Unless the database directory is already populated with some
dictionaries, dicodock
will try to download and install the ones
listed in the DICOD_INITDB variable. Its value
is a whitespace-delimited list of dictionary specifications.
Available dictionary specifications are:
freedict[=pat-list]
Freedict dictionaries. Optional pat-list is a comma-separated list of dictionary name glob(7) patterns. If supplied, only those dictionaries that match one of the patterns will be installed. The naming scheme for Freedict dictionaries is L1-L2, where L1 and L2 are three-letter language name abbreviations. For example, eng-fra stands for English-French (Français) and deu-eng stands for German (Deutsch)-English. Thus, to install only dictionaries for translating to English, use
freedict=*-eng
To install all English dictionaries (translating both ways), use
freedict=*-eng,eng-*
gcide[=version]
The GNU Collaborative International Dictionary of English. Optional version supplies the version number of the dictionary to download. It defaults to the latest available version.
wordnet
WordNet lexical database.
all
A shortcut for freedict gcide wordnet.
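The pattern list semantics described for freedict can be illustrated with a small shell sketch. It only demonstrates how glob patterns select names of the form L1-L2; the helper freedict_match is hypothetical and is not the code dicodock itself uses.

```shell
# Hypothetical helper: succeed if dictionary NAME matches any glob pattern
# in the comma-separated PAT_LIST (the part after "freedict=").
freedict_match() {
    name=$1
    pat_list=$2
    old_ifs=$IFS
    IFS=,
    set -f                      # do not expand the patterns against files
    for pat in $pat_list; do
        case $name in
            $pat) IFS=$old_ifs; set +f; return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Show which of three sample names the list "*-eng,eng-*" would select.
for dict in eng-fra deu-eng fra-deu; do
    if freedict_match "$dict" '*-eng,eng-*'; then
        echo "$dict: selected"
    else
        echo "$dict: skipped"
    fi
done
```

With this list, eng-fra and deu-eng are selected and fra-deu is skipped, since neither side of the latter is eng.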
6.1 Manual database installation
The preferred way to install the needed dictionaries is using the
DICOD_INITDB
variable. If, however, you prefer to install them
manually, follow the instructions below:
Freedict dictionaries
These are free dictionaries available for download from https://download.freedict.org/dictionaries. Each dictionary is a tar archive named freedict-L1-L2-V.dictd.tar.xz, where L1 and L2 are language abbreviations and V is the version number. Download the needed dictionaries and untar them to the subdirectory freedict under the DICOD_DB_VOL directory. For example, supposing that you use the default DICOD_DB_VOL directory and need to install the Breton-French dictionary:
mkdir -p dicod_db/freedict
wget https://download.freedict.org/dictionaries/bre-fra/0.8.3/freedict-bre-fra-0.8.3.dictd.tar.xz
tar -C dicod_db/freedict -xf freedict-bre-fra-0.8.3.dictd.tar.xz
GCIDE
The GNU Collaborative International Dictionary of English is available for download from https://ftp.gnu.org/gnu/gcide. Dicod will look for it in the subdirectory gcide of the database directory. So, to install it, do
mkdir -p dicod_db/gcide
wget https://ftp.gnu.org/gnu/gcide/gcide-0.53.tar.xz
tar -C dicod_db/gcide --strip-components=1 -xf gcide-0.53.tar.xz
WordNet
WordNet is a large lexical database of English, available from https://wordnet.princeton.edu/. Dicod will look for the WordNet dictionary database in the subdirectory wordnet of the database directory. You will need only the database files, not the entire package. Download them from https://wordnet.princeton.edu/download/current-version and extract them to the directory wordnet, e.g.:
mkdir -p dicod_db/wordnet
wget https://wordnetcode.princeton.edu/wn3.1.dict.tar.gz
tar -C dicod_db/wordnet --strip-components=1 -xf wn3.1.dict.tar.gz
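When several Freedict dictionaries are needed, the download steps can be wrapped in a small script. The following is a minimal sketch that assumes the URL layout seen in the Breton-French example holds for other language pairs as well. The helpers freedict_url and freedict_install are hypothetical names, and the database directory is assumed to be the default dicod_db on the host.

```shell
# Hypothetical helper: print the download URL for a Freedict dictionary,
# following the layout of the Breton-French example.
freedict_url() {
    pair=$1      # language pair, e.g. bre-fra
    version=$2   # dictionary version, e.g. 0.8.3
    echo "https://download.freedict.org/dictionaries/$pair/$version/freedict-$pair-$version.dictd.tar.xz"
}

# Hypothetical helper: download one dictionary and unpack it under
# DB_DIR/freedict (DB_DIR defaults to dicod_db).
freedict_install() {
    db_dir=${3:-dicod_db}
    url=$(freedict_url "$1" "$2")
    mkdir -p "$db_dir/freedict" &&
    wget "$url" &&
    tar -C "$db_dir/freedict" -xf "${url##*/}"
}

# Print the URL that would be fetched for Breton-French 0.8.3.
freedict_url bre-fra 0.8.3
```

To actually install a dictionary, call for example freedict_install bre-fra 0.8.3. Check the download site for the versions that are currently published.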
If you plan to use other databases, write a configuration file that
loads the necessary modules and declares your databases, and place it
in the custom configuration directory, indicated by the
DICOD_INCLUDE_VOL variable. Make sure the file name ends with .conf.
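As an illustration only, the sketch below creates such a file for a hypothetical dictd-format database named mydict, whose files are assumed to have been copied to dicod_db/mydict on the host. It assumes the default host directories (dicod_include and dicod_db) and uses the dictorg module of GNU dico with its dbdir and database parameters. Consult the GNU dico manual for the exact syntax, and check whether the stock dicodock configuration already loads that module before using anything like this.

```shell
# Sketch: declare a hypothetical database "mydict" in the custom
# configuration directory (default: ./dicod_include).
mkdir -p dicod_include
cat > dicod_include/mydict.conf <<'EOF'
# /var/dicodb is where the database directory is mounted in the container.
load-module dictorg {
    command "dictorg dbdir=/var/dicodb/mydict";
}
database {
    name "mydict";
    handler "dictorg database=mydict";
    description "My custom dictionary";
}
EOF
```

The file name ends with .conf, as required above. The names mydict and mydict.conf are placeholders.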
7 Make Commands
The provided makefile is designed to facilitate the use of the system. The following commands are available:
make bootstrap
Bootstrap the system. This is normally done once, after cloning dicodock from the repository.
make config
Create a docker compose configuration file and dump it on the standard output.
make build
Build all containers.
make up
Bring the system up. If the file .env exists in the top-level directory, it will be used. If not, the command acts as if the following file were used:
DICOD_INITDB=all
DICOWEB_NAME=localhost
PROFILE=nginx
make down
Bring the system down. This stops the services and removes the containers.
make ps
Show running services.
make restart
Restart the system.
These command verbs map to the corresponding docker compose commands.
You can supply additional arguments and/or options for each particular
command using the verb_FLAGS variable, where the word verb stands for
the command verb. For example, to supply options to the build command,
use the build_FLAGS variable. Note that the up_FLAGS variable is
initialized with -d by default.
Additionally, the value of F
variable is passed to all commands. It
is convenient to use as a shortcut from the command line. For
example, to get logs in follow mode, use
make logs F=-f
Similarly, to follow logs of the dicod
service, do:
make logs F='-f dicod'
The following commands, where SRV stands for the service name, apply the command to that particular service:
start-SRV
Start the service SRV.
stop-SRV
Stop the service SRV.
restart-SRV
Restart the service SRV.
ps-SRV
Show ps output for the service SRV.
logs-SRV
Show logs for the service SRV.
Everything said above about the verb_FLAGS and F variables applies to
these commands too, e.g.:
make logs-dicod F=-f
8 Make customization
If the file config.mk is present in the current working directory,
it will be included at the beginning of the main makefile. It is
intended to provide site-specific settings, such as command-specific
flags (see the description of verb_FLAGS, above), terminal settings
(see the ANSI variable) and the like.
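For instance, a config.mk could be created as in the sketch below. The values are assumptions chosen for illustration: --pull is an option of docker compose build, and never is one of the values accepted by the --ansi option.

```shell
# Create a site-specific config.mk in the top-level dicodock directory.
cat > config.mk <<'EOF'
# Always try to pull newer base images when building.
build_FLAGS = --pull
# Disable ANSI control characters in docker compose output.
ANSI = never
EOF
```

Settings placed here apply to every make invocation, so they need not be repeated on the command line.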
9 Using without GNU Make
The following instructions apply if you don't have GNU Make installed.
After cloning the repository, run
git submodule update --init --recursive
in order to pull all dependencies.
Once done, create the environment file .env
with the necessary settings. To start the system, do
docker compose --profile=nginx up -d
Use --profile=nginx to deploy nginx as the http server, or
--profile=lighttpd to deploy lighttpd. If you chose to use the http
server of the host as the proxy (see the section Proxy to dicoweb
directly), omit the --profile option altogether.
Further on, use docker compose
subcommands to manage the system.
For example, to list running services:
docker compose --profile=nginx ps
To show logs of the dicoweb
service in follow mode:
docker compose logs -f dicoweb
To bring all services down:
docker compose --profile=nginx down
and so on.
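Since the --profile option has to be repeated on most invocations, a small wrapper function may be convenient. The following is a minimal sketch. The function name dc and the DRY_RUN variable are hypothetical, and the profile is taken from the PROFILE environment variable of the shell, not read from .env.

```shell
# Hypothetical wrapper: run docker compose, adding --profile when the
# PROFILE environment variable is set. With DRY_RUN set to a non-empty
# value, print the command instead of running it.
dc() {
    if [ -n "${PROFILE:-}" ]; then
        set -- --profile="$PROFILE" "$@"
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        echo "docker compose $*"
    else
        docker compose "$@"
    fi
}
```

After export PROFILE=nginx, running dc up -d starts the system and dc ps lists the services. Leave PROFILE unset when no built-in http service is used.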
10 Configuration
The main docker-compose.yml
file was designed in such a way as to allow for
a wide variety of possible configurations. Most configuration
settings are supplied via environment variables in the .env
file.
To modify service configurations in a more essential way, use the
docker-compose.override.yml
file. The discussion below addresses
both methods.
10.1 Environment file
The file .env
in docker directory contains settings that affect basic
functionality. These are:
ANSI
When set, the value of this variable is passed to docker compose with the --ansi option. In addition, if it is set to never, the DOCKER_BUILDKIT environment variable is set to 0. Use this when debugging your configuration.
DICOD_INITDB
A list of dictionary databases to download and install at startup. This variable is consulted only if no dictionaries are installed in the database directory. Its value is a whitespace-delimited list of dictionary classes or the word all, standing for all available dictionaries. For the list of valid dictionary classes, see the Dictionaries section, above.
DICOWEB_HOST
Canonical hostname of the dicoweb server. This variable must be defined.
DICOWEB_PORT
Port on which the dicoweb HTTP server will be available on the host machine. Defaults to 127.0.0.1:8080.
DICOWEB_DEBUG
Set this variable to any non-empty value to enable debugging of the dicoweb application. Don't use it on production servers.
DICOWEB_ADMIN
Email address of the server administrator. Allowed formats:
Ty Coon <coon@example.org>
coon@example.org
Default is root@localhost. You are advised to always set this variable to an existing email address.
DICOWEB_SECRET_KEY
Secret key for the dicoweb application. Default: automatically generated.
DICT_SERVER
Name (and optionally, port number) of the dict server to use. This variable is used by the default settings_docker.py configuration file. It defaults to the internal hostname of the service running dicod. Use this variable if: (1) you use the default settings_docker.py (or a customized version thereof, which retains the default DICT_SERVERS setting), and (2) you want to use an external dictionary server.
DICOD_INCLUDE_VOL
Name of the host directory or docker volume to mount to the /etc/dicod/include directory in the dicod container. This directory is scanned for dico custom configuration files (dictionary database definitions and the like). Default is ./dicod_include.
DICOD_DB_VOL
Name of the host directory or docker volume to mount to the /var/dicodb directory in the dicod container. This directory is supposed to keep dictionary databases. Default is ./dicod_db.
10.2 Overrides
The file docker/docker-compose.override.yml extends the default
docker-compose.yml and thus provides a mechanism for tuning all
aspects of the system, including mounting host directories to
containers.
11 Common configurations
This section discusses some commonly used configurations and provides working recipes for them.
11.1 Store dictionaries in a docker volume
Assume the volume is named dicod_db.
Create a docker-compose.override.yml
file with the following content:
volumes:
  dicod_db:
Add this definition to your .env
file:
DICOD_DB_VOL=dicod_db
To keep dicod
include directory in a volume, follow the steps above,
but use the DICOD_INCLUDE_VOL
variable instead.
11.2 Make dicod server world-visible
Let's assume you want dicod
to be accessible on the standard port 2628.
To do so, create the file docker-compose.override.yml
with the following
content:
services:
  dicod:
    ports:
      - 2628:2628
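Once the port is published, the server can be probed with a few DICT protocol (RFC 2229) commands. The sketch below only builds the session text. The helper name dict_session and the client identification string are hypothetical, and sending the text relies on a tool such as nc being available on the host.

```shell
# Hypothetical helper: print a short DICT (RFC 2229) session that lists
# the databases offered by a server and then disconnects.
dict_session() {
    printf 'CLIENT dicodock-check\r\n'
    printf 'SHOW DB\r\n'
    printf 'QUIT\r\n'
}

# Usage (assumes the nc utility is installed on the host):
#   dict_session | nc localhost 2628
```

If the dict client is installed, dict -h localhost -p 2628 -D gives a similar listing of the available databases.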
11.3 Log to syslog
By default all diagnostics go to container logs, which is not well
suited for a production environment. To direct them to syslog instead,
set the environment variable PIES_SYSLOG_SERVER to the IP address and
UDP port number of your syslog server, e.g. in your .env file:
PIES_SYSLOG_SERVER=172.31.255.252:514
Note that specifying the port number explicitly is suggested because
(at least at the time of this writing) the GNU pies container lacks
the /etc/services file, which pies uses to deduce port numbers.
The above will make all daemons in the containers (dicod, uwsgi,
lighttpd, etc.) send their diagnostics to this syslog server over UDP.
To additionally direct all service logging output to syslog, use the
following technique. Create the docker-compose.override.yml file with
the following:
x-dicoweb-logging: &dicoweb-logging
  driver: "syslog"
  options:
    syslog-address: udp://$PIES_SYSLOG_SERVER
    syslog-facility: local0
    tag: '{{if (index .ContainerLabels "com.docker.compose.project")}}{{index .ContainerLabels "com.docker.compose.project"}}/{{end}}{{if (index .ContainerLabels "com.docker.compose.service")}}{{index .ContainerLabels "com.docker.compose.service"}}/{{end}}{{.Name}}/{{.ID}}'
services:
  dicoweb:
    logging: *dicoweb-logging
Add the logging: *dicoweb-logging stanza to each service definition in
the services: section. This will make sure that everything that goes
to standard output or standard error in each container will be sent to
syslog with a tag composed of the project name, service name,
container name, and container ID, delimited by slashes. If you run
rsyslog, you can use that tag to route messages to the right place.
Please note that at the time of this writing the nginx and memcached
services do not support syslog logging directly, and the above
technique should be used to log their diagnostics to syslog.
11.4 Proxy to dicoweb directly
If your host runs an HTTP server (as opposed to a proxy), you might
wish it to handle incoming dicoweb requests itself and get rid of the
containerized HTTP service. To implement such a setup, your HTTP
server should serve static files itself and forward the rest of the
requests to the dicoweb container.
Dicoweb static assets reside in the docker volumes dicoweb_static and
dicoweb_run. To make sure your server is able to access them, you need
to mount these volumes to suitable places in your file system
hierarchy (since the default permissions of /var/lib/docker/volumes,
where they are mounted by default, don't allow access to them). You
will need the local-persist docker storage plugin to do so.
First, create two directories for mapping to the dicoweb_static and
dicoweb_run volumes. Let's assume they are /mnt/dicoweb_static and
/mnt/dicoweb_run:
mkdir /mnt/dicoweb_static /mnt/dicoweb_run
Then, create the file docker-compose.override.yml
with the following
content:
services:
  dicoweb:
    ports:
      - "127.0.0.1:${DICOWEB_PORT:-8080}:80"
volumes:
  dicoweb_static:
    driver: local-persist
    driver_opts:
      mountpoint: "/mnt/dicoweb_static"
  dicoweb_run:
    driver: local-persist
    driver_opts:
      mountpoint: "/mnt/dicoweb_run"
If using GNU Make, ensure that your .env file does not define the
PROFILE variable and start up the system. Otherwise, if using
docker compose directly, start up the system as follows:
docker compose up -d
If you run the Apache web server, modify your virtual host
configuration as follows:
<VirtualHost *:80>
    ServerName dicoweb.example.com
    ProxyPreserveHost On
    <Location />
        ProxyPass http://127.0.0.1:8080/
    </Location>
    <Location /static>
        ProxyPass !
    </Location>
    Alias /static /mnt/dicoweb_static
    <Location /favicon.ico>
        ProxyPass !
    </Location>
    <Location /robots.txt>
        ProxyPass !
    </Location>
    Alias /favicon.ico /mnt/dicoweb_static/images/gnu-head-mini.png
    Alias /robots.txt /mnt/dicoweb_run/static/robots.txt
    <Directory /mnt/dicoweb_static>
        AllowOverride All
        Options None
        Require all granted
    </Directory>
    <Directory /mnt/dicoweb_run>
        AllowOverride All
        Options None
        Require all granted
    </Directory>
</VirtualHost>
11.5 Use external dict server
To use an external dict server you need to supply its address to
dicoweb and exclude the dicod container from the startup sequence.
The first part is achieved by setting the variable DICT_SERVER in your
.env file, e.g.:
DICT_SERVER=dico.gnu.org.ua
The second part is achieved by modifying the dicod service in your
docker-compose.override.yml as shown below:
services:
  dicod:
    profiles:
      - disabled
The actual profile name doesn't matter. What's important is that you
are never going to use it in the --profile option when running
docker compose.
12 Copyright
Copyright (C) 2023-2024 Sergey Poznyakoff
Permission is granted to anyone to make or distribute verbatim copies of this document as received, in any medium, provided that the copyright notice and this permission notice are preserved, thus giving the recipient permission to redistribute in turn.
Permission is granted to distribute modified versions of this document, or of portions of it, under the above conditions, provided also that they carry prominent notices stating who last changed them.