CFPEEK
Sergey Poznyakoff
The following typographic conventions are used throughout this tutorial.
In the examples, ‘$’ represents a typical shell prompt. It precedes lines you should type. Both command line and lines which represent the program output are shown in ‘this font’.
The Scheme code is shown as follows:
(do it)
In examples, the ⇒ symbol indicates the value of a variable or result of a function invocation, as in:
x ⇒ 2
A structured configuration file contains entities of two basic types. The first is the simple statement. A simple statement conceptually consists of an identifier (or keyword) and a value. Depending on the syntactic requirements, a special token may be required between them (an equals sign, for example) or at the end of the statement. The value, though we use the term in the singular, is not necessarily a single scalar value; it may also be a list of values (the exact form of that list depends on the particular syntax of the configuration file).
The other basic entity is the compound statement, also known as a block statement or section. A compound statement is used for logical grouping of other statements. It consists of an identifier, an optional tag and a list of statements. The tag, if present, is similar to the value in simple statements. The same notes that we made about values apply to tags as well. Tags serve to distinguish between statements having the same identifier. The list of statements may include statements of both kinds, simple as well as compound. Thus, compound statements form a tree-like structure of arbitrary depth, with simple statements as leaf nodes.
Each compound statement can have any number of subordinate statements, which are called its child statements. Each statement (whether simple or compound) has only one parent statement, i.e. the compound statement of which it is a child.
A special implicit statement, called the root statement, serves as the parent for the statements at the topmost level of the hierarchy.
Given this hierarchical structure, each statement can be identified by the list of keywords and values (when present) of all compound statements that must be traversed in order to reach that statement. Such a list, written according to a set of conventions, is called a full pathname of the statement. The conventions are:
A pathname which begins with a component separator (‘.’) is called an absolute pathname and identifies the statement in relation to the topmost level of the hierarchy.
A pathname beginning with an identifier is called relative and identifies the statement in relation to the statement represented by that identifier.
Examples of absolute pathnames are:
    .database.description
    .acl=global.deny
    .view=external.zone=com.type
Examples of relative pathnames are:
    description
    zone=com.type
The following configuration file will assist us in further discussion. Its syntax is fairly straightforward:
A simple statement is written as an identifier followed by a value. The two parts are separated by any amount of whitespace. Simple statements are terminated by a semicolon.
A compound statement is written as identifier followed by a list of subordinate statements in curly braces. A tag (if present) is put between the identifier and the opening curly brace.
These syntax conventions roughly correspond to the Grecs configuration format, which cfpeek assumes by default (see grecs).
    user smith;
    group mail;
    pidfile "/var/run/example";
    logging {
        facility daemon;
        tag example;
    }
    program a {
        command "a.out";
        logging {
            facility local0;
            tag a;
        }
    }
    program b {
        command "b.out";
        wait yes;
        pidfile /var/run/b.pid;
    }
The only argument cfpeek requires is the name of the file to parse. If no other arguments are given, it produces on the standard output a listing of that file in pathname-value form. Each simple statement in the input file is represented by a single line in the output listing. The line consists of two main parts: the full pathname of that statement and its value. The two parts are separated by a colon and space character. For example:
    $ cfpeek sample.conf
    .user: smith
    .group: mail
    .pidfile: /var/run/example
    .logging.facility: daemon
    .logging.tag: example
    .program="a".command: a.out
    .program="a".logging.facility: local0
    .program="a".logging.tag: a
    .program="b".command: b.out
    .program="b".wait: yes
    .program="b".pidfile: /var/run/b.pid
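Because every line of this listing has the form "pathname, colon, space, value", a script can split it with plain shell parameter expansion. The following is a minimal sketch, not part of cfpeek itself: the listing is pasted in as a literal string (an excerpt of the output above) so that the fragment runs without cfpeek installed, and the variable names are chosen only for illustration. It assumes that no tag in the pathname contains the sequence colon-space.

```shell
# Excerpt of the listing shown above; in real use this text would come
# from running: cfpeek sample.conf
listing='.user: smith
.group: mail
.pidfile: /var/run/example
.logging.facility: daemon'

# Split each line at the first ": " into a pathname and a value.
parsed=$(printf '%s\n' "$listing" | while IFS= read -r line; do
    path=${line%%: *}     # everything before the first ": "
    value=${line#*: }     # everything after the first ": "
    printf '%s -> %s\n' "$path" "$value"
done)
printf '%s\n' "$parsed"
```

When only one of the two parts is needed, the output flags described next are a simpler choice than post-processing the listing.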
This output can be customized via the --format (-H) command line option. This option takes a list of output flags, each of which modifies some aspect of the output. Most output flags are boolean, i.e. they enable or disable the given feature. To disable the feature, the flag must be prefixed with ‘no’.
To list only the pathnames, use
    $ cfpeek --format=path sample.conf
    .user
    .group
    .pidfile
    .logging.facility
    .logging.tag
    .program="a".command
    .program="a".logging.facility
    .program="a".logging.tag
    .program="b".command
    .program="b".wait
    .program="b".pidfile
The default output is equivalent to --format=path,value,descend. The flags ‘path’ and ‘value’ mean to print the pathname of the statement and its value. The ‘descend’ flag affects the output of compound nodes. If this flag is set and a node matching the key is a compound node, cfpeek will output this node and all nodes below it (i.e. its descendant nodes). The ‘descend’ flag is meaningful only if at least one lookup key is supplied.
You can also use --format to change the default component delimiter. For example, to use slash to delimit components:
    $ cfpeek --format=delim=/ sample.conf
    /user: smith
    /group: mail
    /pidfile: /var/run/example
    /logging/facility: daemon
    /logging/tag: example
    /program="a"/command: a.out
    /program="a"/logging/facility: local0
    /program="a"/logging/tag: a
    /program="b"/command: b.out
    /program="b"/wait: yes
    /program="b"/pidfile: /var/run/b.pid
When given more than one argument, cfpeek treats the rest of the arguments as search keys. It then searches for statements with pathnames matching each of the keys and outputs them. A key can be either a pathname or a pattern.
The following command looks for the ‘pidfile’ statement at the topmost level of hierarchy and prints it:
    $ cfpeek sample.conf .pidfile
    .pidfile: /var/run/example
As you see, it uses the same output format as with full listings. If you wish to change it, use the --format option, introduced in the previous section. For example, to retrieve only the value:
    $ cfpeek --format=value sample.conf .pidfile
    /var/run/example
This approach is quite common when cfpeek is used in shell scripts. It will be illustrated in more detail below.
If a key is not found, cfpeek prints a message on the standard error and starts searching for the next key (if any). When all keys are exhausted, the program exits with status 1 to indicate that some of them have not been found. To suppress the diagnostics output, use the --quiet (-q) option.
To illustrate all this, the following example shows how to use cfpeek in a start-up script to check whether a program has already been started and to bring it down, if requested:
    #! /bin/sh
    # Look up the pidfile name; -q silences the "not found" diagnostics.
    pidfile=`cfpeek -q --format=value sample.conf .pidfile`
    if test -f "$pidfile"; then
      pid=`head -n 1 "$pidfile"`
    else
      pid=
    fi
    case $1 in
    start)
      if test -n "$pid"; then
        echo >&2 "the program is already running"
      else
        # start the program
        sample-start
      fi
      ;;
    status)
      if test -n "$pid"; then
        echo "program is running at pid $pid"
      else
        echo "program is not running"
      fi
      ;;
    stop)
      test -n "$pid" && kill -TERM "$pid"
      ;;
    esac
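The exit status described above can also be used to fall back to a default when a key is missing. The sketch below wraps the lookup in a helper named conf_value; the helper and the default path in the usage note are hypothetical and are not part of cfpeek. It assumes that cfpeek exits with status 0 when every requested key was found, which the text above implies but does not state outright.

```shell
# conf_value FILE KEY DEFAULT
# Hypothetical helper: print the value of KEY from FILE, or DEFAULT
# when cfpeek reports (via a non-zero exit status) that KEY is missing.
conf_value() {
    # -q suppresses the "not found" diagnostics;
    # --format=value prints the value without its pathname.
    if value=$(cfpeek -q --format=value "$1" "$2"); then
        printf '%s\n' "$value"
    else
        printf '%s\n' "$3"
    fi
}
```

A start-up script could then write, for instance, pidfile=$(conf_value sample.conf .pidfile /var/run/default.pid), where the last argument is a placeholder for whatever default suits the program.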
Apart from a literal pathname, a pathname pattern is allowed as a key. A pattern can contain wildcards in place of path components. Two wildcards are defined: ‘*’ and ‘%’. A ‘%’ matches any single keyword:
    $ cfpeek sample.conf '.%.pidfile'
    .program="b".pidfile: /var/run/b.pid
A ‘*’ wildcard matches zero or more keywords appearing in its place:
    $ cfpeek sample.conf '.*.pidfile'
    .pidfile: /var/run/example
    .program="b".pidfile: /var/run/b.pid
In addition to these wildcards, tags in a pattern can contain traditional globbing patterns, as described in http://www.manpagez.com/man/3/fnmatch.
    $ cfpeek sample.conf '.program=[ab].pidfile'
    .program="b".pidfile: /var/run/b.pid
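The globbing notation accepted in tags is the same pattern notation that the shell uses in case statements, so a tag pattern can be tried out in the shell before it is put into a key. The helper below, tag_matches, is a hypothetical name used only for this sketch; it exercises the shell's own matcher and says nothing about how cfpeek implements the lookup.

```shell
# tag_matches PATTERN TAG
# Hypothetical helper: succeed if TAG matches the glob PATTERN.
tag_matches() {
    case $2 in
        $1) return 0 ;;   # unquoted, so $1 is interpreted as a pattern
        *)  return 1 ;;
    esac
}

# Try the pattern from the example above against a few tags.
for tag in a b c; do
    if tag_matches '[ab]' "$tag"; then
        echo "$tag: matches"
    else
        echo "$tag: no match"
    fi
done
```

In the sample file both ‘program’ blocks have tags matching ‘[ab]’, but only ‘program b’ contains a ‘pidfile’ statement, which is why the command above prints a single line.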
Pattern lookups can be disabled using the --literal (-L) command line option. There may be two reasons for doing so. First, literal lookups are somewhat faster, so if you don’t need pattern matching, using --literal can save you a couple of CPU cycles. Second, if any of your identifiers contain ‘*’ or ‘%’ characters, you will have to use --literal to prevent them from being treated as wildcards.
Cfpeek can handle input files in various formats. The default one is ‘Grecs’ format, introduced in previous sections. To process input files of another format, specify the parser to use via the --parser (-p) command line option. The argument to this option is one of: ‘grecs’, ‘bind’, ‘path’, ‘meta1’ or ‘git’. See Formats, for a detailed description of each of these formats.
For example, to select zone statements from the /etc/named.conf file:
$ cfpeek --parser=bind /etc/named.conf '.*.zone'
Sometimes you may need to see not the node which matched the search key, but its parent or other ancestor node. Consider, for example, the following task: select from the /etc/named.conf file the names of all zones for which this nameserver is a master. To do so, you will need to find all ‘zone.type’ statements with the value ‘master’, ascend to the parent node and print its value.
Cfpeek provides several special formatting flags to that effect: up, down, parent, child and sibling. They are called relative movement flags, because they select another node in the tree, relative to the position of the current node.
The up flag takes an integer number as its argument. It instructs cfpeek to ascend that many parent nodes before actually printing the node. For example, --format=up=1 means “ascend to the parent of the matched node and print it”. This is exactly what we need to solve the above task, since the ‘type’ statement is a child of a ‘zone’ statement. Thus, the solution is:
    cfpeek --format=up=1,nodescend,value --parser=bind \
      /etc/named.conf '.*.type=master'
The value flag indicates that we want only values on output, without the corresponding pathnames. The nodescend flag tells cfpeek not to descend into compound statements when outputting them. It is necessary since we want only the values of the relevant ‘zone’ statements, not their subordinate statements.
A counterpart of the up flag is the down=n flag, which descends n levels of hierarchy.
The parent flag acts in a similar manner, but it identifies the ancestor by its keyword, instead of the relative nesting level. The statement
    --format=parent=zone
tells cfpeek, after finding a matching node, to ascend until a node with the identifier ‘zone’ is found, and then print this node.
The child=id statement does the opposite of parent: it locates a child of the current node which has the identifier id.
Similarly, the sibling keyword instructs cfpeek to find the first sibling of the current node which has the given identifier. For example, to find names of the zone files for all master nodes in the named.conf file:
    cfpeek --parser bind --format=sibling=file,value /etc/named.conf \
      '.*.zone.type=master'
A ‘file’ statement is located on the same nesting level as ‘type’, for example:
    zone "example.net" {
        type master;
        file "db.example.net";
    };
Thus, the above command first locates the ‘type’ statement, then searches on the same nesting level for a ‘file’ statement, and finally prints its value.
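Since the command prints one file name per line, its output can be fed straight into a shell loop. The sketch below reports master zone files that are missing from a given directory. The helper name check_zone_files and its DIR argument are hypothetical; the sketch also assumes that the ‘file’ statements hold names relative to that directory and that the values are printed without surrounding quotes, as in the earlier --format=value example.

```shell
# check_zone_files CONF DIR
# Hypothetical helper: list master zone files named in CONF that do
# not exist under DIR.
check_zone_files() {
    cfpeek --parser=bind --format=sibling=file,value "$1" \
           '.*.zone.type=master' |
    while IFS= read -r file; do
        # Report only the files that are absent.
        [ -f "$2/$file" ] || printf 'missing: %s\n' "$2/$file"
    done
}
```

It might be called as check_zone_files /etc/named.conf /var/named, where the second argument stands for whatever directory the name server actually uses.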
Cfpeek offers a scripting facility, which can be used to easily extend its functionality beyond the basic operations described in previous chapters. Scripts must be written in Scheme, using ‘Guile’, GNU’s Ubiquitous Intelligent Language for Extensions. For information about the language, refer to Revised(5) Report on the Algorithmic Language Scheme. For a detailed description of Guile and its features, see Overview in The Guile Reference Manual.
This section assumes that the reader has sufficient knowledge about this programming language.
The scripting facility is enabled by the use of the --expression (-e) or --file (-f) command line options.
The --expression (-e) option takes as its argument a Scheme expression, which will be executed for each statement matching the supplied keys (or for each statement in the tree, if no keys were supplied). The expression can obtain information about the statement from the global variable node, which represents a node in the parse tree describing this statement. The node contains complete information about the statement, including its location in the source file, its type and neighbor nodes, etc. A number of functions are provided to retrieve that information from the node. These functions are discussed in detail in Scripting.
Let’s start from the simplest example. The following command prints all nodes in the file:
    $ cfpeek --expression='(display node)(newline)' sample.conf
    #<node .user: "smith">
    #<node .group: "mail">
    #<node .pidfile: "/var/run/example">
    #<node .logging.facility: "daemon">
    #<node .logging.tag: "example">
    #<node .program="a".command: "a.out">
    #<node .program="a".logging.facility: "local0">
    #<node .program="a".logging.tag: "a">
    #<node .program="b".command: "b.out">
    #<node .program="b".wait: "yes">
    #<node .program="b".pidfile: "/var/run/b.pid">
The format shown in this example is the default Scheme representation for nodes. You can use accessor functions to format the output to your liking. For instance, the function ‘grecs-node-locus’ returns the location of the node in the input file. The returned value is a cons, with the file name as its car and the line number as its cdr. Thus, you can print statement locations with the following command:
    cfpeek --expression='(let ((loc (grecs-node-locus node)))
                           (format #t "~A:~A~%" (car loc) (cdr loc)))' \
           sample.conf
Complex expressions are cumbersome to type in the command line, therefore the --file (-f) option is provided. This option takes the name of the script file as its argument. This file must define the function named cfpeek which takes a node as its argument. The script file is then loaded and the cfpeek function is called for each matching node.
Now, if we put the expression used in the previous example in a script file (e.g. locus.scm):
    (define (cfpeek node)
      ;; grecs-node-locus returns a (file . line) pair for the node
      (let ((loc (grecs-node-locus node)))
        (format #t "~A:~A~%" (car loc) (cdr loc))))
then the example can be rewritten as:
$ cfpeek -f locus.scm sample.conf
When both the --file and --expression options are used in the same invocation, the cfpeek function is not invoked by default. In fact, it does not even need to be defined. When used this way, cfpeek first loads the requested script file, and then applies the expression to each matching node, the same way it always does when --expression is supplied. It is the responsibility of the expression itself to call any function or functions defined in the file. This way of invoking ‘cfpeek’ is useful for supplying additional parameters to the script. For example:
$ cfpeek -f script.scm -e '(process-node node #t)' input.conf
It is assumed that the function process-node is defined somewhere in script.scm and takes two arguments: a node and a boolean.
The --init=expr (-i expr) option provides an initialization expression expr. This expression is evaluated once, after loading the script file, if one is specified, and before starting the main loop.
Similarly, the option --done=expr (-d expr) introduces a Scheme expression to be evaluated at the end of the run, after all nodes have been processed.
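The three options can be combined to keep state across nodes: set it up with --init, update it with --expression, and report it with --done. The sketch below counts the statements visited. The wrapper name count_nodes and the Scheme variable count are hypothetical, and the sketch assumes that a variable defined by the initialization expression is visible to the other two expressions, which the text above does not spell out.

```shell
# count_nodes FILE
# Hypothetical helper: print how many statements cfpeek visits in FILE.
count_nodes() {
    # --init runs once before the main loop, --expression once per
    # visited statement, --done once after all nodes are processed.
    cfpeek --init='(define count 0)' \
           --expression='(set! count (+ count 1))' \
           --done='(begin (display count) (newline))' \
           "$1"
}
```

Calling count_nodes sample.conf visits every statement, because no keys are supplied.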
Here is a more practical example of Scheme scripting. This script converts the entire parse tree into the Git configuration file format. The format itself is described in git.
The script traverses the entire tree itself, so it must be called only once, for the root node of the parse tree. The root node is denoted by a single dot, so the invocation syntax is:
cfpeek -f togit.scm sample.conf .
Traversal is performed by the main function, cfpeek, using the grecs-node-next and grecs-node-down functions. The grecs-node-next function returns a node which follows its argument at the same nesting level. For example, if n is the very first node in our sample parse tree, then:
    n ⇒ #<node .user: "smith">
    (grecs-node-next n) ⇒ #<node .group: "mail">
Similarly, the grecs-node-down function returns the first subordinate node of its argument. For example:
    n ⇒ #<node .logging>
    (grecs-node-down n) ⇒ #<node .logging.facility: "daemon">
Both functions return ‘#f’ if there is no next or subordinate node, respectively.
The grecs-node-type function is used to determine how to handle a particular node. It returns the type of the node given to it as argument. The type is an integer constant, with the following possible values:
Type | The node is |
---|---|
grecs-node-root | the root (topmost) node |
grecs-node-stmt | a simple statement |
grecs-node-block | a compound (block) statement |
The print-section function prints a Git section header corresponding to its node. It ascends the parent node chain to find the topmost node and prints the traversed nodes in the correct order.
To summarize, here is the listing of the togit.scm script:
    (define (print-section node delim)
      "Print a Git section header for the given node.
    End it with delim.

    The function recursively calls itself until the topmost
    node is reached.
    "
      (cond
       ((grecs-node-up? node)
        ;; Ascend to the parent node
        (print-section (grecs-node-up node) #\space)
        ;; Print its identifier, ...
        (display (grecs-node-ident node))
        (if (grecs-node-has-value? node)
            ;; ... value,
            (begin
              (display " ")
              (display (grecs-node-value node))))
        ;; ... and delimiter
        (display delim))
       (else               ;; mark the root node
        (display "["))))   ;; with a [

    (define (cfpeek node)
      "Main entry point.  Calls itself recursively to descend
    into subordinate nodes and to iterate over nodes on the
    same nesting level (tail recursion)."
      (let loop ((node node))
        (if node
            (let ((type (grecs-node-type node)))
              (cond
               ((= type grecs-node-root)
                (let ((dn (grecs-node-down node)))
                  ;; Each statement in a Git config file must
                  ;; belong to a section.  If the first node
                  ;; is not a block statement, provide the
                  ;; default [core] section:
                  (if (not (= (grecs-node-type dn)
                              grecs-node-block))
                      (display "[core]\n"))
                  ;; Continue from the first node
                  (loop dn)))
               ((= type grecs-node-block)
                ;; print the section header
                (print-section node #\])
                (newline)
                ;; descend into subnodes
                (loop (grecs-node-down node))
                ;; continue from the next node
                (loop (grecs-node-next node)))
               ((= type grecs-node-stmt)
                ;; print the simple statement
                (display #\tab)
                (display (grecs-node-ident node))
                (display " = ")
                (display (grecs-node-value node))
                (newline)
                ;; continue from the next node
                (loop (grecs-node-next node))))))))
If run on our sample configuration file, it produces:
    $ cfpeek -f togit.scm sample.conf .
    [core]
            user = smith
            group = mail
            pidfile = /var/run/example
    [logging]
            facility = daemon
            tag = example
    [program a]
            command = a.out
    [program a logging]
            facility = local0
            tag = a
    [program b]
            command = b.out
            wait = yes
            pidfile = /var/run/b.pid
This document was generated on January 7, 2021 using makeinfo.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.