The mail filtering language, or MFL, is a special language designed for writing filter scripts. It has a simple syntax, similar to that of the Bourne shell. In contrast to most existing programming languages, MFL does not have any special terminating or separating characters (such as newlines and semicolons in shell). All syntactical entities are separated by any amount of white-space characters (i.e. spaces, tabulations or newlines).
The following sections describe MFL syntax in detail.
• Comments
• include
• line
• Generated warnings and errors
• Pragmas: Pragmatic comments.
• Data Types
• Numbers
• Literals
• Here Documents
• Sendmail Macros
• Constants
• Variables
• Back references
• Handlers
• Special handlers: Initialization and cleanup handlers.
• Functions
• Expressions
• Shadowing: Variable and Constant Shadowing.
• Statements
• Conditionals: Conditional Statements.
• Loops: Loop Statements.
• Exceptions: Exceptional Conditions and their Handling.
• Polling: Sender Verification Tests.
• Modules: Modules are Collections of Useful Functions.
• mfmod: Dynamically Loaded Modules.
• Preprocessor: Input Text Is Preprocessed.
• Filter Script Example: A Working Filter Script Explained.
• Reserved Words: A Reference List of Reserved Words.
Two types of comments are allowed: C-style, enclosed between ‘/*’ and ‘*/’, and shell-style, starting with ‘#’ character and extending up to the end of line:
/* This is a comment. */ # And this too.
There are, however, several special cases, where the characters following ‘#’ are not ignored:
If the first line begins with ‘#!/’ or ‘#! /’, this is treated as a start of a multi-line comment, which is closed by the characters ‘!#’ on a line by themselves. This feature allows for writing sophisticated scripts. See top-block, for a detailed description.
A ‘#’ followed by ‘include’, ‘include_once’, ‘line’, ‘error’, or ‘warning’ is treated specially. These cases are covered by the subsequent sections.
If ‘#’ is followed by word ‘include’ (with optional whitespace between them), this statement requires inclusion of the specified file, as in C. There are two forms of the ‘#include’ statement:
#include <file>
#include "file"
The quotes around file in the second form are optional.
Both forms are equivalent if file is an absolute file name. Otherwise, the first form will look for file in the include search path. The second one will look for it in the current working directory first, and, if not found there, in the include search path.
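The lookup order described above can be modelled with a short Python sketch. It is only an illustration, not mailfromd code: the function name `include_candidates` and its parameters are hypothetical, and the directory names in the usage line are made up.

```python
import posixpath

def include_candidates(name, quoted, search_path, cwd="."):
    """List the locations tried for an #include, in order (illustrative model)."""
    if posixpath.isabs(name):
        # Both forms are equivalent for absolute file names.
        return [name]
    candidates = []
    if quoted:
        # The "file" form looks in the current working directory first.
        candidates.append(posixpath.join(cwd, name))
    # Both forms then consult the include search path, front to back.
    candidates.extend(posixpath.join(d, name) for d in search_path)
    return candidates

# Hypothetical directories, used only for demonstration.
print(include_candidates("common.mfl", True, ["/inc1", "/inc2"], cwd="/work"))
```

The first existing file in the returned list would be the one that gets included.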
The default include search path is:
where prefix is the installation prefix.
New directories can be added in front of it using the -I (--include-path) command line option, or the include-path configuration statement (see include-path).
For example, invoking
$ mailfromd -I/var/mailfromd -I/com/mailfromd
creates the following include search path
Along with #include, there is also a special form, #include_once, that has the same syntax:
#include_once <file>
#include_once "file"
This form works exactly as #include, except that, if the file has already been included, it will not be included again. As the name suggests, it will be included only once.
This form should be used to prevent re-inclusion of code, which can cause problems due to function redefinitions, variable reassignments, etc.
A line in the form
#line number "identifier"
causes the MFL compiler to believe, for purposes of error diagnostics, that the line number of the next source line is given by number and the current input file is named by identifier. If the identifier is absent, the remembered file name does not change.
Line directives in cpp style are also understood:
# number "identifier"
If ‘#’ is followed by the word ‘warning’, any amount of whitespace and a string in double quotes, the compiler will use that string to generate a warning message at that point. As usual, whitespace characters are allowed between ‘#’ and ‘warning’:
# warning "The code below is suspicious"
Similarly, ‘#’ followed by the word ‘error’, whitespace and a doubly-quoted string causes the compiler to generate a compilation error at that point.
To use a backslash or double quote in the message text, precede it with a single backslash, e.g.:
#error "the \"quoted\" text"
A backslash in front of any other character is retained.
If ‘#’ is followed by the word ‘pragma’ (with optional whitespace between them), the construct introduces a pragmatic comment, i.e. an instruction that controls some configuration setting.
The available pragma types are described in the following subsections.
• prereq: Pragma prereq.
• stacksize: Pragma stacksize.
• regex: Pragma regex.
• dbprop: Pragma dbprop.
• greylist: Pragma greylist.
• miltermacros: Pragma miltermacros.
• provide-callout: Pragma provide-callout.
The #pragma prereq statement ensures that the correct mailfromd version is used to compile the source file it appears in. It takes a version number as its argument and produces a compilation error if the actual mailfromd version number is earlier than that. For example, the following statement:
#pragma prereq 7.0.94
results in an error if compiled with mailfromd version 7.0.93 or prior.
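The version check can be pictured as a component-wise comparison of dotted version numbers. The Python sketch below is an assumption about how such a comparison might work, not the actual mailfromd implementation; `parse_version` and `prereq_satisfied` are hypothetical names.

```python
def parse_version(text):
    """Split a dotted version such as '7.0.94' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))

def prereq_satisfied(required, actual):
    """True if the actual version is not earlier than the required one."""
    return parse_version(actual) >= parse_version(required)

print(prereq_satisfied("7.0.94", "7.0.93"))  # False: compilation would fail
print(prereq_satisfied("7.0.94", "7.0.94"))  # True
```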
The stacksize pragma sets the initial size of the run-time stack and may also define the policy of its growing, in case it becomes full. The default stack size is 4096 words. You may need to increase this number if your configuration program uses recursive functions or does an excessive amount of string manipulations.
Sets stack size to size units. Optional incr and max define stack growth policy (see below). The default units are words. The following example sets the stack size to 7168 words:
#pragma stacksize 7168
The size may end with a unit size suffix:
Suffix | Meaning |
---|---|
k | Kilowords, i.e. 1024 words |
m | Megawords, i.e. 1048576 words |
g | Gigawords, i.e. 1073741824 words |
t | Terawords (ouch!) |
Size suffixes are case-insensitive, so the following two pragmas are equivalent and set the stack size to 7*1048576 = 7340032 words:
#pragma stacksize 7m
#pragma stacksize 7M
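The suffix arithmetic can be checked with a small Python sketch. The function name `parse_stack_size` is hypothetical, and the multipliers for ‘g’ and ‘t’ assume that each suffix is 1024 times the previous one, continuing the pattern of ‘k’ and ‘m’.

```python
# Multipliers in words; 'g' and 't' are assumed to continue the 1024x pattern.
SUFFIX = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}

def parse_stack_size(text):
    """Convert a size such as '7168', '7m' or '2K' into a number of words."""
    text = text.strip()
    factor = SUFFIX.get(text[-1:].lower())
    if factor is None:
        return int(text)          # no suffix: plain number of words
    return int(text[:-1]) * factor

print(parse_stack_size("7m"))  # 7340032
print(parse_stack_size("7M"))  # 7340032
```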
When the MFL engine notices that there is no more stack space available, it attempts to expand the stack. If this attempt succeeds, the operation continues. Otherwise, a runtime error is reported and the execution of the filter stops.
The optional incr argument to #pragma stacksize defines the growth policy for the stack. Two growth policies are implemented: the fixed increment policy, which expands the stack in expansion chunks of a fixed size, and the exponential growth policy, which doubles the stack size until it is able to accommodate the needed number of words. The fixed increment policy is the default. The default chunk size is 4096 words.
If incr is the word ‘twice’, the exponential growth policy is selected. Otherwise incr must be a positive number optionally followed by a size suffix (see above). This indicates the expansion chunk size for the fixed increment policy.
The following example sets the initial stack size to 10240 words and the expansion chunk size to 2048 words:
#pragma stacksize 10K 2K
The pragma below enables exponential stack growth policy:
#pragma stacksize 10240 twice
In this case, when the run-time evaluator hits the stack size limit, it expands the stack to twice the size it had before. So, in the example above, the stack will be sequentially expanded to the following sizes: 20480, 40960, 81920, 163840, etc.
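The two growth policies can be compared with a short Python simulation. This is a simplified model under stated assumptions, not the run-time evaluator itself: `grow_stack` is a hypothetical name, and the sketch ignores the optional maximum size.

```python
def grow_stack(size, needed, incr=4096):
    """Return the stack size after expanding until `needed` words fit.

    `incr` is either a chunk size in words (fixed increment policy)
    or the string "twice" (exponential growth policy).
    """
    while size < needed:
        size = size * 2 if incr == "twice" else size + incr
    return size

print(grow_stack(10240, 10241, "twice"))  # 20480
print(grow_stack(10240, 50000, "twice"))  # 81920
print(grow_stack(10240, 10241, 2048))     # 12288
```

Doubling reaches a large size in fewer reallocations, while fixed chunks waste less memory when the overflow is small.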
The optional max argument defines the maximum size of the stack. If stack grows beyond this limit, the execution of the script will be aborted.
If you are concerned about the execution time of your script, you may wish to avoid stack reallocations. To help you find out the optimal stack size, each time the stack is expanded, mailfromd issues a warning in its log file, which looks like this:
warning: stack segment expanded, new size=8192
You can use these messages to adjust your stack size configuration settings.
The ‘#pragma regex’ directive controls compilation of regular expressions. You can use any number of such pragma directives in your mailfromd.mfl. The scope of ‘#pragma regex’ extends to the next occurrence of this directive or to the end of the script file, whichever occurs first.
The optional push|pop parameter is one of the words ‘push’ or ‘pop’ and is discussed in detail below. The flags parameter is a whitespace-separated list of regex flags. Each regex-flag is a word specifying some regex feature. It can be preceded by ‘+’ to enable this feature (this is the default), by ‘-’ to disable it or by ‘=’ to reset regex flags to its value. Valid regex-flags are:
‘extended’: Use POSIX Extended Regular Expression syntax when interpreting regex. If not set, POSIX Basic Regular Expression syntax is used.
‘icase’: Do not differentiate case. Subsequent regex searches will be case insensitive.
‘newline’: Match-any-character operators don’t match a newline. A non-matching list (‘[^...]’) not containing a newline does not match a newline. Match-beginning-of-line operator (‘^’) matches the empty string immediately after a newline. Match-end-of-line operator (‘$’) matches the empty string immediately before a newline.
For example, the following pragma enables POSIX extended, case insensitive matching (a good thing to start your mailfromd.mfl with):
#pragma regex +extended +icase
Optional modifiers ‘push’ and ‘pop’ can be used to maintain a stack of regex flags. The statement
#pragma regex push [flags]
saves current regex flags on stack and then optionally modifies them as requested by flags.
The statement
#pragma regex pop [flags]
does the opposite: restores the current regex flags from the top of stack and applies flags to it.
This statement is useful in module and include files to avoid disturbing user regex settings. E.g.:
#pragma regex push +extended +icase
.
.
.
#pragma regex pop
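The push and pop behaviour can be modelled as a stack of flag sets. The Python sketch below rests on several assumptions: the class `RegexFlags` and the function `apply_flags` are hypothetical, the initial flag set is taken to be empty, and ‘=flag’ is interpreted as resetting the set to that single flag.

```python
def apply_flags(current, words):
    """Apply a list of +flag, -flag and =flag words to a set of flag names."""
    result = set(current)
    for word in words:
        if word.startswith("="):
            result = {word[1:]}            # assumed meaning: reset to this flag
        elif word.startswith("-"):
            result.discard(word[1:])       # disable the feature
        else:
            result.add(word.lstrip("+"))   # '+' is the default
    return result

class RegexFlags:
    def __init__(self):
        self.current = set()   # assumed initial state
        self.saved = []        # stack maintained by push and pop

    def pragma(self, args):
        words = args.split()
        if words and words[0] == "push":
            self.saved.append(set(self.current))
            words = words[1:]
        elif words and words[0] == "pop":
            self.current = self.saved.pop()
            words = words[1:]
        self.current = apply_flags(self.current, words)

flags = RegexFlags()
flags.pragma("+extended")            # user setting
flags.pragma("push +icase -extended")  # a module changes flags temporarily
flags.pragma("pop")                  # user setting is restored
print(sorted(flags.current))  # ['extended']
```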
This pragma configures properties for a DBM database. See Database functions, for its detailed description.
Selects the greylisting implementation to use. Allowed values for type are:
Use the traditional greylisting implementation. This is the default.
Use Con Tassios greylisting implementation.
See greylisting types, for a detailed description of these greylisting implementations.
Notice that this pragma can be used only once. A second use of this pragma constitutes an error, because you cannot use both greylisting implementations in the same program.
Declares that the Milter stage handler named by handler uses the MTA macros listed as the remaining arguments. The handler must be a valid handler name (see Handlers).
The mailfromd parser collects the names of the macros referred to by a ‘$name’ construct within a handler (see Sendmail Macros) and declares them automatically for the corresponding handlers. It is, however, unable to track macros used in functions called from a handler, as well as those referred to via the getmacro and macro_defined functions. Such macros should be declared using ‘#pragma miltermacros’.
During initial negotiation with the MTA, mailfromd will ask it to export the macro names declared automatically or by using the ‘#pragma miltermacros’. The MTA is free to honor or to ignore this request. In particular, Sendmail versions prior to 8.14.0 and Postfix versions prior to 2.5 do not support this feature. If you use one of these, you will need to export the needed macros explicitly in the MTA configuration. For more details, refer to the section in MTA Configuration corresponding to your MTA type.
The #pragma provide-callout statement is used in the callout module to inform mailfromd that the module has been loaded.
Do not use this pragma.
The mailfromd filter script language operates on entities of two types: numeric and string.
The numeric type is represented internally as a signed long integer. Depending on the machine architecture, its size can vary. For example, on 32-bit Intel-based machines it is 32 bits long, whereas on most 64-bit Unix-like systems it is 64 bits long.
A string is a string of characters of arbitrary length. Strings can contain any characters except ASCII NUL.
There is also a generic pointer, which is designed to facilitate certain operations. It appears only in the body handler. See body handler, for more information about it.
A decimal number is any sequence of decimal digits, not beginning with ‘0’.
An octal number is ‘0’ followed by any number of octal digits (‘0’ through ‘7’), for example: 0340.
A hex number is ‘0x’ or ‘0X’ followed by any number of hex digits (‘0’ through ‘9’ and ‘a’ through ‘f’ or ‘A’ through ‘F’), for example: 0x3ef1.
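The three notations can be told apart by their prefix alone. The following Python sketch mirrors the rules above for illustration; `parse_number` is a hypothetical helper, not part of mailfromd.

```python
def parse_number(text):
    """Interpret a numeric literal by its prefix: 0x/0X hex, 0 octal, else decimal."""
    if text[:2] in ("0x", "0X"):
        return int(text[2:], 16)
    if text.startswith("0"):
        # '0' followed by any number of octal digits (possibly none).
        return int(text[1:] or "0", 8)
    return int(text, 10)

print(parse_number("0340"))    # 224
print(parse_number("0x3ef1"))  # 16113
print(parse_number("340"))     # 340
```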
A literal is any sequence of characters enclosed in single or double quotes.
After tempfail and reject actions two special kinds of literals are recognized: three-digit numeric values represent RFC 2821 reply codes, and literals consisting of three digit groups separated by dots represent an extended reply code as per RFC 1893/2034. For example:
510 # A reply code
5.7.1 # An extended reply code
String literals enclosed in double quotation marks (double-quoted strings) are subject to backslash interpretation, macro expansion, variable interpretation and back reference interpretation.
Backslash interpretation is performed at compilation time. It consists in replacing the following escape sequences with the corresponding single characters:
Sequence | Replaced with |
---|---|
\a | Audible bell character (ASCII 7) |
\b | Backspace character (ASCII 8) |
\f | Form-feed character (ASCII 12) |
\n | Newline character (ASCII 10) |
\r | Carriage return character (ASCII 13) |
\t | Horizontal tabulation character (ASCII 9) |
\v | Vertical tabulation character (ASCII 11) |
In addition, the sequence ‘\newline’ (a backslash immediately followed by a newline character) has the same effect as ‘\n’, for example:
"a string with\
 embedded newline"
"a string with\n embedded newline"
Any escape sequence of the form ‘\xhh’, where h denotes any hex digit, is replaced with the character whose ASCII value is hh. For example:
"\x61nother" ⇒ "another"
Similarly, an escape sequence of the form ‘\0ooo’, where o is an octal digit, is replaced with the character whose ASCII value is ooo.
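Backslash interpretation can be sketched as a single regular-expression substitution. The Python model below is an approximation under stated assumptions: `unescape` is a hypothetical name, a doubled backslash is assumed to collapse into a single one, and any other backslash sequence is left untouched.

```python
import re

# Single-character escapes from the table, plus backslash-newline.
# The doubled backslash entry is an assumption of this sketch.
SIMPLE = {"a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r",
          "t": "\t", "v": "\v", "\n": "\n", "\\": "\\"}

ESCAPE = re.compile(r"\\(x[0-9A-Fa-f]{2}|0[0-7]{3}|[abfnrtv\n\\])")

def unescape(text):
    """Replace the escape sequences described above with single characters."""
    def replace(match):
        body = match.group(1)
        if body[0] == "x":
            return chr(int(body[1:], 16))   # \xhh
        if body[0] == "0":
            return chr(int(body[1:], 8))    # \0ooo
        return SIMPLE[body]
    return ESCAPE.sub(replace, text)

print(unescape(r"\x61nother"))  # another
```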
Macro expansion and variable interpretation occur at run-time. During these phases all Sendmail macros (see Sendmail Macros), mailfromd variables (see Variables), and constants (see Constants) referenced in the string are replaced by their actual values. For example, if the Sendmail macro f has the value ‘postmaster@gnu.org.ua’ and the variable last_ip has the value ‘127.0.0.1’, then the string
"$f last connected from %last_ip;"
will be expanded to
"postmaster@gnu.org.ua last connected from 127.0.0.1;"
A back reference is a sequence ‘\d’, where d is a decimal number. It refers to the dth parenthesized subexpression in the last matches statement. Any back reference occurring within a double-quoted string is replaced by the value of the corresponding subexpression. See Special comparisons, for a detailed description of this process. Back reference interpretation is performed at run time.
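The run-time expansion of ‘$name’ and ‘%name’ references inside a double-quoted string can be approximated in Python as follows. The function `expand` and its dictionary arguments are hypothetical, and the sketch covers only these two notations, not back references.

```python
import re

# '$name' refers to a Sendmail macro, '%name' to a variable or constant.
TOKEN = re.compile(r"\$(\w+)|%(\w+)")

def expand(text, macros, variables):
    """Replace macro and variable references in one left-to-right pass."""
    def replace(match):
        macro, variable = match.groups()
        if macro is not None:
            return macros[macro]
        return str(variables[variable])
    return TOKEN.sub(replace, text)

print(expand("$f last connected from %last_ip;",
             {"f": "postmaster@gnu.org.ua"},
             {"last_ip": "127.0.0.1"}))
# postmaster@gnu.org.ua last connected from 127.0.0.1;
```

A single pass is used so that text substituted for a macro is never rescanned for further references; whether MFL behaves the same way is not stated in this section.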
Any characters enclosed in single quotation marks are read unmodified.
The following examples contain pairs of equivalent strings:
"a string" 'a string'
"\\(.*\\):" '\(.*\):'
Notice the last example. Single quotes are particularly useful in writing regular expressions (see Special comparisons).
A here-document is a special form of string literal that allows you to specify multiline strings without having to use backslash escapes. The format of here-documents is:
<<[flags]word
…
word
The <<word construct instructs the parser to read all the following lines up to the line containing only word, with possible trailing blanks. The lines thus read are concatenated together into a single string. For example:
set str <<EOT
A multiline
string
EOT
The body of a here-document is interpreted the same way as double-quoted strings (see Double-quoted strings). For example, if Sendmail macro f has the value jsmith@some.com and the variable count is set to 10, then the following string:
set s <<EOT
<$f> has tried to send %count mails.
Please see docs for more info.
EOT
will be expanded to:
<jsmith@some.com> has tried to send 10 mails.
Please see docs for more info.
If the word is quoted, either by enclosing it in single quote characters or by prepending it with a backslash, all interpretations and expansions within the document body are suppressed. For example:
set s <<'EOT'
The following line is read verbatim:
<$f> has tried to send %count mails.
Please see docs for more info.
EOT
Optional flags in the here-document construct control the way leading white space is handled. If flags is - (a dash), then all leading tab characters are stripped from input lines and the line containing word. Furthermore, if - is followed by a single space, all leading whitespace is stripped from them. This allows here-documents within configuration scripts to be indented in a natural fashion. For example:
<<- TEXT
    <$f> has tried to send %count mails.
    Please see docs for more info.
    TEXT
Here-documents are particularly useful with reject actions (see reject and tempfail syntax).
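The effect of the here-document flags on leading white space can be illustrated with a few lines of Python. The helper `strip_heredoc` is hypothetical, and "all leading whitespace" is assumed here to mean spaces and tabs.

```python
def strip_heredoc(lines, flags=""):
    """Model how the here-document flags treat leading white space."""
    if flags == "-":
        # A dash alone strips leading tab characters only.
        return [line.lstrip("\t") for line in lines]
    if flags == "- ":
        # A dash followed by a space strips all leading whitespace
        # (assumed to be spaces and tabs).
        return [line.lstrip(" \t") for line in lines]
    return list(lines)

body = ["\t\tfirst line", "\t  second line"]
print(strip_heredoc(body, "-"))   # ['first line', '  second line']
print(strip_heredoc(body, "- "))  # ['first line', 'second line']
```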
Sendmail macros are referenced exactly the same way they are in the sendmail.cf configuration file, i.e. ‘$name’, where name represents the macro name. Notice that the notation is the same for both single-character and multi-character macro names. For consistency with the Sendmail configuration, the ‘${name}’ notation is also accepted.
Another way to reference Sendmail macros is by using the function getmacro (see Macro access).
Sendmail macros evaluate to string values.
Notice that to reference a macro, you must properly export it in your MTA configuration. An attempt to reference a macro that is not exported will raise an e_macroundef exception at run time (see uncaught exceptions).
A constant is a symbolic name for an MFL value.
Constants are defined using the const statement:
[qualifier] const name expr
where name is an identifier, and expr is any valid MFL expression evaluating immediately to a constant literal or numeric value. Optional qualifier defines the scope of visibility for that constant (see scope of visibility): either public or static.
Once defined, any appearance of name in the program text is replaced by its value. For example:
const x 10/5
const text "X is "
defines the numeric constant ‘x’ with the value ‘2’, and the literal constant ‘text’ with the value ‘X is ’.
A special construct is provided to define a series of numeric constants (an enumeration):
[qualifier] const
do
  name0 [expr0]
  name1 [expr1]
  ...
  nameN [exprN]
done
Each exprN, if present, must evaluate to a constant numeric expression. The resulting value will be assigned to constant nameN. If exprN is not supplied, the constant will be defined to the value of the previous constant plus one. If expr0 is not supplied, 0 is assumed.
For example, consider the following statement
const
do
  A
  B
  C 10
  D
done
This defines ‘A’ as 0, ‘B’ as 1, ‘C’ as 10 and ‘D’ as 11.
As a matter of fact, exprN may also evaluate to a constant string expression, provided that every name in the enumeration ‘const’ statement is given an explicit expression. That is, the following is correct:
const
do
  A "one"
  B "two"
  C "three"
  D "four"
done
whereas the following is not:
const
do
  A "one"
  B
  C "three"
  D "four"
done
Trying to compile the latter example will produce:
mailfromd: filename:5.3: initializer element is not numeric
which means that mailfromd was trying to create constant ‘B’ with the value of ‘A’ incremented by one, but was unable to do so, because the value in question was not numeric.
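The value assignment rules for enumerations, including the failure case just described, can be modelled in Python. The function `enumerate_consts` is a hypothetical illustration; a missing expression is represented by None.

```python
def enumerate_consts(items):
    """Assign values to (name, expr) pairs following the enumeration rules.

    A missing expression (None) yields the previous value plus one,
    or 0 for the first name in the list.
    """
    result = {}
    previous = None
    for index, (name, value) in enumerate(items):
        if value is None:
            if index == 0:
                value = 0
            elif isinstance(previous, int):
                value = previous + 1
            else:
                # Mirrors the compiler diagnostic quoted above.
                raise ValueError("initializer element is not numeric")
        result[name] = value
        previous = value
    return result

print(enumerate_consts([("A", None), ("B", None), ("C", 10), ("D", None)]))
# {'A': 0, 'B': 1, 'C': 10, 'D': 11}
```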
Constants can be used in normal MFL expressions as well as in literals. To expand a constant within a literal string, prepend a percent sign to its name, e.g.:
echo "New %text %x" ⇒ "New X is 2"
This way of expanding constants creates an ambiguity if there happens to be a variable of the same name as the constant. See variable--constant clashes, for more information on this case and ways to handle it.
• Built-in constants
Several constants are built into the MFL compiler. To discern them from user-defined ones, their names start and end with two underscores (‘__’).
The following constants are defined in mailfromd version 9.0:
Expands to the name of the current source file.
Expands to the name of the current lexical context, i.e. the function or handler name.
This built-in constant is defined for alpha versions only. Its value is the Git tag of the recent commit corresponding to that version of the package. If the release contains some uncommitted changes, the value of the ‘__git__’ constant ends with the suffix ‘-dirty’.
Expands to the current line number in the input source file.
Expands to the major version number.
The following example uses the __major__ constant to determine if some version-dependent feature can be used:
if __major__ > 2
# Use some version-specific feature
fi
Expands to the minor version number.
Expands to the name of the current module (see Modules).
Expands to the package name (‘mailfromd’).
For alpha versions and maintenance releases expands to the version patch level. For stable versions, expands to ‘0’.
Expands to the default external preprocessor command line, if the preprocessor is used, or to an empty string if it is not, e.g.:
__defpreproc__ ⇒ "/usr/bin/m4 -s"
See Preprocessor, for information on preprocessor and its features.
Expands to the current external preprocessor command line, if the preprocessor is used, or to an empty string if it is not. Notice that it equals __defpreproc__, unless the preprocessor was redefined using the --preprocessor command line option (see --preprocessor).
Expands to the textual representation of the program version (e.g. ‘3.0.90’).
Expands to the default state directory (see statedir).
Expands to the current value of the program state directory (see statedir). Notice that it is the same as __defstatedir__ unless the state directory was redefined at run time.
Built-in constants can be used like variables, which allows them to be expanded within strings or here-documents. The following example illustrates a common practice used for debugging configuration scripts:
func foo(number x)
do
  echo "%__file__:%__line__: foo called with arg %x"
  …
done
If the function foo were called in line 28 of the script file /etc/mailfromd.mfl, like this: foo(10), you would see the following string in your logs:
/etc/mailfromd.mfl:28: foo called with arg 10
Variables represent regions of memory used to hold variable data. These memory regions are identified by variable names. A variable name must begin with a letter or underscore and must consist of letters, digits and underscores.
Each variable is associated with its scope of visibility, which defines the part of source code where it can be used (see scope of visibility). Depending on the scope, we discern three main classes of variables: public, static and automatic (or local).
Public variables have indefinite lexical scope, so they may be referred to anywhere in the program. Static variables are visible only within their module (see Modules). Automatic or local variables are visible only within the given function or handler.
Public and static variables are sometimes collectively called global.
These variable classes occupy separate namespaces, so that an automatic variable can have the same name as an existing public or static one. In this case this variable is said to shadow its global counterpart. All references to such a name will refer to the automatic variable until the end of its scope is reached, where the global one becomes visible again.
Likewise, a static variable may have the same name as a static variable defined in another module. However, it may not have the same name as a public variable.
A variable is declared using the following syntax:
[qualifiers] type name
where name is the variable name, type is the type of the data it is supposed to hold. It is ‘string’ for string variables and ‘number’ for numeric ones.
For example, this is a declaration of a string variable ‘var’:
string var
If a variable declaration occurs within a function (see User-defined) or handler (see Handlers), it declares an automatic variable, local to this function or handler. Otherwise, it declares a global variable.
Optional qualifiers are allowed only in global declarations, i.e. in the variable declarations that appear outside of functions. They specify the scope of the variable. The public qualifier declares the variable as public and the static qualifier declares it as static. The default scope is ‘public’, unless specified otherwise in the module declaration (see module structure).
Additionally, qualifiers may contain the word precious, which instructs the compiler to mark this variable as precious (see precious variables). The value of a precious variable is not affected by the SMTP ‘RSET’ command. If both a scope qualifier and precious are used, they may appear in any order, e.g.:
static precious string rcpt_list
or
precious static string rcpt_list
Declaration can be followed by any valid MFL expression, which supplies the initial value or initializer for the variable, for example:
string var "test"
A variable declared without an initializer is implicitly initialized to a null value, no matter what its scope: a numeric variable assumes the initial value 0, and a string variable is initialized to an empty string.
A variable is assigned a value using the set statement:
set name expr
where name is the variable name and expr is a mailfromd expression (see Expressions). The effect of this statement is that the expr is evaluated and the value it yields is assigned to the variable name.
If the set statement is located outside a function or handler definition, the expr must be a constant expression, i.e. the compiler should be able to evaluate it immediately. See optimizer.
It is not an error to assign a value to a variable that is not declared. In this case the assignment first declares a global or automatic variable having the type of expr and then assigns a value to it. An automatic variable is created if the assignment occurs within a function or handler; a global variable is declared if it occurs at the topmost lexical level. This is called implicit variable declaration.
In the MFL program, variables are referenced by their name. When appearing inside a double-quoted string, variables are referenced using the notation ‘%name’. Any variable being referenced must have been declared earlier (either explicitly or implicitly).
• Predefined variables
Several variables are predefined. In mailfromd version 9.0 these are:
Identifies the current milter state (see milter state). The module milter.mfl defines the following symbolic names:
milter_state_none
milter_state_startup
milter_state_shutdown
milter_state_begin
milter_state_end
milter_state_connect
milter_state_helo
milter_state_envfrom
milter_state_envrcpt
milter_state_data
milter_state_header
milter_state_eoh
milter_state_body
milter_state_eom
milter_action
Use the milter_state_name function to obtain the corresponding textual string (see milter_state_name).
Identifier of the milter server which executes the code. This is the string passed to the id statement in the server section of the configuration file (see conf-server).
Address of the socket the milter server is listening to. This is defined by the listen statement in the server section of the configuration file (see conf-server).
Address family of the milter server address, as defined by the listen statement in the server section of the configuration file (see conf-server). See the FAMILY_ constants in Table 4.3.
Address of the milter client which initiated the connection.
Address family of milter_client_address. See the FAMILY_ constants in Table 4.3.
This variable is set by the stdpoll and strictpoll built-ins (and, consequently, by the on poll statement). Its value is ‘1’ if the function used the cached data instead of directly polling the host, and ‘0’ if the polling took place. See SMTP Callout functions.
You can use this variable to make your reject message more informative for the remote party. The common paradigm is to define a function returning an empty string if the result was obtained from polling, or some notice if cached data were used, and to use the function in the reject text, for example:
func cachestr() returns string
do
  if cache_used
    return "[CACHED] "
  else
    return ""
  fi
done
Then, in prog envfrom one can use:
on poll $f do
when not_found or failure:
  reject 550 5.1.0 cachestr() . "Sender validity not confirmed"
done
Name of virus identified by ClamAV. Set by the clamav function (see ClamAV).
Number of seconds left to the end of greylisting period. Set by the greylist and is_greylisted functions (see Special test functions).
Name of the domain used by polling functions in the SMTP EHLO or HELO command. Default value is the fully qualified domain name of the host where mailfromd is run. See Polling.
Callout functions (see SMTP Callout functions) set this variable before returning. It contains the initial SMTP reply from the last polled host.
Callout functions (see SMTP Callout functions) set this variable before returning. It contains the reply to the HELO (EHLO) command, received from the last polled host.
Callout functions (see SMTP Callout functions) set this variable before returning. It contains the host name or IP address of the last polled host.
Callout functions (see SMTP Callout functions) set this variable before returning. It contains the last SMTP reply received from the remote host. In case of multi-line replies, only the first line is stored. If nothing was received the variable contains the string ‘nothing’.
Callout functions (see SMTP Callout functions) set this variable before returning. It contains the last SMTP command sent to the polled host. If nothing was sent, last_poll_sent contains the string ‘nothing’.
Email address used by polling functions in the SMTP MAIL FROM command (see Polling). Default is ‘<>’. Here is an example of how to change it:
set mailfrom_address "postmaster@my.domain.com"
You can set this value to a comma-separated list of email addresses, in which case the probing will try each address until either the remote party accepts it or the list of addresses is exhausted, whichever happens first.
It is not necessary to enclose emails in angle brackets, as they will be added automatically where appropriate. The only exception is null return address, when used in a list of addresses. In this case, it should always be written as ‘<>’. For example:
set mailfrom_address "postmaster@my.domain.com, <>"
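The handling of such an address list can be pictured with a short Python sketch. It is only a guess at the normalization step, not the actual probing code: `sender_candidates` is a hypothetical name, and the sketch assumes addresses are separated by commas with optional surrounding blanks.

```python
def sender_candidates(value):
    """Split a mailfrom_address value into addresses, in the order they are tried."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item.startswith("<"):
            # Angle brackets are added automatically where they are missing.
            item = "<" + item + ">"
        result.append(item)
    return result

print(sender_candidates("postmaster@my.domain.com, <>"))
# ['<postmaster@my.domain.com>', '<>']
```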
Spam score for the message, set by the sa function (see sa).
The variable rcpt_count keeps the number of recipients given so far by RCPT TO commands. It is defined only in ‘envrcpt’ handlers.
Spam threshold, set by the sa function (see sa).
Spam keywords for the message, set by the sa function (see sa).
This variable controls the verbosity of the exception-safe database functions. See safedb_verbose.
A back reference is a sequence ‘\d’, where d
is a decimal number. It refers to the dth parenthesized
subexpression in the last matches
statement13. Any back reference occurring within a
double-quoted string is replaced with the value of the corresponding
subexpression. For example:
if $f matches '.*@\(.*\)\.gnu\.org\.ua'
  set host \1
fi
If the value of the f macro is ‘smith@unza.gnu.org.ua’, the above code will assign the string ‘unza’ to the variable host.
Notice that each occurrence of matches resets the table of back references, so use them as early as possible. The following example illustrates a common error, where a back reference is used after the reference table has been reused by another match:
# Wrong!
if $f matches '.*@\(.*\)\.gnu\.org\.ua'
if $f matches 'some.*'
set host \1
fi
fi
This will produce the following run time error:
mailfromd: RUNTIME ERROR near file.mfl:3: Invalid back-reference number
because the inner match (‘some.*’) does not have any parenthesized subexpressions.
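One way to avoid this error is to copy the back reference into a variable before the next matches is evaluated. The fragment below is a minimal sketch of that approach. It reuses the variable host from the example above and places the code in an envfrom handler; the echoed message is illustrative only.

```mfl
prog envfrom
do
  if $f matches '.*@\(.*\)\.gnu\.org\.ua'
    # Save the subexpression before the reference table is reset.
    set host \1
    if $f matches 'some.*'
      # The inner match has reset the table, but host keeps the value.
      echo "sender host is %host"
    fi
  fi
done
```

Here ‘\1’ is consumed immediately after the outer match, so the inner match can no longer invalidate it.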
See Special comparisons, for more information about the matches operator.
A milter stage handler (or handler, for short) is a subroutine responsible for processing a particular milter state. There are nine handlers available. Their order of invocation and arguments are described in Figure 3.1.
A handler is defined using the following construct:
prog handler-name
do
  handler-body
done
where handler-name is the name of the handler (see handler names), handler-body is the list of filter statements composing the handler body. Some handlers take arguments, which can be accessed within the handler-body using the notation $n, where n is the ordinal number of the argument. Here we describe the available handlers and their arguments:
This handler is called once at the beginning of each SMTP connection.
$1: string; The host name of the message sender, as reported by the MTA. Usually it is determined by a reverse lookup on the host address. If the reverse lookup fails, ‘$1’ will contain the message sender’s IP address enclosed in square brackets (e.g. ‘[127.0.0.1]’).
$2: number; Socket address family. You need to require the ‘status’ module to get symbolic definitions for the address families. Supported families are:
Constant | Value | Meaning |
---|---|---|
FAMILY_STDIO | 0 | Standard input/output (the MTA is run with -bs option) |
FAMILY_UNIX | 1 | UNIX socket |
FAMILY_INET | 2 | IPv4 protocol |
FAMILY_INET6 | 3 | IPv6 protocol |
$3: number; Port number if ‘$2’ is ‘FAMILY_INET’.
$4: string; Remote IP address if ‘$2’ is ‘FAMILY_INET’ or full file name of the socket if ‘$2’ is ‘FAMILY_UNIX’. If ‘$2’ is ‘FAMILY_STDIO’, ‘$4’ is an empty string.
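As an illustration of how these arguments fit together, the following sketch logs the origin of each connection. It assumes only what is stated above: that the ‘status’ module supplies the FAMILY_ constants and that ‘$1’, ‘$2’ and ‘$4’ carry the values described. The message texts are made up for the example.

```mfl
require status

prog connect
do
  # $2 selects how $4 is to be interpreted.
  if $2 = FAMILY_INET
    echo "IPv4 connection from " . $4 . " (" . $1 . ")"
  fi
  if $2 = FAMILY_UNIX
    echo "connection via UNIX socket " . $4
  fi
done
```

Testing ‘$2’ first avoids treating a socket file name as an IP address.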
The actions (see Actions) appearing in this handler are handled by Sendmail in a special way. First of all, any textual message is ignored. Secondly, the only action that immediately closes the connection is tempfail 421. Any other reply code results in Sendmail switching to nullserver mode, where it accepts any commands, but answers with a failure to any of them, except for the following: QUIT, HELO, NOOP, which are processed as usual.
The following table summarizes the Sendmail behavior depending on the action used:
tempfail 421 excode message
The caller is returned the following error message:
421 4.7.0 hostname closing connection
Both excode and message are ignored.
tempfail 4xx excode message
(where xx represents any digits, except ‘21’) Both excode and message are ignored. Sendmail switches to nullserver mode. Any subsequent command, excepting the ones listed above, is answered with
454 4.3.0 Please try again later
reject 5xx excode message
(where xx represents any digits). All arguments are ignored. Sendmail switches to nullserver mode. Any subsequent command, excepting ones listed above, is answered with
550 5.0.0 Command rejected
Regarding reply codes, this behavior complies with RFC 2821 (section 3.9), which states:
An SMTP server must not intentionally close the connection except:
[…]
- After detecting the need to shut down the SMTP service and returning a 421 response code. This response code can be issued after the server receives any command or, if necessary, asynchronously from command receipt (on the assumption that the client will receive it after the next command is issued).
However, the RFC says nothing about textual messages and extended error codes, therefore Sendmail’s ignoring of these is, in my opinion, absurd. My practice shows that it is often reasonable, and even necessary, to return a meaningful textual message if the initial connection is declined. The opinion of mailfromd users seems to support this view. Bearing this in mind, mailfromd is shipped with a patch for Sendmail, which makes it honor both the extended return code and the textual message given with the action. Two versions are provided: etc/sendmail-8.13.7.connect.diff, for Sendmail versions 8.13.x, and etc/sendmail-8.14.3.connect.diff, for Sendmail version 8.14.3.
This handler is called whenever the SMTP client sends a HELO or EHLO command. Depending on the actual MTA configuration, it can be called several times or even not at all.
$1: string; Argument to the HELO (EHLO) command.
According to RFC 2821, $1 must be the domain name of the sending host, or, in case this is not available, its IP address enclosed in square brackets. Be careful when making decisions based on this value, because in practice many hosts send arbitrary strings. We recommend using the heloarg_test function (see heloarg_test) if you wish to analyze this value.
Called when the SMTP client sends the MAIL FROM command, i.e. once at the beginning of each message.
$1: string; First argument to the MAIL FROM command, i.e. the email address of the sender.
$2: string; Rest of arguments to MAIL FROM, separated by space character. This argument can be ‘""’.
$1 is not the same as the $f Sendmail variable, because the latter contains the sender email after address rewriting and normalization, while $1 contains exactly the value given by the sending party.
When the array type is implemented, $2 will contain an array of arguments.
Called once for each RCPT TO command, i.e. once for each recipient, immediately after envfrom.
$1: string; First argument to the RCPT TO command, i.e. the email address of the recipient.
$2: string; Rest of arguments to RCPT TO, separated by space character. This argument can be ‘""’.
When the array type is implemented, $2 will contain an array of arguments.
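Because this handler runs once per recipient, it is a natural place to use the rcpt_count variable described earlier. The sketch below rejects further recipients once a limit is exceeded. The limit of 10 and the reply text are arbitrary values chosen for illustration, not defaults of mailfromd.

```mfl
prog envrcpt
do
  # rcpt_count holds the number of RCPT TO commands seen so far.
  if rcpt_count > 10
    reject 550 5.7.1 "Too many recipients"
  fi
done
```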
Called after the MTA receives SMTP ‘DATA’ command. Notice that this handler is not supported by Sendmail versions prior to 8.14.0 and Postfix versions prior to 2.5.
None
Called once for each header line received after the SMTP DATA command.
$1: string; Header field name.
$2: string; Header field value. The content of the header may include folded white space, i.e., multiple lines with following white space where lines are separated by LF (ASCII 10). The trailing line terminator (CR/LF) is removed.
This handler is called once per message, after all headers have been sent and processed.
None.
This handler is called zero or more times, for each piece of the message body obtained from the remote host.
$1: pointer; Piece of body text. See ‘Notes’ below.
$2: number; Length of data pointed to by $1, in bytes.
The first argument points to the body chunk. Its size may be quite considerable and passing it as a string may be costly both in terms of memory and execution time. For this reason it is not passed as a string, but rather as a generic pointer, i.e. an object having the same size as number, which can be used to retrieve the actual contents of the body chunk if the need arises.
A special function body_string is provided to convert this object to a regular MFL string (see Mail body functions). Using it you can collect the entire body text into a single global variable, as illustrated by the following example:
string text
prog body
do
  set text text . body_string($1,$2)
done
The text collected this way can then be used in the eom handler (see below) to parse and analyze it.
If you wish to analyze both the headers and mail body, the following code fragment will do that for you:
string text
# Collect all headers.
prog header
do
  set text text . $1 . ": " . $2 . "\n"
done
# Append terminating newline to the headers.
prog eoh
do
  set text "%text\n"
done
# Collect message body.
prog body
do
  set text text . body_string($1, $2)
done
This handler is called once per message, when the terminating dot after the DATA command has been received.
None
This handler is useful for calling message capturing functions, such as sa or clamav. For more information about these, refer to Interfaces to Third-Party Programs.
For your reference, the following table shows each handler with its arguments:
Handler | $1 | $2 | $3 | $4 |
---|---|---|---|---|
connect | Hostname | Socket Family | Port | Remote address |
helo | HELO domain | N/A | N/A | N/A |
envfrom | Sender email address | Rest of arguments | N/A | N/A |
envrcpt | Recipient email address | Rest of arguments | N/A | N/A |
data | N/A | N/A | N/A | N/A |
header | Header name | Header value | N/A | N/A |
eoh | N/A | N/A | N/A | N/A |
body | Body segment (pointer) | Length of the segment (numeric) | N/A | N/A |
eom | N/A | N/A | N/A | N/A |
• Multiple Handler Definitions |
Any handler may be declared multiple times. When compiling the filter program, mailfromd combines the code from all prog declarations having the same handler name into one code block and compiles it. The resulting code is guaranteed to be executed in the order in which it appears in the source files.
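The following sketch shows two declarations of the same handler in one source file. Given the ordering guarantee just described, the first echo would run before the second for every message. The message texts are illustrative only.

```mfl
# First declaration: runs first.
prog envfrom
do
  echo "envfrom: first block"
done

# Second declaration: its code is appended to the first.
prog envfrom
do
  echo "envfrom: second block"
done
```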
Apart from the milter handlers described in the previous section, MFL provides several special handlers that serve as hooks, allowing the programmer to insert code at certain important points of the control flow.
Syntactically, special handlers are similar to milter state handlers, i.e. they are defined as:
prog handler
do
  ...
done
(handler being the handler name).
Special handlers can be subdivided into three groups.
The first group are begin and end handlers. These are run at the beginning and before the end of each SMTP session and are used to provide session-specific initialization and cleanup routines.
The second group are startup and shutdown handlers, which provide global initialization and cleanup routines. These handlers are invoked exactly once: startup when mailfromd has started up, but hasn’t yet begun to serve milter requests, and shutdown when mailfromd is about to terminate.
Finally, the action handler is run before executing each reply action (see reply actions).
• begin/end | Session ‘begin’ and ‘end’ special handlers. | |
• startup/shutdown | Global startup and shutdown handlers. | |
• action hook | Action hook handler. |
These two special handlers are executed once for each session, marking its beginning and end. Neither of them takes any arguments:
# Begin handler
prog begin
do
…
done
The begin handler is run once for each SMTP session, after the connection has been established but before the first milter handler has been called.
# End handler
prog end
do
…
done
The end handler is run once for each SMTP session, after all other handlers have finished their work and mailfromd has already returned the resulting status to the MTA and closed the connection.
Multiple ‘begin’ and ‘end’ handlers are a useful feature for writing modules (see Modules), because each module can thus have its own initialization and cleanup blocks. Notice, however, that in this case the order in which subsequent ‘begin’ and ‘end’ blocks are executed is not defined. It is only guaranteed that all ‘begin’ blocks are executed at startup and all ‘end’ blocks are executed at shutdown. It is also guaranteed that all ‘begin’ and ‘end’ blocks defined within a compilation unit (i.e. a single abstract source file, with all #include and #include_once statements expanded in place) are executed in order of their appearance in the unit.
Due to their special nature, the startup and cleanup blocks impose certain restrictions on the statements that can be used within them:
- return cannot be used in ‘begin’ and ‘end’ handlers.
- The actions accept, continue, discard, reject, and tempfail cannot be used in them. They can, however, be used in catch statements declared in ‘begin’ blocks (see example below).
The ‘begin’ handlers are the usual place for global initialization code. For example, if you do not want to use DNS caching, you can disable it this way:
prog begin
do
  db_set_active("dns", 0)
done
Additionally, you can set up global exception handling routines there. For example, the following ‘begin’ statement installs a handler for all exceptions not handled otherwise that logs the exception along with the stack trace and continues processing the message:
prog begin
do
  catch *
  do
    echo "Caught exception $1: $2"
    stack_trace()
    continue
  done
done
Yet another pair of special handlers, startup and shutdown, can be used for global initialization and cleanup.
The startup handler is called exactly once, as a part of the mailfromd startup session.
This handler is normally used in mfmod interface modules to load the shared library part (see mfmod).
This handler is called during the normal program shutdown sequence, before exiting.
Both handlers bear certain similarity to begin and end: they take no arguments, and their use is subject to the same restrictions (see begin/end restrictions). Additionally, the following features cannot be used in global handlers:
The action hook handler is run implicitly before executing each reply action, such as accept, reject, etc. See reply actions, for a discussion of reply action statements.
Upon invocation, the handler is passed four arguments:
The status module defines the following symbolic names for action identifiers:
ACCEPT_ACTION
CONTINUE_ACTION
DISCARD_ACTION
REJECT_ACTION
TEMPFAIL_ACTION
To convert these to textual action names, use the milter_action_name function (see milter_action_name).
The last three arguments are meaningful only for reject and tempfail actions. For the remaining three actions (accept, discard, and continue), empty strings are passed.
The action hook handler is useful mainly for logging and accounting purposes. For example, the code fragment below assumes that the openmetrics module is used (see mfmod_openmetrics in mfmod_openmetrics reference). It increases the corresponding metrics before taking the action. Additionally, for reject and tempfail actions, the metrics ‘reject_code’ and ‘tempfail_code’ are increased, where code is the three-digit SMTP status code being sent to the server.
prog action
do
  openmetrics_incr(milter_action_name($1))
  switch $1
  do
  case REJECT_ACTION:
    openmetrics_incr("reject_" . $2)
  case TEMPFAIL_ACTION:
    openmetrics_incr("tempfail_" . $2)
  done
done
A function is a named mailfromd subroutine, which takes zero or more parameters and optionally returns a certain value. Depending on the return value, functions can be subdivided into string functions and number functions.
A function may have mandatory and optional parameters. When invoked, the function must be supplied at least as many actual arguments as it has mandatory parameters; arguments corresponding to optional parameters may be omitted.
Functions are invoked using the following syntax:
name (args)
where name is the function name and args is a comma-separated list of expressions. For example, the following are valid function calls:
foo(10)
interval("1 hour")
greylist("/var/my.db", 180)
The number of parameters a function takes and their data types compose the function signature. When actual arguments are passed to the function, they are converted to types of the corresponding formal parameters.
There are two major groups of functions: built-in functions, which are implemented in the mailfromd binary, and user-defined functions, which are written in MFL. The invocation syntax is the same for both groups.
Mailfromd is shipped with a rich set of library functions. These are described in Library. In addition to these you can define your own functions.
Function definitions can appear anywhere between the handler declarations in a filter program, the only requirement being that the function definition occur before the place where the function is invoked.
The syntax of a function definition is:
[qualifier] func name (param-decl) [returns data-type]
do
  function-body
done
where name is the name of the function to define, param-decl is a comma-separated list of parameter declarations. The syntax of the latter is the same as that of variable declarations (see Variable declarations), i.e.:
type name
declares the parameter name having the type type. The type is string or number.
Optional qualifier declares the scope of visibility for that function (see scope of visibility). It is similar to that of variables, except that functions cannot be local (i.e. you cannot declare function within another function).
The public qualifier declares a function that may be referred to from any module, whereas the static qualifier declares a function that may be called only from the current module (see Modules). The default scope is ‘public’, unless specified otherwise in the module declaration (see module structure).
For example, the following declares a function ‘sum’, that takes two numeric arguments and returns a numeric value:
func sum(number x, number y) returns number
Similarly, the following is a declaration of a static function:
static func sum(number x, number y) returns number
Parameters are referenced in the function-body by their name, the same way as other variables. Similarly, the value of a parameter can be altered using the set statement.
A function can be declared to take a certain number of optional arguments. In a function declaration, optional parameters must be placed after the mandatory ones, and must be separated from them with a semicolon. The following example is a definition of function foo, which takes two mandatory and two optional arguments:
func foo(string msg, string email; number x, string pfx)
Mandatory parameters are: msg and email. Optional parameters are: x and pfx. The actual number of arguments supplied to the function is returned by a special construct $#. In addition, the special construct @arg evaluates to the ordinal number of variable arg in the list of formal parameters (the first argument has number ‘0’). These two constructs can be used to verify whether an argument is supplied to the function.
When an actual argument for parameter n is supplied, the number of actual arguments ($#) is greater than the ordinal number of that parameter in the declaration list (@n). Thus, the following construct can be used to check if an optional argument arg is actually supplied:
func foo(string msg, string email; number x, string arg)
do
  if $# > @arg
    …
  fi
done
The default mailfromd installation provides a special macro for this purpose: see defined. Using it, the example above could be rewritten as:
func foo(string msg, string email; number x, string arg)
do
  if defined(arg)
    …
  fi
done
Within a function body, optional arguments are referenced exactly the same way as the mandatory ones. An attempt to dereference an optional argument for which no actual parameter was supplied results in an undefined value, so be sure to check whether a parameter is passed before dereferencing it.
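A complete, if contrived, illustration of this check is sketched below. The function greet and its parameters are hypothetical names introduced for the example. It relies only on the defined macro and the semicolon syntax described above.

```mfl
# greet takes one mandatory and one optional parameter.
func greet(string name; string prefix)
do
  if defined(prefix)
    # Safe: an actual argument was supplied for prefix.
    echo prefix . " " . name
  else
    echo name
  fi
done

prog envfrom
do
  greet("postmaster")            # prefix is not supplied
  greet("postmaster", "Hello,")  # prefix is supplied
done
```

The optional parameter is dereferenced only inside the branch guarded by defined, so the undefined value is never used.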
A function can also take a variable number of arguments (such functions are called variadic). This is indicated by an ellipsis in place of the last abstract parameter name. The statement below defines a function foo taking one mandatory, one optional and any number of additional arguments:
func foo (string a ; string b, string ...)
The data type before the ellipsis indicates the type to promote all actual arguments to. If it is omitted, string is assumed, so the above declaration can also be written as:
func foo (string a ; string b, ...)
To refer to the actual arguments in the function body, the following construct is used:
$(expr)
where expr is any valid MFL expression, evaluating to a number n. This construct refers to the value of the n-th actual parameter from the variable argument list. Parameters are numbered from ‘1’, so the first variable parameter is $(1), and the last one is $($# - Nm - No), where Nm and No are the numbers of mandatory and optional parameters to the function.
The construct ‘$(n)’ where 1 <= n <= 9 can also be written as ‘$n’.
For example, the function below prints all its arguments:
func pargs (string text, ...)
do
  echo "text=%text"
  loop for number i 1,
           while i < $# - @text,
           set i i + 1
  do
    echo "arg %i=" . $(i)
  done
done
Note how the ordinal number operator is used to compute the upper limit.
As another example, the function below computes the sum of its arguments.
func sum(number ...) returns number
do
  number s 0
  loop for number i 1,
           while i <= $#,
           set i i + 1
  do
    set s s + $(i)
  done
  return s
done
Sometimes it is necessary to pass all variable arguments passed to a variadic function on to another variadic function. To do so, use the $@ operator. For example:
func y(string x, number ...)
do
  echo "x is " . sum($@)
done
Suppose y is called as y("test", 1, 3, 5). Then, it will call sum as: sum(1, 3, 5).
The $@ can be used with a numeric argument, which indicates the number of arguments to remove from the resulting argument list. This is similar to the shift statement in other languages. Thus, if y in the above example were written as:
func y(string x, ...)
do
  x($@(2))
done
then the call y("test", "a", "b", "c") would invoke x as follows: x("c").
Notice the following important points. First, $@ can be used only as the last argument in the argument list. Secondly, it cannot be used to pass mandatory and optional arguments to a function. In other words, arguments passed via $@ must correspond to the ellipsis in the function declaration. Finally, passing a shift count greater than the actual number of variable arguments results in a runtime error.
The function-body is any list of valid mailfromd statements. In addition to the statements discussed below (see Statements) it can also contain the return statement, which is used to return a value from the function. The syntax of the return statement is
return value
As an example of this, consider the following code snippet that defines the function ‘sum’ to return a sum of its two arguments:
func sum(number x, number y) returns number
do
  return x + y
done
The returns part in the function declaration is optional. A declaration lacking it defines a procedure, or void function, i.e. a function that is not supposed to return any value. Such functions cannot be used in expressions; instead they are used as statements (see Statements). The following example shows a function that emits a customized temporary failure notice:
func stdtf()
do
  tempfail 451 4.3.5 "Try again later"
done
A function may have several names. An alternative name (or alias) can be assigned to a function by using the alias keyword, placed after the param-decl part, for example:
func foo() alias bar returns string
do
  …
done
After this declaration, both foo() and bar() will refer to the same function.
The number of function aliases is unlimited. The following fragment declares a function having three names:
func foo() alias bar alias baz returns string
do
  …
done
Although this feature is rarely needed, there are sometimes cases when it may be necessary.
A variable declared within a function becomes a local variable to this function. Its lexical scope ends with the terminating done statement.
Parameters, local variables and global variables use separate namespaces, so a parameter name can coincide with the name of a global, in which case the parameter is said to shadow the global. All references to its name will refer to the parameter, until the end of its scope is reached, where the global one becomes visible again. Consider the following example:
string x
func foo(string x)
do
  echo "foo: %x"
done
prog envfrom
do
  set x "Global"
  foo("Local")
  echo x
done
Running mailfromd --test with this configuration will display:
foo: Local
Global
• Some Useful Functions |
To illustrate the concept of user-defined functions, this subsection shows the definitions of some of the library functions shipped with mailfromd. These functions are contained in modules installed along with the mailfromd binary. To use any of them in your code, require the appropriate module as described in import, e.g. to use the revip function, do require 'revip'.
Functions and their definitions:
revip
The function revip, that was used in releases of mailfromd up to 9.0 (see revip), was implemented as follows:
func revip(string ip) returns string
do
  return inet_ntoa(ntohl(inet_aton(ip)))
done
Previously it was implemented using regular expressions. Below we include this variant as well, as an illustration for the use of regular expressions:
#pragma regex push +extended
func revip(string ip) returns string
do
  if ip matches '([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)'
    return "\4.\3.\2.\1"
  fi
  return ip
done
#pragma regex pop
strip_domain_part
This function returns at most n last components of the domain name domain (see strip_domain_part).
#pragma regex push +extended
func strip_domain_part(string domain, number n) returns string
do
  if n > 0 and
     domain matches '.*((\.[^.]+){' . $2 . '})'
    return substring(\1, 1, -1)
  else
    return domain
  fi
done
#pragma regex pop
valid_domain
See valid_domain, for a description of this function. Its definition follows:
require dns
func valid_domain(string domain) returns number
do
  return not (resolve(domain) = "0" and not hasmx(domain))
done
match_dnsbl
The function match_dnsbl (see match_dnsbl) is defined as follows:
require dns
require match_cidr
#pragma regex push +extended
func match_dnsbl(string address, string zone, string range) returns number
do
  string rbl_ip
  if range = 'ANY'
    set rbl_ip '127.0.0.0/8'
  else
    set rbl_ip range
    if not range matches '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
      return 0
    fi
  fi
  if not (address matches '^([0-9]{1,3}\.){3}[0-9]{1,3}$' and address != range)
    return 0
  fi
  if address matches '^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'
    if match_cidr (resolve ("\4.\3.\2.\1", zone), rbl_ip)
      return 1
    else
      return 0
    fi
  fi
  # never reached
done
Expressions are language constructs that evaluate to a value, which can subsequently be echoed, tested in a conditional statement, assigned to a variable or passed to a function.
• Constant expressions | String and Numeric Constants. | |
• Function calls | A Function Call is an Expression. | |
• Concatenation | String Concatenation. | |
• Arithmetic operations | ‘+’, ‘-’, etc. | |
• Bitwise shifts | ‘<<’ and ‘>>’. | |
• Relational expressions | ‘=’, ‘<’, etc. | |
• Special comparisons | matches, mx matches, etc. | |
• Boolean expressions | and, or, not. | |
• Precedence | How various operators nest. | |
• Type casting |
Literals and numbers are constant expressions. They evaluate to string and numeric types.
A function call is an expression. Its type is the return type of the function.
The concatenation operator is ‘.’ (a dot). For example, if $f is ‘smith’, and $client_addr is ‘10.10.1.1’, then:
$f . "-" . $client_addr ⇒ "smith-10.10.1.1"
Any two adjacent literal strings are concatenated, producing a new string, e.g.
"GNU's" " not " "UNIX" ⇒ "GNU's not UNIX"
The filter script language offers the common arithmetic operators: ‘+’, ‘-’, ‘*’ and ‘/’. In addition, the ‘%’ is a modulo operator, i.e. it computes the remainder of division of its operands.
All of them follow usual precedence rules and work as you would expect them to.
The ‘<<’ represents a bitwise shift left operation, which shifts the binary representation of the operand on its left by the number of bits given by the operand on its right.
Similarly, the ‘>>’ represents a bitwise shift right.
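As a small illustration, shifting 1 left by n bits yields the n-th power of two. The function name pow2 below is hypothetical. The sketch uses only the ‘<<’ operator and the function definition syntax described in Functions.

```mfl
# Return 2 raised to the power n, computed with a left shift.
func pow2(number n) returns number
do
  return 1 << n
done
```

For instance, pow2(4) evaluates to 16, since the binary representation of 1 is moved four positions to the left.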
Relational expressions are:
Expression | Result |
---|---|
x < y | True if x is less than y. |
x <= y | True if x is less than or equal to y. |
x > y | True if x is greater than y. |
x >= y | True if x is greater than or equal to y. |
x = y | True if x is equal to y. |
x != y | True if x is not equal to y. |
The relational expressions apply to strings as well as to numbers. When a relational operation applies to strings, case-sensitive comparison is used, e.g.:
"String" = "string" ⇒ False
"String" < "string" ⇒ True
In addition to the traditional relational operators, described above, mailfromd provides two operators for pattern matching:
Expression | Result |
---|---|
x matches y | True if the string x matches the regexp denoted by y. |
x fnmatches y | True if the string x matches the globbing pattern denoted by y. |
The type of the regular expression used by the matches operator is controlled by #pragma regex (see pragma regex). For example:
$f ⇒ "gray@gnu.org.ua"
$f matches '.*@gnu\.org\.ua' ⇒ true
$f matches '.*@GNU\.ORG\.UA' ⇒ false
#pragma regex +icase
$f matches '.*@GNU\.ORG\.UA' ⇒ true
The fnmatches operator compares its left-hand operand with a globbing pattern (see glob(7)) given as its right-hand side operand. For example:
$f ⇒ "gray@gnu.org.ua"
$f fnmatches "*ua" ⇒ true
$f fnmatches "*org" ⇒ false
$f fnmatches "*org*" ⇒ true
Both operators have a special form, for ‘MX’ pattern matching. The expression:
x mx matches y
is evaluated as follows: first, the expression x is analyzed and, if it is an email address, its domain part is selected. If it is not, its value is used verbatim. Then the list of ‘MX’s for this domain is looked up. Each of ‘MX’ names is then compared with the regular expression y. If any of the names matches, the expression returns true. Otherwise, its result is false.
Similarly, the expression:
x mx fnmatches y
returns true only if any of the ‘MX’s for (domain or email) x match the globbing pattern y.
Both mx matches and mx fnmatches can signal the following exceptions: e_temp_failure, e_failure.
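Since an MX lookup can fail, code that uses these operators may want to handle both exceptions. The fragment below is a sketch only. It assumes the try and catch statement forms covered in Exceptions, and the pattern ‘*.example.net’ and the message texts are placeholders.

```mfl
prog envfrom
do
  try
  do
    # True if any MX of the sender's domain matches the pattern.
    if $f mx fnmatches "*.example.net"
      echo "sender is served by an example.net mail exchanger"
    fi
  done
  catch e_temp_failure or e_failure
  do
    # The DNS lookup failed; log it and go on.
    echo "MX lookup failed for " . $f
  done
done
```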
The value of any parenthesized subexpression occurring within the right-hand side argument to matches or mx matches can be referenced using the notation ‘\d’, where d is the ordinal number of the subexpression (subexpressions are numbered from left to right, starting at 1). This notation is allowed in the program text as well as within double-quoted strings and here-documents, for example:
if $f matches '.*@\(.*\)\.gnu\.org\.ua'
  set message "Your host name is \1;"
fi
Remember that the grouping symbols are ‘\(’ and ‘\)’ for basic regular expressions, and ‘(’ and ‘)’ for extended regular expressions. Also make sure you properly escape all special characters (backslashes in particular) in double-quoted strings, or use single-quoted strings to avoid having to do so (see singe-vs-double, for a comparison of the two forms).
A boolean expression is a combination of relational or matching expressions using the boolean operators and, or and not, and, optionally, parentheses to control nesting:
Expression | Result |
---|---|
x and y | True only if both x and y are true. |
x or y | True if any of x or y is true. |
not x | True if x is false. |
Binary boolean expressions are computed using shortcut evaluation:
x and y
If x ⇒ false, the result is false and y is not evaluated.
x or y
If x ⇒ true, the result is true and y is not evaluated.
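Shortcut evaluation is handy for guarding an expensive or failure-prone test with a cheap one. In the sketch below the DNS lookup performed by hasmx is skipped when the argument is an empty string. The function name has_mail_exchanger is hypothetical, and the require line follows the valid_domain listing shown earlier.

```mfl
require dns

# Return 1 if domain is non-empty and has MX records, 0 otherwise.
func has_mail_exchanger(string domain) returns number
do
  # hasmx is evaluated only if the left-hand test is true.
  if domain != "" and hasmx(domain)
    return 1
  fi
  return 0
done
```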
Operator precedence is an abstract value associated with each language operator, that determines the order in which operators are executed when they appear together within a single expression. Operators with higher precedence are executed first. For example, ‘*’ has a higher precedence than ‘+’, therefore the expression a + b * c is evaluated in the following order: first b is multiplied by c, then a is added to the product.
When operators of equal precedence are used together they are evaluated from left to right (i.e., they are left-associative), except for comparison operators, which are non-associative (these are explicitly marked as such in the table below). This means that you cannot write:
if 5 <= x <= 10
Instead, you should write:
if 5 <= x and x <= 10
The precedence of the mailfromd operators was selected so as to match that used in most programming languages.15
The following table lists all operators in order of decreasing precedence:
Operator | Description |
---|---|
(...) | Grouping |
$ % | Sendmail macros and mailfromd variables |
* / | Multiplication, division |
+ - | Addition, subtraction |
<< >> | Bitwise shift left and right |
< <= >= > | Relational operators (non-associative) |
= != matches fnmatches | Equality and special comparison (non-associative) |
& | Logical (bitwise) AND |
^ | Logical (bitwise) XOR |
\| | Logical (bitwise) OR |
not | Boolean negation |
and | Logical ‘and’ |
or | Logical ‘or’ |
. | String concatenation |
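As an illustration of the table, the hypothetical function below relies on arithmetic operators binding tighter than relational ones, which in turn bind tighter than ‘not’ and ‘and’. The comments show the equivalent fully parenthesized forms. The function name and its parameters are assumptions made for this sketch:

```mfl
func in_window(number base, number x) returns number
do
  /* Same as: ((base + 1) <= x) and (x <= (base * 2)) */
  if base + 1 <= x and x <= base * 2
    return 1
  fi
  /* Same as: not (x = base), because `=' binds tighter than `not'. */
  if not x = base
    return 2
  fi
  return 0
done
```

When in doubt, explicit parentheses cost nothing and make the intended grouping obvious.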
When the two operands of a binary expression have different types, the mailfromd evaluator coerces them to a common type. This is known as implicit type casting. The rules for implicit type casting are:
The construct for explicit type cast is:
type(expr)
where type is the name of the type to coerce expr to. For example:
string(2 + 4*8) ⇒ "34"
A special case of type casting is the cast to void. It is used to ignore the return value of the function call placed between the parentheses, e.g.:
void(dlcall(libh, "extlog", "s", text))
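A short sketch of both kinds of explicit cast follows. The function ‘log_total’ is hypothetical; the ‘string’ and ‘void’ casts are used as described above:

```mfl
func log_total(number a, number b) returns number
do
  /* Explicit cast: the sum is converted to a string before
     it is concatenated with the prefix. */
  echo "total=" . string(a + b)
  return a + b
done

prog envfrom
do
  /* The return value is not needed here, so it is discarded. */
  void(log_total(2, 4*8))
done
```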
When any two named entities happen to have the same name we say that a name clash occurs. The handling of name clashes depends on types of the entities involved in it.
The name of a constant or variable can coincide with that of a function. This does not produce any warnings or errors, because functions, variables and constants use different namespaces. For example, the following code is correct:
    const a 4

    func a()
    do
      echo a
    done
When executed, it prints ‘4’.
Redefinition of a function or using a predefined handler name (see Handlers) as a function name results in a fatal error. For example, compiling this code:
    func a()
    do
      echo "1"
    done

    func a()
    do
      echo "2"
    done
causes the following error message:
mailfromd: sample.mfl:9: syntax error, unexpected FUNCTION_PROC, expecting IDENTIFIER
A variable name can coincide with a handler name. For example, the following code is perfectly OK:
    string envfrom "M"

    prog envfrom
    do
      echo envfrom
    done
If two handlers with the same name are defined, the definition that appears further in the source text replaces the previous one. A warning message is issued, indicating locations of both definitions, e.g.:
    mailfromd: sample.mfl:116: Warning: Redefinition of handler `envfrom'
    mailfromd: sample.mfl:34: Warning: This is the location of the previous definition
Defining a variable having the same name as an already defined one results in a warning message being displayed. The compilation succeeds. The second variable shadows the first, that is any subsequent references to the variable name will refer to the second variable. For example:
    string x "Text"
    number x 1

    prog envfrom
    do
      echo x
    done
Compiling this code results in the following diagnostics:
    mailfromd: sample.mfl:4: Redeclaring `x' as different data type
    mailfromd: sample.mfl:2: This is the location of the previous definition
Executing it prints ‘1’, i.e. the value of the last definition of x.
The scope of the shadowing depends on storage classes of the two variables. If both of them have external storage class (i.e. are global ones), the shadowing remains in effect until the end of input. In other words, the previous definition of the variable is effectively forgotten.
If the previous definition is a global, and the shadowing definition is an automatic variable or a function parameter, the scope of this shadowing ends with the scope of the second variable, after which the previous definition (global) becomes visible again. Consider the following code:
    set x "initial"

    func foo(string x) returns string
    do
      return x
    done

    prog envfrom
    do
      echo foo("param")
      echo x
    done
Its compilation produces the following warning:
mailfromd: sample.mfl:3: Warning: Parameter `x' is shadowing a global
When executed, it produces the following output:
    param
    initial
    State envfrom: continue
If a constant is defined which has the same name as a previously defined variable (the constant shadows the variable), the compiler prints the following diagnostic message:
    file:line: Warning: Constant name `name' clashes with a variable name
    file:line: Warning: This is the location of the previous definition
A similar diagnostic is issued if a variable is defined whose name coincides with a previously defined constant (the variable shadows the constant).
In any case, any subsequent notation %name refers to the last defined symbol, be it variable or constant.
Notice that shadowing occurs only when using the %name notation. Referring to the constant by its name without ‘%’ avoids shadowing effects.
If a variable shadows a constant, the scope of the shadowing depends on the storage class of the variable. For automatic variables and function parameters, it ends with the final done closing the function. For global variables, it lasts up to the end of input.
For example, consider the following code:
    const a 4

    func foo(string a)
    do
      echo a
    done

    prog envfrom
    do
      foo(10)
      echo a
    done
When run, it produces the following output:
    $ mailfromd --test sample.mfl
    mailfromd: sample.mfl:3: Warning: Variable name `a' clashes with a constant name
    mailfromd: sample.mfl:1: Warning: This is the location of the previous definition
    10
    4
    State envfrom: continue
Redefining a constant produces a warning message. The latter definition shadows the former. Shadowing remains in effect until the end of input.
Statements are language constructs, that, unlike expressions, do not return any value. Statements execute some actions, such as assigning a value to a variable, or serve to control the execution flow in the program.
• Actions | Actions control the handling of the mail. | |
• Assignments | ||
• Pass | ||
• Echo |
An action statement instructs mailfromd to perform a certain action over the message being processed. There are two kinds of actions: reply actions and header manipulation actions.
Reply actions tell Sendmail to return the given response code to the remote party. There are five such actions:
accept
Return an accept reply. The remote party will continue transmitting its message.
reject code excode message-expr
reject (code-expr, excode-expr, message-expr)
Return a reject reply. The remote party will have to cancel transmitting its message. The three arguments are optional; their usage is described below.
tempfail code excode message
tempfail (code-expr, excode-expr, message-expr)
Return a ‘temporary failure’ reply. The remote party can retry sending its message later. The three arguments are optional; their usage is described below.
discard
Instructs Sendmail to accept the message and silently discard it without delivering it to any recipient.
continue
Stops the current handler and instructs Sendmail to continue processing of the message.
Two actions, reject and tempfail, can take up to three optional parameters. There are two forms of supplying these parameters.
In the first form, called literal or traditional notation, the arguments are supplied as additional words after the action name, and are separated by whitespace. The first argument is a three-digit RFC 2821 reply code. It must begin with ‘5’ for reject and with ‘4’ for tempfail. If two arguments are supplied, the second argument must be either an extended reply code (RFC 1893/2034) or a textual string to be returned along with the SMTP reply. Finally, if all three arguments are supplied, then the second one must be an extended reply code and the third one must give the textual string. The following examples illustrate the possible ways of using the reject statement:
    reject
    reject 503
    reject 503 5.0.0
    reject 503 "Need HELO command"
    reject 503 5.0.0 "Need HELO command"
Used without arguments, reject is equivalent to
    reject 550
and tempfail to
    tempfail 451
In literal notation, the values of code and extended code (if supplied) must be literal strings. The third argument (textual message) can be either a literal string or an MFL expression that evaluates to a string.
The second form of supplying arguments is called functional notation, because it resembles the function syntax. When used in this form, the action word is followed by a parenthesized group of exactly three arguments, separated by commas. Each argument is an MFL expression. The meaning and ordering of the arguments is the same as in literal form. Any or all of these three arguments may be absent, in which case the corresponding default value will be used16. To illustrate this, here are the statements from the previous example, written in functional notation:
    reject(,,)
    reject(503,,)
    reject(503, 5.0.0)
    reject(503, , "Need HELO command")
    reject(503, 5.0.0, "Need HELO command")
Notice that there is an important difference between the two notations. The functional notation allows both reply codes to be computed at run time, e.g.:
reject(500 + dig2*10 + dig3, "5.%edig2.%edig2")
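The two notations can be used side by side within one handler. The following sketch is illustrative only: the conditions, the reply codes and the message texts are assumptions, and the extended code in functional notation is given as a quoted string, as in the preceding example:

```mfl
prog envfrom
do
  if $f = ""
    accept
  elif $f matches '.*@localhost'
    /* Literal notation: both codes are fixed in the program text. */
    reject 550 5.1.0 "Invalid sender domain"
  elif not primitive_hasmx(domainpart($f))
    /* Functional notation: the message is built at run time. */
    reject(550, "5.1.8", "No MX for " . domainpart($f))
  fi
done
```

If none of the branches applies, the handler ends without an explicit action and processing of the message goes on.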
Header manipulation actions provide basic means to add, delete or modify the message RFC 2822 headers.
add name string
Add the header name with the value string. E.g.:
add "X-Seen-By" "Mailfromd 9.0"
(notice argument quoting)
replace name string
The same as add, but if the header name already exists, it will be removed first, for example:
replace "X-Last-Processor" "Mailfromd 9.0"
delete name
Delete the header named name:
delete "X-Envelope-Date"
These actions impose some restrictions. First of all, their first argument must be a literal string (not a variable or expression). Secondly, there is no way to select a particular header instance to delete or replace, which may be necessary to properly handle multiple headers (e.g. ‘Received’). For more elaborate ways of header modifications, see Header modification functions.
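A minimal sketch combining the three actions in one handler. It assumes, as the restriction above suggests, that only the header name has to be a literal, so that the value may be computed at run time. The header name ‘X-Sender-Domain’ is hypothetical:

```mfl
prog envfrom
do
  /* The name is a literal string; the value is an expression
     (assumption: an expression is accepted in this position). */
  add "X-Sender-Domain" domainpart($f)
  /* Any existing X-Last-Processor header is removed first. */
  replace "X-Last-Processor" "Mailfromd 9.0"
  delete "X-Envelope-Date"
  continue
done
```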
An assignment is a special statement that assigns a value to a variable. It has the following syntax:
set name value
where name is the variable name and value is the value to be assigned to it.
Assignment statements can appear in any part of a filter program. If an assignment occurs outside of function or handler definition, the value must be a literal value (see Literals). If it occurs within a function or handler definition, value can be any valid mailfromd expression (see Expressions). In this case, the expression will be evaluated and its value will be assigned to the variable. For example:
    set delay 150

    prog envfrom
    do
      set delay delay * 2
      …
    done
pass statement
The pass statement has no effect. It is used in places where no statement is needed, but the language syntax requires one:
    on poll $f
    do
    when success:
      pass
    when not_found or failure:
      reject 550
    done
echo statement
The echo statement concatenates all its arguments into a single string and sends it to the syslog using the priority ‘info’. It is useful for debugging your script, in conjunction with built-in constants (see Built-in constants), for example:
    func foo(number x)
    do
      echo "%__file__:%__line__: foo called with arg %x"
      …
    done
Conditional expressions, or conditionals for short, test some conditions and alter the control flow depending on the result. There are two kinds of conditional statements: if-else branches and switch statements.
The syntax of an if-else branching construct is:
if condition then-body [else else-body] fi
Here, condition is an expression that governs control flow within the statement. Both then-body and else-body are lists of mailfromd statements. If condition is true, then-body is executed; if it is false, else-body is executed. The ‘else’ part of the statement is optional. The condition is considered false if it evaluates to zero, otherwise it is considered true. For example:
    if $f = ""
      accept
    else
      reject
    fi
This will accept the message if the value of the Sendmail macro $f is an empty string, and reject it otherwise. Both then-body and else-body can be compound statements including other if statements. Nesting level of conditional statements is not limited.
To facilitate writing complex conditional statements, the elif keyword can be used to introduce alternative conditions, for example:
    if $f = ""
      accept
    elif $f = "root"
      echo "Mail from root!"
    else
      reject
    fi
Another type of branching instruction is the switch statement:
    switch condition
    do
    case x1 [or x2 …]:
      stmt1
    case y1 [or y2 …]:
      stmt2
      .
      .
      .
    [default:
      stmt]
    done
Here, x1, x2, y1, y2 are literal expressions; stmt1, stmt2 and stmt are arbitrary mailfromd statements (possibly compound); condition is the controlling expression. The vertical row of dots stands for any further ‘case’ branches.
This statement is executed as follows: the condition expression is evaluated and if its value equals x1 or x2 (or any other x from the first case), then stmt1 is executed. Otherwise, if condition evaluates to y1 or y2 (or any other y from the second case), then stmt2 is executed. Other case branches are tried in turn. If none of them matches, stmt (called the default branch) is executed.
There can be as many case branches as you wish. The default branch is optional. There can be at most one default branch.
An example of switch statement follows:
    switch x
    do
    case 1 or 3:
      add "X-Branch" "1"
      accept
    case 2 or 4 or 6:
      add "X-Branch" "2"
    default:
      reject
    done
If the value of the mailfromd variable x is 1 or 3, this code will add an ‘X-Branch: 1’ header to the message and accept it immediately. If x equals 2 or 4 or 6, this code will add an ‘X-Branch: 2’ header to the message and will continue processing it. Otherwise, it will reject the message.
The controlling condition of a switch statement may evaluate to numeric or string type. The type of the condition governs the type of comparisons used in case branches: for numeric types, numeric equality will be used, whereas for string types, string equality is used.
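The example above switches on a number. A string-valued condition works the same way, with string equality used for each ‘case’. The sketch below is illustrative; the domain names are placeholders:

```mfl
prog envfrom
do
  /* domainpart returns a string, so the case labels are
     compared using string equality. */
  switch domainpart($f)
  do
  case "example.org" or "example.net":
    accept
  case "spam.example":
    reject 550 5.7.1 "Mail from this domain is not accepted"
  default:
    continue
  done
done
```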
The loop statement allows for repeated execution of a block of code, controlled by some conditional expression. It has the following form:
    loop [label]
         [for stmt1] [,while expr1] [,stmt2]
    do
      stmt3
    done [while expr2]
where stmt1, stmt2, and stmt3 are statement lists, expr1 and expr2 are expressions.
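The control flow is as follows:
1. If stmt1 is specified, execute it.
2. If expr1 is specified and evaluates to zero, go to 6. Otherwise, continue.
3. Execute stmt3.
4. If stmt2 is specified, execute it.
5. If expr2 is specified and evaluates to zero, go to 6. Otherwise, go to 2.
6. End.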
Thus, stmt3 is executed until either expr1 or expr2 yields a zero value.
The loop body – stmt3 – can contain special statements:
break [label]
Terminates the loop immediately. Control passes to ‘6’ (End) in the formal definition above. If label is supplied, the statement terminates the loop statement marked with that label. This makes it possible to break out of nested loops.
It is similar to the break statement in C or shell.
next [label]
Initiates the next iteration of the loop. Control passes to ‘4’ in the formal definition above. If label is supplied, the statement starts the next iteration of the loop statement marked with that label. This makes it possible to request the next iteration of an upper-level loop from a nested loop statement.
The loop statement can be used to create iterative statements of arbitrary complexity. Let’s illustrate it in comparison with C. The statement:
    loop
    do
      stmt-list
    done
creates an infinite loop. The only way to exit from such a loop is to call break (or return, if used within a function), somewhere in stmt-list.
The following statement is equivalent to while (expr) stmt-list in C:
    loop while expr
    do
      stmt-list
    done
The C construct for (expr1; expr2; expr3) is written in MFL as follows:
    loop for stmt1, while expr2, stmt2
    do
      stmt3
    done
Here stmt1 and stmt2 play the roles of expr1 and expr3, and stmt3 is the loop body.
For example, to repeat stmt3 10 times:
    loop for set i 0, while i < 10, set i i + 1
    do
      stmt3
    done
Finally, the C ‘do’ loop is implemented as follows:
    loop
    do
      stmt-list
    done while expr
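Labels become useful as soon as loops are nested. The following sketch shows ‘break’ and ‘next’ acting on an outer loop from inside an inner one. The function, its parameters and the label name ‘outer’ are hypothetical and chosen for illustration only:

```mfl
func has_factor_pair(number n, number limit) returns number
do
  number found 0
  loop outer for number i 2, while i < limit, set i i + 1
  do
    loop for number j 2, while j < limit, set j j + 1
    do
      if i * j = n
        set found 1
        /* Leave both loops at once. */
        break outer
      fi
      if i * j > n
        /* Larger j cannot help: go on with the next i. */
        next outer
      fi
    done
  done
  return found
done
```

Without the label, ‘break’ and ‘next’ would affect only the inner loop.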
As a real-life example of a loop statement, let’s consider the implementation of function ptr_validate, which takes a single argument ipstr, and checks its validity using the following algorithm:
Perform a DNS reverse-mapping for ipstr, looking up the corresponding PTR record in ‘in-addr.arpa’. For each record returned, look up its IP addresses (A records). If ipstr is among the returned IP addresses, return 1 (true), otherwise return 0 (false).
The implementation of this function in MFL is:
    #pragma regex push +extended

    func ptr_validate(string ipstr) returns number
    do
      loop for string names dns_getname(ipstr) . " "
               number i index(names, " "),
           while i != -1,
           set names substr(names, i + 1)
           set i index(names, " ")
      do
        loop for string addrs dns_getaddr(substr(names, 0, i)) . " "
                 number j index(addrs, " "),
             while j != -1,
             set addrs substr(addrs, j + 1)
             set j index(addrs, " ")
        do
          if ipstr == substr(addrs, 0, j)
            return 1
          fi
        done
      done
      return 0
    done
When the running program encounters a condition it is not able to handle, it signals an exception. To illustrate the concept, let’s consider the execution of the following code fragment:
    if primitive_hasmx(domainpart($f))
      accept
    fi
The function primitive_hasmx (see primitive_hasmx) tests whether the domain name given as its argument has any ‘MX’ records. It should return a boolean value. However, when querying the Domain Name System, it may fail to get a definite result. For example, the DNS server can be down or temporarily unavailable. In other words, primitive_hasmx can be in a situation when, instead of returning ‘yes’ or ‘no’, it has to return ‘don't know’. It has no way of doing so, therefore it signals an exception.
Each exception is identified by exception type, an integer number associated with it.
• Built-in Exceptions | ||
• User-defined Exceptions | ||
• Catch and Throw |
The first 22 exception numbers are reserved for built-in exceptions. These are declared in module status.mfl. The following table summarizes all built-in exception types implemented by mailfromd version 9.0. Exceptions are listed in lexicographic order.
The called function cannot finish its task because an incompatible message modification function was called at some point before it. For details, see MMQ and dkim_sign.
General database failure. For example, the database cannot be opened. This exception can be signaled by any function that queries any DBM database.
Division by zero.
This exception is emitted by dbinsert built-in if the requested key is already present in the database (see dbinsert).
Function reached end of file while reading. See I/O functions, for a description of functions that can signal this exception.
A general failure has occurred. In particular, this exception is signaled by DNS lookup functions when any permanent failure occurs. This exception can be signaled by any DNS-related function (hasmx, poll, etc.) or operation (mx matches).
Invalid input format. This exception is signaled if input data to a function are improperly formatted. In version 9.0 it is signaled by message_burst function if its input message is not formatted according to RFC 934. See Message digest functions.
Illegal byte sequence. Signaled when a string cannot be converted between character sets because a sequence of bytes was encountered that is not defined for the source character set or cannot be represented in the destination character set. See MIME decoding, for details.
Arguments supplied to a function are invalid.
Invalid CIDR notation. This is signaled by match_cidr function when its second argument is not a valid CIDR.
Invalid IP address. This is signaled by match_cidr function when its first argument is not a valid IP address.
Invalid time interval specification. It is signaled by interval function if its argument is not a valid time interval (see time interval specification).
An error occurred during the input-output operation. See I/O functions, for a description of functions that can signal this exception.
A Sendmail macro is undefined.
Required entity is not found. It is raised, for example, by message_find_header, when the requested header is not present in the message and by DNS resolver functions when unable to resolve host name or IP address.
The supplied argument is outside the allowed range. This is signalled, for example, by substring function (see substring).
Regular expression cannot be compiled. This can happen when a regular expression (a right-hand argument of a matches operator) is built at the runtime and the produced string is an invalid regex.
String-to-number conversion failed. This can be signaled when a string is used in numeric context which cannot be converted to the numeric data type. For example:
    set x "10a"
    set y x / 2
In this code fragment, line 2 will raise the e_ston_conv exception, since ‘10a’ cannot be converted to a number.
This is not an exception in the strict sense of the word, but a constant indicating success.
A temporary failure has occurred. This can be signaled by DNS-related functions or operations.
Raised by various DNS functions when they encounter a long chain of CNAME records when trying to resolve a hostname. See CNAME chains.
The supplied URL is invalid. See Interfaces to Third-Party Programs.
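Any of the exceptions listed above can be intercepted by name. As an illustration, the hypothetical function below guards an implicit string-to-number conversion against e_ston_conv. The ‘require status’ line is included on the assumption that the exception names have to be made available that way, since they are declared in module status.mfl:

```mfl
require status

func to_number(string s) returns number
do
  try
  do
    /* A string used in numeric context is converted implicitly;
       this is where e_ston_conv may be signaled. */
    return s + 0
  done
  catch e_ston_conv
  do
    /* $2 holds the textual description of the exception. */
    echo "Cannot convert to number: $2"
    return 0
  done
done
```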
You can define your own exception types using the dclex statement:
dclex type
In this statement, type must be a valid MFL identifier, not used for another constant (see Constants). The dclex statement defines a new exception identified by the constant type and allocates a new exception number for it.
The type can subsequently be used in throw and catch statements, for example:
    dclex myrange

    func fact(number val) returns number
    do
      if val < 0
        throw myrange "fact argument is out of range"
      fi
      …
    done
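A complete, self-contained variant of this example might look as follows. It is a sketch rather than the manual's own code: the body of ‘fact’ and the wrapper ‘safe_fact’ are assumptions added to show the user-defined exception being both thrown and caught:

```mfl
dclex myrange

func fact(number val) returns number
do
  number result 1
  if val < 0
    throw myrange "fact argument is out of range"
  fi
  loop for number i 2, while i <= val, set i i + 1
  do
    set result result * i
  done
  return result
done

func safe_fact(number val) returns number
do
  try
  do
    return fact(val)
  done
  catch myrange
  do
    /* $2 holds the description passed to throw. */
    echo "fact failed: $2"
    return -1
  done
done
```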
Normally when an exception is signalled, the program execution is terminated and a tempfail status is returned to the MTA. Additional information regarding the exception is then output to the logging channel (see Logging and Debugging). However, the user can intercept any exception by installing their own exception-handling routines.
An exception-handling routine is introduced by a try–catch statement, which has the following syntax:
    try
    do
      stmtlist
    done
    catch exception-list
    do
      handler-body
    done
where stmtlist and handler-body are sequences of MFL statements and exception-list is the list of exception types, separated by the word or. A special exception-list ‘*’ is allowed and means all exceptions.
This construct works as follows. First, the statements from stmtlist are executed. If the execution finishes successfully, control is passed to the first statement after the ‘catch’ block. Otherwise, if an exception is signalled and this exception is listed in exception-list, the execution is passed to the handler-body. If the exception is not listed in exception-list, it is handled as usual.
The following example shows a ‘try–catch’ construct used for handling possible exceptions, signalled by primitive_hasmx.
    try
    do
      if primitive_hasmx(domainpart($f))
        accept
      else
        reject
      fi
    done
    catch e_failure or e_temp_failure
    do
      echo "primitive_hasmx failed"
      continue
    done
The ‘try–catch’ statement can appear anywhere inside a function or a handler, but it cannot appear outside of them. It can also be nested within another ‘try–catch’, in either of its parts. Upon exit from a function or milter handler, all exception handlers are restored to the state they had when it was entered.
A catch block can also be used alone, without preceding try part. Such a construct is called a standalone catch. It is mostly useful for setting global exception handlers in a begin statement (see begin/end). When used within a usual function or handler, the exception handlers set by a standalone catch remain in force until either another standalone catch appears further in the same function or handler, or an end of the function is encountered, whichever occurs first.
A standalone catch defined within a function must return from it by executing return statement. If it does not do that explicitly, the default value of 1 is returned. A standalone catch defined within a milter handler must end execution with any of the following actions: accept, continue, discard, reject, tempfail. By default, continue is used.
It is not recommended to mix ‘try–catch’ constructs and standalone catches. If a standalone catch appears within a ‘try–catch’ statement, its scope of visibility is undefined.
Upon entry to a handler-body, two implicit positional arguments are defined, which can be referenced in handler-body as $1 and $2 17. The first argument gives the numeric code of the exception that has occurred. The second argument is a textual string containing a human-readable description of the exception.
The following is an improved version of the previous example, which uses these parameters to supply more information about the failure:
    try
    do
      if primitive_hasmx(domainpart($f))
        accept
      else
        reject
      fi
    done
    catch e_failure or e_temp_failure
    do
      echo "Caught exception $1: $2"
      continue
    done
The following example defines the function hasmx that returns true if the domain part of its argument has any ‘MX’ records, and false if it does not or if an exception occurs 18.
    func hasmx (string s) returns number
    do
      try
      do
        return primitive_hasmx(domainpart(s))
      done
      catch *
      do
        return 0
      done
    done
The same function can be written using standalone catch:
    func hasmx (string s) returns number
    do
      catch *
      do
        return 0
      done
      return primitive_hasmx(domainpart(s))
    done
All variables remain visible within catch body, with the exception of positional arguments of the enclosing handler. To access positional arguments of a handler from the catch body, assign them to local variables prior to the ‘try–catch’ construct, e.g.:
    prog header
    do
      string hname $1
      string hvalue $2
      try
      do
        …
      done
      catch *
      do
        echo "Exception $1 while processing header %hname: %hvalue"
        echo $2
        tempfail
      done
    done
You can also generate (or raise) exceptions explicitly in the code, using throw statement:
throw excode descr
The arguments correspond exactly to the positional parameters of the catch statement: excode gives the numeric code of the exception, descr gives its textual description. This statement can be used in complex scripts to create non-local exits from deeply nested statements.
Notice that the excode argument must be an immediate value: an exception identifier (either a built-in one or one declared previously using a dclex statement).
The filter script language provides a wide variety of functions for sender address verification or polling, for short. These functions, which were described in SMTP Callout functions, can be used to implement any sender verification method. The additional data that can be needed is normally supplied by two global variables: ehlo_domain, keeping the default domain for the EHLO command, and mailfrom_address, which stores the sender address for probe messages (see Predefined variables).
For example, the simplest way to implement standard polling would be:
    prog envfrom
    do
      if stdpoll($1, ehlo_domain, mailfrom_address) == 0
        accept
      else
        reject 550 5.1.0 "Sender validity not confirmed"
      fi
    done
However, this does not take into account exceptions that stdpoll can signal. To handle them, one will have to use catch, for example thus:
    require status

    prog envfrom
    do
      try
      do
        if stdpoll($1, ehlo_domain, mailfrom_address) == 0
          accept
        else
          reject 550 5.1.0 "Sender validity not confirmed"
        fi
      done
      catch e_failure or e_temp_failure
      do
        switch $1
        do
        case failure:
          reject 550 5.1.0 "Sender validity not confirmed"
        case temp_failure:
          tempfail 450 4.1.0 "Try again later"
        done
      done
    done
If polls are used often, one can define a wrapper function, and use it instead. The following example illustrates this approach:
    func poll_wrapper(string email) returns number
    do
      catch e_failure or e_temp_failure
      do
        /* Return the numeric code of the caught exception. */
        return $1
      done
      return stdpoll(email, ehlo_domain, mailfrom_address)
    done

    prog envfrom
    do
      switch poll_wrapper($f)
      do
      case success:
        accept
      case not_found or failure:
        reject 550 5.1.0 "Sender validity not confirmed"
      case temp_failure:
        tempfail 450 4.1.0 "Try again later"
      done
    done
Notice the way envfrom handles success and not_found, which are not exceptions in the strict sense of the word.
The above paradigm is so common that mailfromd provides a special language construct to simplify it: the on statement. Instead of manually writing the wrapper function and using it as a switch condition, you can rewrite the above example as:
    prog envfrom
    do
      on stdpoll($1, ehlo_domain, mailfrom_address)
      do
      when success:
        accept
      when not_found or failure:
        reject 550 5.1.0 "Sender validity not confirmed"
      when temp_failure:
        tempfail 450 4.1.0 "Try again later"
      done
    done
As you see the statement is pretty similar to switch. The major syntactic difference is the use of the keyword when to introduce conditional branches.
General syntax of the on statement is:
    on condition
    do
    when x1 [or x2 …]:
      stmt1
    when y1 [or y2 …]:
      stmt2
      .
      .
      .
    done
The condition is either a function call or a special poll statement (see below). The values used in when branches are normally symbolic exception names (see exception names).
When the compiler processes the on statement it does the following:
1. Builds a unique wrapper function, similar to the one described above, which catches the exception types listed in the when branches. To avoid name clashes with the user-defined functions, the wrapper name begins and ends with ‘$’, which normally is not allowed in identifiers;
2. Translates the on body to the corresponding switch statement.
A special form of the condition is poll keyword, whose syntax is:
poll [for] email [host host] [from domain] [as email]
The order of particular keywords in the poll statement is arbitrary, for example as email can appear before email as well as after it.
The simplest form, poll email, performs the standard sender verification of email address email. It is translated to the following function call:
stdpoll(email, ehlo_domain, mailfrom_address)
The construct poll email host host runs the strict sender verification of address email on the given host. It is translated to the following call:
strictpoll(host, email, ehlo_domain, mailfrom_address)
Other keywords of the poll statement modify these two basic forms. The as keyword introduces the email address to be used in the SMTP MAIL FROM command, instead of mailfrom_address. The from keyword sets the domain name to be used in the EHLO command. So, for example, the following construct:
poll email host host from domain as addr
is translated to
strictpoll(host, email, domain, addr)
To summarize the above, the code described in Figure 4.2 can be written as:
    prog envfrom
    do
      on poll $f
      do
      when success:
        accept
      when not_found or failure:
        reject 550 5.1.0 "Sender validity not confirmed"
      when temp_failure:
        tempfail 450 4.1.0 "Try again later"
      done
    done
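The optional keywords of poll can be combined in the same handler. The sketch below uses the full form described above, which is translated to a strictpoll call. The host name, EHLO domain and probe address are placeholders invented for this illustration:

```mfl
prog envfrom
do
  /* Strict verification on one host, with an explicit EHLO
     domain and MAIL FROM address instead of the defaults. */
  on poll $f host "mx.example.org"
             from "mail.example.net"
             as "postmaster@example.net"
  do
  when success:
    accept
  when not_found or failure:
    reject 550 5.1.0 "Sender validity not confirmed"
  when temp_failure:
    tempfail 450 4.1.0 "Try again later"
  done
done
```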
A module is a logically isolated part of code that implements a separate concern or feature and contains a collection of conceptually united functions and/or data. Each module occupies a separate compilation unit (i.e. file). The functionality provided by a module is incorporated into another module or the main program by requiring this module or by importing the desired components from it.
• module structure | Declaring Modules | |
• scope of visibility | ||
• import | Require and Import |
A module file must begin with a module declaration:
module modname [interface-type].
Note the final dot.
The modname parameter declares the name of the module. It is recommended that it be the same as the file name without the ‘.mfl’ extension. The module name must be a valid MFL literal. It also must not coincide with any defined MFL symbol; therefore we recommend always quoting it (see the example below).
The optional parameter interface-type defines the default scope of visibility for the symbols declared in this module. If it is ‘public’, then all symbols declared in this module are made public (importable) by default, unless explicitly declared otherwise (see scope of visibility). If it is ‘static’, then all symbols, not explicitly marked as public, become static. If the interface-type is not given, ‘public’ is assumed.
The actual MFL code follows the ‘module’ line.
The module definition is terminated by the logical end of its compilation unit, i.e. either by the end of file or by the keyword bye, whichever occurs first.
The special keyword bye may be used to end the current compilation unit prematurely, before the physical end of the containing file. Any material between bye and the end of file is ignored by the compiler.
Let’s illustrate these concepts by writing a module ‘revip’:
module 'revip' public.

func revip(string ip) returns string
do
  return inet_ntoa(ntohl(inet_aton(ip)))
done

bye

This text is ignored. You may put any additional
documentation here.
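For readers more at home in C, the effect of such a function can be sketched as follows. The sketch assumes that the purpose of revip, as its name suggests, is to reverse the order of the octets in a dotted-quad IPv4 address; the name revip_c and its buffer-based interface are hypothetical and not part of mailfromd.

```c
#include <stdio.h>

/*
 * Hypothetical C counterpart of revip: write the octets of the
 * dotted-quad address `ip' to `buf' in reverse order.
 * Returns 0 on success, -1 if `ip' is malformed or `buf' is too small.
 */
int revip_c(const char *ip, char *buf, size_t size)
{
    unsigned a, b, c, d;
    char tail;
    int n;

    /* Require exactly four decimal fields and nothing after them. */
    if (sscanf(ip, "%u.%u.%u.%u%c", &a, &b, &c, &d, &tail) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;

    n = snprintf(buf, size, "%u.%u.%u.%u", d, c, b, a);
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Unlike the MFL version, this sketch works on the textual form directly and therefore does not depend on the byte order of the host.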
Scope of Visibility of a symbol defines from where this symbol may be referred to. Symbols in MFL may have either of the following two scopes:
Public symbols are visible from the current module, as well as from any external modules, including the main script file, provided that they are properly imported (see import).
Static symbols are visible only from the current module. There is no way to refer to them from outside.
The default scope of visibility for all symbols declared within a module is defined in the module declaration (see module structure). It may be overridden for any individual symbol by prefixing its declaration with an appropriate qualifier: either public or static.
Functions or variables declared in another module must be imported prior to their actual use. MFL provides two ways of doing so: by requiring the entire module or by importing selected symbols from it.
Modules are looked up in the module search path. The default module search path consists of two directories:
where prefix stands for the installation prefix (normally /usr or /usr/local).
The module search path can be changed in the configuration file, using the module-path statement (see module-path), or from the command line, using the -P (--module-path) option (see --module-path).
The require statement instructs the compiler to locate the module modname and to load all public interfaces from it.
The compiler looks for the file modname.mfl in the module search path. If no such file is found, a compilation error is reported.
For example, the following statement:
require revip
imports all interfaces from the module revip.mfl.
Another, more sophisticated way to import from a module is to use the ‘from ... import’ construct:
from module import symbols.
Note the final dot. The ‘from’ and ‘module’ statements are the only two constructs in MFL that require the delimiter.
The module argument has the same semantics as in the require construct. The symbols argument is a comma-separated list of symbol names to import from module. A symbol name may be given in several forms:
Literals specify exact symbol names to import. For example, the following statement imports from module A.mfl symbols ‘foo’ and ‘bar’:
from A import foo,bar.
Regular expressions must be surrounded by slashes. A regular expression instructs the compiler to import all symbols whose names match that expression. For example, the following statement imports from A.mfl all symbols whose names begin with ‘foo’ and contain at least one digit after it:
from A import '/^foo.*[0-9]/'.
The type of regular expressions used in the ‘from’ statement is controlled by #pragma regex (see regex).
A regular expression may be followed by an s-expression, i.e. a sed-like expression of the form:
s/regexp/replace/[flags]
where regexp is a regular expression and replace is a replacement for each part of the input that matches regexp. S-expressions and their parts are discussed in detail in s-expression.
The effect of such construct is to import all symbols that match the regular expression and apply the s-expression to their names.
For example:
from A import '/^foo.*[0-9]/s/.*/my_&/'.
This statement imports all symbols whose names begin with ‘foo’ and contain at least one digit after it, and renames them, by prefixing their names with the string ‘my_’. Thus, if A.mfl declared a function ‘foo_1’, it becomes visible under the name of ‘my_foo_1’.
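The selection and renaming performed by this statement can be imitated in C with the POSIX regex API. The sketch below is only an illustration: it assumes POSIX extended syntax (the actual flavor is set by #pragma regex), hard-codes the expression from the example, and uses a hypothetical function name, import_name. It does not reflect how the MFL compiler implements imports.

```c
#include <regex.h>
#include <stdio.h>

// Hypothetical illustration of the import statement shown above.
// If `name' matches the selection regex ^foo.*[0-9], store the renamed
// symbol in `buf' and return 1.  Return 0 if the symbol would not be
// imported, and -1 on error.
int import_name(const char *name, char *buf, size_t size)
{
    regex_t re;
    int rc;

    if (regcomp(&re, "^foo.*[0-9]", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);

    if (rc == REG_NOMATCH)
        return 0;
    if (rc != 0)
        return -1;

    // In the s-expression, `.*' matches the whole name and `&' stands
    // for the matched text, so the result is "my_" plus the name.
    if (snprintf(buf, size, "my_%s", name) >= (int) size)
        return -1;
    return 1;
}
```

With this sketch, ‘foo_1’ is accepted and renamed to ‘my_foo_1’, while ‘foobar’ is skipped because no digit follows the ‘foo’ prefix.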
Native mailfromd modules described above rely on the functions provided by the mailfromd binary. For more sophisticated tasks you might need to use C functions, either for efficiency reasons or to make use of some third-party library. This is possible using a special kind of module called mfmod.
An mfmod consists of two major parts: a dynamically loaded library that provides its main functionality and a small interface mailfromd module. The convention is that for the module x the library is named mfmod_x.so[19], and the interface module file is x.mfl.
At the time of this writing, three mfmods exist:
Provides support for Perl-compatible regular expressions. It also contains a special function for scanning an email message for a match to a regular expression. See Mfmod_pcre in Mfmod_pcre.
Functions for searching in an LDAP directory. See Mfmod_ldap in Mfmod_ldap.
Openmetrics support for mailfromd. See mfmod_openmetrics in mfmod_openmetrics reference.
The subsections below describe the internal structure of an mfmod in detail.
• Loadable Library | ||
• Interface Module | ||
• mfmodnew | Creating a Mfmod Structure |
External functions in the loadable library must be declared as
int funcname(long count, MFMOD_PARAM *param, MFMOD_PARAM *retval);
The MFMOD_PARAM type is declared in the header file mailfromd/mfmod.h, which must be included at the start of the source code. This type is defined as follows:
typedef struct mfmod_param {
    mfmod_data_type type;
    union {
        char *string;
        long number;
        mu_message_t message;
    };
} MFMOD_PARAM;
The type field defines the type of the data represented by the object. Its possible values are:
String data.
Numeric data.
A mailutils message object (mu_message_t).
The actual data are accessed as string, number, or message, depending on the value of type.
The first parameter in the external function declaration, count, is the number of arguments passed to that function. Actual arguments are passed in the MFMOD_PARAM array param. The function should never modify its elements. If the function returns a value to MFL, it must pass it in the retval parameter. For example, the following code returns the numeric value ‘1’:
retval->type = mfmod_number;
retval->number = 1;
To return a string value, allocate it using malloc, calloc or a similar function, like this:
retval->type = mfmod_string;
retval->string = strdup("text");
If a message is returned, it should be created using mailutils message creation primitives. Mailutils will call mu_message_destroy on it when it is no longer used.
The return value (in the C sense) of the function is used to determine whether it succeeded or not. Zero means success. Returning -1 causes a runtime exception e_failure with a generic error text indicating the names of the module and function that caused the exception. Any other non-zero value is treated as a mailfromd exception code (see Exceptions). In this case an additional textual explanation of the error can be supplied in the retval variable, whose type must then be set to mfmod_string. This explanation string must be allocated using malloc.
To facilitate error handling, the following functions are provided (declared in the mailfromd/mfmod.h header file):
Raises exception ecode with the error message formatted from the variadic arguments using the printf-style format string fmt.
Example use:
if (error_condition)
    return mfmod_error(retval, "error %s occurred", error_text);
Reports an argument type mismatch error (e_inval with appropriately formatted error text). Arguments are:
The two arguments passed to the interface function.
0-based index of the erroneous argument in param.
Expected data type of param[n].
You will seldom need to use this function directly. Instead, use the ASSERT_ARGTYPE macro described below.
Returns the MFL name of the mfmod data type type.
The following convenience macros are provided for checking the number of arguments and their types and returning an error if necessary:
Assert that the number of arguments (count) equals the expected number (expcount). If it does not, return the e_inval exception with a descriptive error text. retval and count are the corresponding arguments from the calling function.
Check if the data type of the nth parameter (i.e. param[n]) is exptype and return the e_inval exception if it is not.
As an example, suppose you want to write an interface to the system crypt function. The loadable library source, mfmod_crypt.c, will look as follows:
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <mailfromd/mfmod.h>
#include <mailfromd/exceptions.h>

/*
 * Arguments:
 *   param[0] - key string to hash.
 *   param[1] - salt value.
 */
int
cryptval(long count, MFMOD_PARAM *param, MFMOD_PARAM *retval)
{
     char *hash;

     /* Check if input arguments are correct: */
     ASSERT_ARGCOUNT(retval, count, 2);
     ASSERT_ARGTYPE(param, retval, 0, mfmod_string);
     ASSERT_ARGTYPE(param, retval, 1, mfmod_string);

     /* Hash the key string. */
     hash = crypt(param[0].string, param[1].string);

     /* crypt returns NULL on failure, e.g. for an invalid salt */
     if (hash == NULL)
          return -1;

     /* Return string to MFL */
     retval->type = mfmod_string;
     retval->string = strdup(hash);

     /* Throw exception if out of memory */
     if (retval->string == NULL)
          return -1;

     return 0;
}
The exact way of building a loadable library from this source file depends on the operating system. For example, on GNU/Linux you would do:
cc -shared -fPIC -DPIC -o mfmod_crypt.so mfmod_crypt.c -lcrypt
The preferred and portable way of doing so is via libtool (see Shared library support for GNU in Libtool). Mailfromd provides a special command, mfmodnew, that creates the infrastructure necessary for building loadable modules. See mfmodnew.
The interface module is responsible for loading the library, and providing MFL wrappers over external functions defined in it.
For the first task, the dlopen function is provided. It takes a single argument, the file name of the library to load. This can be an absolute pathname, in which case it is used as is, or a relative file name, which will be searched in the library search path (see mfmod-path). On success, the function returns the library handle, which will be used in subsequent calls to identify that library. On error, a runtime exception is signalled.
It is common to call the dlopen function in the startup section of the interface module (see startup/shutdown), so that the library gets loaded at the program startup. For example:
static number libh

prog startup
do
  set libh dlopen("mfmod_crypt.so")
done
The function dlcall is provided to call a function from the already loaded library. It is a variadic function with three mandatory parameters:
dlopen
.
The type string argument declares the data types of the variable arguments. It contains a single letter for each additional argument passed to dlcall. The valid letters are:
The argument is of string type.
The argument is of numeric type.
The argument is of message type.
For example, the following will call the cryptval function defined in the previous section (supposing key and salt are two string MFL variables):
set x dlcall(libh, "cryptval", "ss", key, salt)
The last letter in type string can be ‘+’ or ‘*’. Both mean that any number of arguments are allowed (all of the type given by the penultimate type letter). The difference between the two is that ‘+’ allows for one or more arguments, while ‘*’ allows for zero or more arguments. For example, ‘n+’ means one or more numeric arguments, and ‘n*’ means zero or more such arguments. Both are intended to be used in variadic functions, e.g.:
func pringstddev(number ...) returns number
do
  return dlcall(libh, "stddev", "n*", $@)
done
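The counting rules implied by ‘+’ and ‘*’ can be made concrete with a short C sketch. It is not taken from mailfromd; the function name typestr_accepts is hypothetical, and the sketch checks only the number of arguments, not their types.

```c
#include <string.h>

/*
 * Hypothetical checker: does the dlcall type string `types' accept
 * `nargs' additional arguments?  Returns 1 if it does, 0 otherwise.
 */
int typestr_accepts(const char *types, size_t nargs)
{
    size_t len = strlen(types);

    if (len >= 2 && (types[len - 1] == '+' || types[len - 1] == '*')) {
        /* Letters before the repeated one are mandatory. */
        size_t fixed = len - 2;
        /* `+' demands at least one argument of the repeated type. */
        size_t min = fixed + (types[len - 1] == '+' ? 1 : 0);
        return nargs >= min;
    }

    /* No repetition mark: exactly one argument per letter. */
    return nargs == len;
}
```

For instance, under these rules "ss" accepts exactly two arguments, "n+" rejects an empty argument list, and "n*" accepts it.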
The dlcall function returns the value returned by the library function it invoked. If the library function returns no meaningful value, it is recommended to use the void type cast around the dlcall invocation (see void type cast). E.g.:
func out(string text)
do
  void(dlcall(libh, "output", "s", text))
done
Without the void type cast, the definition above will produce the following warning when compiled:
return from dlcall is ignored
The mfmodnew command provides a convenient start for writing a new mfmod. Given the name of the planned module, this command creates the directory mfmod_name and populates it with the files necessary for building the new module using GNU autotools, as well as boilerplate files for the loadable library and interface module.
Let’s see how to use it to create the crypt module outlined in the previous subsections.
First, invoke the command:
$ mfmodnew crypt
mfmodnew: setting up new module in mfmod_crypt
Let’s change to the new directory and see the files in it:
$ cd mfmod_crypt
$ ls
Makefile.am  configure.ac  crypt.mfl  mfmod_crypt.c
Now, open the mfmod_crypt.c file and add to it the definition of the cryptval function (see Loadable Library). Then, add the interface function definition from Interface Module to the file crypt.mfl.
The last thing to do is to edit configure.ac. The crypt function requires the libcrypt library, so the following line should be added to the ‘Checks for libraries.’ section:
AC_CHECK_LIB([crypt], [crypt])
Now, run autoreconf, as follows:
$ autoreconf -f -i -s
It will bootstrap the autotools infrastructure, importing additional files as necessary. Once done, you can build the project:
$ ./configure
$ make
Notice that if the autoreconf stage ends abnormally with a diagnostic like:
configure.ac:21: error: possibly undefined macro: AC_MFMOD
that means that autoconf was unable to find the file mfmod.m4, which provides that macro. That’s because the directory where this file is installed is not searched by autoreconf. To fix this, supply the name of that directory using the -I option. E.g. assuming mfmod.m4 is installed in /usr/local/share/aclocal:
$ autoreconf -fis -I /usr/local/share/aclocal
• mfmodnew invocation |
The mfmodnew utility is invoked as:
mfmodnew [options] modname [dir]
where modname is the name of the new module and dir is the directory where to store the module infrastructure files. Normally you would omit dir altogether: in this case the utility will use mfmod_modname as the directory name.
Options are:
Search for template files in dir, instead of the default location.
Supply the author’s email. The email is passed as argument to the AC_INIT macro in configure.ac. By default it is constructed as ‘username@hostname’. If it is incorrect, you can either edit configure.ac afterwards, or just supply the correct one using this option.
Suppress informative messages.
Display a short command line usage help.
Before compiling the script file, mailfromd preprocesses it. The built-in preprocessor handles only file inclusion (see include), while the rest of the traditional facilities, such as macro expansion, are supported via m4, which is used as an external preprocessor.
A detailed description of m4 facilities lies far beyond the scope of this document. You will find a complete user manual in GNU M4 in GNU M4 macro processor. For the rest of this section we assume the reader is sufficiently acquainted with the m4 macro processor.
The external preprocessor is invoked with the -s flag, instructing it to include line synchronization information in its output, which is subsequently used by the MFL compiler for purposes of error reporting. The initial set of macro definitions is supplied in the preprocessor setup file pp-setup, located in the library search path[20], which is fed to the preprocessor input before the script file itself.
The default pp-setup file renames all m4 built-in macro names so they all start with the prefix ‘m4_’[21]. It changes the comment characters to the ‘/*’, ‘*/’ pair, and leaves the default quoting characters, the grave (‘`’) and acute (‘'’) accents, unchanged. Finally, pp-setup defines several useful macros (see m4 macros).
• Preprocessor Configuration | ||
• Preprocessor Usage | ||
• Preprocessor Macros |
The preprocessor is configured in the mailfromd configuration file, using the preprocessor statement (see conf-preprocessor). The default settings correspond to the following configuration:
preprocessor {
    # Enable preprocessor
    enable yes;
    # Preprocessor command line stub.
    command "m4 -s";
    # Pass current include path to the preprocessor via -I options.
    pass-includes false;
    # Pass to the preprocessor the feature definitions via -D options
    # as well as any -D/-U options from the command line
    pass-defines true;
    # Name of the preprocessor setup file.  Unless absolute, it is
    # looked up in the include path.
    setup-file "pp-setup";
}
If pass-includes is true, the command value is augmented by zero or more -I options supplying it the mailfromd include search path (see include search path).
Furthermore, if pass-defines is set, zero or more -D options defining optional features are passed to it (e.g. -DWITH_DKIM) as well as any -D and -U options from the mailfromd command line.
Unless the value of setup-file begins with a slash, the file with this name is looked up in the current include search path. If found, its absolute name is passed to the preprocessor as the first argument. If it begins with a slash, it is passed to the preprocessor as is.
You can obtain the preprocessed output, without starting the actual compilation, using the -E command line option:
$ mailfromd -E file.mfl
The output is in the form of preprocessed source code, which is sent to the standard output. This can be useful, among other things, for debugging your own macro definitions.
Macro definitions and deletions can be made on the command line, by using the -D and -U options, provided that their use is allowed by the pass-defines preprocessor configuration setting (see Configuring Preprocessor). They have the following format:
Define a symbol name to have a value value. If value is not supplied, the value is taken to be the empty string. The value can be any string, and the macro can be defined to take arguments, just as if it was defined from within the input using the m4_define statement.
For example, the following invocation defines the symbol COMPAT to have the value 43:
$ mailfromd -DCOMPAT=43
A counterpart of the -D option is the option -U (--undefine). It undefines a preprocessor symbol whose name is given as its argument. The following example undefines the symbol COMPAT:
$ mailfromd -UCOMPAT
The following two options are supplied mainly for debugging purposes:
Disables the external preprocessor.
Use command as the external preprocessor. If command is not supplied, use the default preprocessor, overriding the enable preprocessor configuration setting.
Be especially careful with this option, because mailfromd cannot verify whether command is actually some kind of a preprocessor or not.
The identifier must be the name of an optional abstract argument to the function. This macro must be used only within a function definition. It expands to the MFL expression that yields true if the actual parameter is supplied for identifier. For example:
func rcut(string text; number num)
  returns string
do
  if (defined(num))
    return substr(text, length(text) - num)
  else
    return text
  fi
done
This function will return the last num characters of text if num is supplied, and the entire text otherwise, e.g.:
rcut("text string") ⇒ "text string"
rcut("text string", 3) ⇒ "ing"
Invoking the defined macro with the name of a mandatory argument yields true.
Provides a printf statement that formats its optional parameters in accordance with format and sends the resulting string to the current log output (see Logging and Debugging). See String formatting, for a description of format.
Example usage:
printf('Function %s returned %d', funcname, retcode)
A convenience macro. Expands to a call to gettext (see NLS Functions).
This macro is intended to compensate for the lack of an array data type in MFL. It splits the string list into segments delimited by the string delim. For each segment, the MFL code code is executed. The code can use the variable var to refer to the segment string.
For example, the following fragment prints the names of all existing directories listed in the PATH environment variable:
string path getenv("PATH")
string seg

string_list_iterate(path, ":", seg, `
     if access(seg, F_OK)
       echo "%seg exists"
     fi')
Care should be taken to properly quote its arguments. In the code below the string str is treated as a comma-separated list of values. To avoid interpreting the comma as an argument delimiter, the second argument must be quoted:
string_list_iterate(str, `","', seg, `
     echo "next segment: " . seg')
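The iteration pattern behind this macro, splitting a string at a delimiter and running some code for every segment, can be sketched in C as follows. This is an analogue for illustration, not the macro's expansion; the names list_iterate and print_segment are hypothetical, and the handling of an empty list (one empty segment) is a choice made for this sketch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical C analogue of string_list_iterate: split `list' at each
 * occurrence of `delim' and call `fn' for every segment.  The segment
 * is passed as a pointer and a length; `list' itself is not modified.
 * Returns the number of segments visited.
 */
size_t list_iterate(const char *list, const char *delim,
                    void (*fn)(const char *seg, size_t len, void *data),
                    void *data)
{
    size_t dlen = strlen(delim);
    size_t count = 1;           /* there is always at least one segment */
    const char *end;

    /* An empty delimiter never advances, so it yields a single segment. */
    while (dlen > 0 && (end = strstr(list, delim)) != NULL) {
        fn(list, (size_t) (end - list), data);
        list = end + dlen;
        count++;
    }
    fn(list, strlen(list), data);   /* last (or only) segment */
    return count;
}

/* Example callback: print each segment on its own line. */
void print_segment(const char *seg, size_t len, void *data)
{
    (void) data;
    printf("next segment: %.*s\n", (int) len, seg);
}
```

Passing the callback explicitly plays the role of the quoted code argument in the macro: the loop is written once, and the per-segment action is supplied by the caller.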
A convenience macro that expands to msgid verbatim. It is intended to mark the literal strings that should appear in the .po file, where an actual call to gettext (see NLS Functions) cannot be used. For example:
/* Mark the variable for translation: cannot use gettext here */
string message N_("Mail accepted")

prog envfrom
do
  …
  /* Translate and log the message */
  echo gettext(message)
In this section we will discuss a working example of a filter script file. For ease of illustration, it is divided into several sections. Each section is prefaced with a comment explaining its function.
This filter assumes that the mailfromd.conf file contains the following:
relayed-domain-file (/etc/mail/sendmail.cw,
                     /etc/mail/relay-domains);
io-timeout 33;
database cache {
  negative-expire-interval 1 day;
  positive-expire-interval 2 weeks;
};
Of course, the exact parameter settings may vary; what is important is that they be declared. See Mailfromd Configuration, for a description of the mailfromd configuration file syntax.
Now, let’s return to the script. Its first part defines the configuration settings for this host:
#pragma regex +extended +icase

set mailfrom_address "<>"
set ehlo_domain "gnu.org.ua"
The second part loads the necessary source modules:
require 'status'
require 'dns'
require 'rateok'
Next, we define the envfrom handler. In the first two rules, it accepts all mails coming from the null address and from the machines which we relay:
prog envfrom
do
  if $f = ""
    accept
  elif relayed hostname($client_addr)
    accept
  elif hostname($client_addr) = $client_addr
    reject 550 5.7.7 "IP address does not resolve"
The next rule rejects all messages coming from hosts with dynamic IP addresses. The regular expression used to catch such hosts is not 100% fail-proof, but it tries to cover most existing host naming patterns:
  elif hostname($client_addr) matches
          ".*(adsl|sdsl|hdsl|ldsl|xdsl|dialin|dialup|\
ppp|dhcp|dynamic|[-.]cpe[-.]).*"
    reject 550 5.7.1 "Use your SMTP relay"
Messages coming from machines whose host names contain something similar to an IP address are subject to strict checking:
  elif hostname($client_addr) matches
          ".*[0-9]{1,3}[-.][0-9]{1,3}[-.][0-9]{1,3}[-.][0-9]{1,3}.*"
    on poll host $client_addr for $f do
    when success:
      pass
    when not_found or failure:
      reject 550 5.1.0 "Sender validity not confirmed"
    when temp_failure:
      tempfail
    done
If the sender domain is relayed by any of the ‘yahoo.com’ or ‘nameserver.com’ ‘MX’s, no checks are performed. We will greylist this message in the envrcpt handler:
  elif $f mx fnmatches "*.yahoo.com"
       or $f mx fnmatches "*.nameserver.com"
    pass
Finally, if the message does not meet any of the above conditions, it is verified by the standard procedure:
  else
    on poll $f do
    when success:
      pass
    when not_found or failure:
      reject 550 5.1.0 "Sender validity not confirmed"
    when temp_failure:
      tempfail
    done
  fi
At the end of the handler we check that the sender-client pair does not exceed the allowed mail sending rate:
  if not rateok("$f-$client_addr", interval("1 hour 30 minutes"), 100)
    tempfail 450 4.7.0 "Mail sending rate exceeded. Try again later"
  fi
done
The next part defines the envrcpt handler. Its primary purpose is to greylist messages from some domains that could not be checked otherwise:
prog envrcpt
do
  set gltime 300
  if $f mx fnmatches "*.yahoo.com"
     or $f mx fnmatches "*.nameserver.com"
     and not dbmap("/var/run/whitelist.db", $client_addr)
    if greylist("$client_addr-$f-$rcpt_addr", gltime)
      if greylist_seconds_left = gltime
        tempfail 450 4.7.0 "You are greylisted for %gltime seconds"
      else
        tempfail 450 4.7.0 "Still greylisted for " . %greylist_seconds_left . " seconds"
      fi
    fi
  fi
done
For your reference, here is an alphabetical list of all reserved words:
Several keywords are context-dependent: mx is a keyword if it appears before matches or fnmatches. The following strings are keywords in the on context:
The following keywords are preprocessor macros:
Any keyword beginning with a ‘m4_’ prefix is a reserved preprocessor symbol.
There are two noteworthy exceptions: the module and from ... import statements, which must be terminated with a period. For details, refer to module structure, and import.
Implementation note: actually, the references are not interpreted within the string, instead, each such string is split at compilation time into a series of concatenated atoms. Thus, our sample string will actually be compiled as:
$f . " last connected from " . last_ip . ";"
See Concatenation, for a description of this construct. You can easily see how various strings are interpreted by using the --dump-tree option (see --dump-tree). In this case, it will produce:
CONCAT:
  CONCAT:
    CONCAT:
      SYMBOL: f
      CONSTANT: " last connected from "
    VARIABLE last_ip (13)
  CONSTANT: ";"
The subexpressions are numbered by the positions of their opening parentheses, left to right.
Notice that these are intended for educational purposes and do not necessarily coincide with the actual definitions of these functions in Mailfromd version 9.0.
The only exception is ‘not’, whose precedence in MFL is much lower than usual (in most programming languages it has the same precedence as unary ‘-’). This makes it possible to write conditional expressions in a more understandable manner. Consider the following condition:
if not x < 2 and y = 3
It is understood as “if x is not less than 2 and y equals 3”, whereas with the usual precedence for ‘not’ it would have meant “if negated x is less than 2 and y equals 3”.
The default value for code is 550 for reject and 451 for tempfail. The remaining two arguments default to empty strings.
As of mailfromd version 9.0, there is also a third implicit argument, which holds the value of the program counter where the exception occurred. Currently it is considered to be an implementation artifact. Filter writers are discouraged from relying on it.
This function is part of the mailfromd library. See hasmx.
The actual suffix depends on the operating system. It is ‘.so’ on all POSIX systems.
It is usually located in /usr/local/share/mailfromd/9.0/include/pp-setup.
This is similar to the GNU m4 --prefix-builtins option. This approach was chosen to allow for using non-GNU m4 implementations as well.