sed function
The sed function allows you to transform a string by replacing
parts of it that match a regular expression with another string. This
function is somewhat similar to the sed command line utility
(hence its name) and bears similarities to analogous functions in
other programming languages (e.g. sub in awk or the s// operator in perl).
The function takes the string subject followed by one or more expr arguments. Each expr argument is an s-expression of the form:
s/regexp/replacement/[flags]
where regexp is a regular expression, and replacement is a
replacement string for each part of the subject that matches
regexp. When sed is invoked, it attempts to match
subject against the regexp. If the match succeeds, the
portion of subject which was matched is replaced with
replacement. Depending on the value of flags
(see global replace), this process may continue until the entire
subject has been scanned.
The resulting output serves as input for the next argument, if one is supplied. The process continues until all arguments have been applied.
The function returns the output of the last s-expression.
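The chaining described above can be pictured with a short Python sketch. This is only an illustration of the data flow, not Mailfromd code: the helper name apply_chain and the representation of each s-expression as a (pattern, replacement) pair are assumptions made for the example, and Python's re module stands in for the regex engine.

```python
import re

def apply_chain(subject, *rules):
    """Apply each (pattern, replacement) rule to the output of the previous one.

    Hypothetical helper: it mimics how sed() feeds the result of one
    s-expression into the next and returns the output of the last one.
    """
    result = subject
    for pattern, replacement in rules:
        # count=1 mirrors the default behavior: only the first match is replaced.
        result = re.sub(pattern, replacement, result, count=1)
    return result

# The first rule strips the enclosing angle brackets, the second one
# operates on the already stripped string.
print(apply_chain("<user@example.org>", (r"^<(.*)>$", r"\1"), (r"@", " at ")))
```

With no rules at all the sketch simply returns the subject unchanged; whether Mailfromd accepts a call without any s-expression is not stated here.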
Both regexp and replacement are described in detail in The ‘s’ Command in GNU sed.
Supported flags are:
‘g’
    Apply the replacement to all matches of the regexp, not just the first.
‘i’
    Use case-insensitive matching. In the absence of this flag, the value
    set by the most recent #pragma regex icase is used (see icase).
‘x’
    regexp is an extended regular expression (see Extended regular expressions in GNU sed). In the absence of this flag, the value set by the
    most recent #pragma regex extended (if any) is used (see extended).
number
    Only replace the numberth match of the regexp.
    Note: the POSIX standard does not specify what should happen
    when you mix the ‘g’ and number modifiers. Mailfromd
    follows the GNU sed implementation in this regard, so
    the interaction is defined to be: ignore matches before the
    numberth, and then match and replace all matches from the
    numberth on.
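The interaction between ‘g’ and number can be emulated in a few lines of Python. The sketch below is an assumption-laden illustration, not the Mailfromd implementation: the function name replace_from_nth and its parameters are hypothetical, and Python regex syntax is used in place of POSIX regular expressions.

```python
import re

def replace_from_nth(pattern, replacement, subject, number, global_flag=False):
    """Emulate the 'number' flag, optionally combined with 'g'.

    Matches before the number-th are left alone. The number-th match is
    replaced. Later matches are replaced only when global_flag is set.
    """
    seen = 0

    def substitute(match):
        nonlocal seen
        seen += 1
        if seen == number or (global_flag and seen > number):
            return match.expand(replacement)
        # Leave this match untouched.
        return match.group(0)

    return re.sub(pattern, substitute, subject)

print(replace_from_nth(r"a", "X", "a a a a", 2))                    # number only
print(replace_from_nth(r"a", "X", "a a a a", 2, global_flag=True))  # number and 'g'
```

The first call changes only the second occurrence, while the second call changes the second occurrence and every one after it.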
Any delimiter can be used in lieu of ‘/’; the only requirement is that it be used consistently throughout the expression. For example, the following two expressions are equivalent:
s/one/two/
s,one,two,
Changing delimiters is often useful when the regex contains
slashes. For instance, it is more convenient to write s,/,-, than s/\//-/.
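To make the delimiter rule concrete, here is a minimal Python sketch of how an s-expression could be split into its three parts. It is not the parser Mailfromd uses; the function name split_s_expression is hypothetical, and the treatment of an escaped delimiter as a literal character is a simplifying assumption.

```python
def split_s_expression(expr):
    """Split 's<d>regexp<d>replacement<d>flags' into (regexp, replacement, flags).

    The character right after 's' is taken as the delimiter <d>.
    A backslash-escaped delimiter is kept as a literal character;
    any other escape sequence is passed through unchanged.
    """
    if len(expr) < 2 or expr[0] != "s":
        raise ValueError("not an s-expression: %r" % expr)
    delim = expr[1]
    parts, current = [], []
    i = 2
    while i < len(expr):
        ch = expr[i]
        if ch == "\\" and i + 1 < len(expr):
            nxt = expr[i + 1]
            current.append(delim if nxt == delim else ch + nxt)
            i += 2
            continue
        if ch == delim and len(parts) < 2:
            # An unescaped delimiter closes the regexp or the replacement.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    if len(parts) != 2:
        raise ValueError("unterminated s-expression: %r" % expr)
    return parts[0], parts[1], "".join(current)

# Both spellings describe the same substitution.
print(split_s_expression("s/one/two/") == split_s_expression("s,one,two,"))
print(split_s_expression(r"s/\//-/") == split_s_expression("s,/,-,"))
```

Choosing a delimiter that does not occur in the regexp avoids the escaping branch altogether, which is the convenience the text refers to.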
Here is an example of sed usage:
set email sed(input, 's/^<(.*)>$/\1/x')
It removes the enclosing angle brackets from the value of the ‘input’ variable and assigns the result to ‘email’.
To apply several s-expressions to the same input, you can either give
them as multiple arguments to the sed function:
set email sed(input, 's/^<(.*)>$/\1/x', 's/(.+@)(.+)/\1\L\2\E/x')
or give them in a single argument separated with semicolons:
set email sed(input, 's/^<(.*)>$/\1/x;s/(.+@)(.+)/\1\L\2\E/x')
Both examples above remove the enclosing angle brackets, if present, and convert the domain name part to lower case.
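For readers unfamiliar with the \L ... \E case conversion used in the replacement, the effect of the two s-expressions can be approximated in Python as follows. This is a sketch for illustration only: normalize_email is a hypothetical name, and since Python's re module has no \L escape, a replacement function is used to lowercase the second group instead.

```python
import re

def normalize_email(value):
    """Strip enclosing angle brackets, then lowercase the domain part."""
    # Equivalent in spirit to 's/^<(.*)>$/\1/x'.
    value = re.sub(r"^<(.*)>$", r"\1", value, count=1)
    # Equivalent in spirit to 's/(.+@)(.+)/\1\L\2\E/x': keep group 1 as is,
    # emit group 2 in lower case.
    return re.sub(
        r"(.+@)(.+)",
        lambda m: m.group(1) + m.group(2).lower(),
        value,
        count=1,
    )

print(normalize_email("<John.Doe@EXAMPLE.Org>"))
```

The local part keeps its original case because only the second group falls between \L and \E.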
Regular expressions used in sed arguments are controlled by
#pragma regex, like all other regular expressions used throughout the
MFL source file. To avoid using the ‘x’ modifier in the above
example, one can write:
#pragma regex +extended
set email sed(input, 's/^<(.*)>$/\1/', 's/(.+@)(.+)/\1\L\2\E/')
See regex, for details about that #pragma.
So far all examples used constant s-expressions. However, this is
not a requirement. If necessary, the expression can be stored in a
variable or even constructed on the fly before passing it as an argument
to sed. For example, assume that you wish to remove the domain
part from the value, but only if that part matches one of the predefined
domains. Let a regular expression that matches these domains be
stored in the variable domain_rx. Then this can be done as
follows:
set email sed(input, "s/(.+)(@%domain_rx)/\1/")
If the constructed regular expression uses variables whose value should be matched exactly, such variables must be quoted before being used as part of the regexp. Mailfromd provides a convenience function for this:
The qr function quotes the string str so that it is matched literally when used in a regular expression. It selects the characters to be escaped according to the currently selected regular expression flavor (see regex). At most two additional characters that must be escaped can be supplied in the optional delim parameter. For example, to quote the variable ‘x’ for use in a double-quoted s-expression:
qr(x, '/"')
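Why quoting matters can be seen from a small Python analogy, where re.escape plays a role comparable to qr. The function names below are hypothetical and the code is a sketch of the pitfall, not of Mailfromd's behavior: an unquoted dot in a domain name matches any character, so the unquoted variant strips more than intended.

```python
import re

def strip_domain_unquoted(value, domain):
    """Build the regexp from the raw domain string (metacharacters stay active)."""
    return re.sub(r"(.+)(@" + domain + r")", r"\1", value, count=1)

def strip_domain_quoted(value, domain):
    """Build the regexp from the escaped domain string (matched literally)."""
    return re.sub(r"(.+)(@" + re.escape(domain) + r")", r"\1", value, count=1)

# The '.' in "example.org" matches the 'X' unless the domain is quoted first.
print(strip_domain_unquoted("user@exampleXorg", "example.org"))
print(strip_domain_quoted("user@exampleXorg", "example.org"))
print(strip_domain_quoted("user@example.org", "example.org"))
```

The first call removes the domain even though it is not the intended one, the second leaves the address untouched, and the third strips the domain as desired.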