This section describes functions for handling UTF-8 strings. A UTF-8 character can be represented either as a multi-byte character or a wide character.
A multi-byte character is a char * pointing to one or more bytes representing the UTF-8 character.
A wide character is an unsigned value identifying the character.
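To illustrate how the two representations relate, here is a minimal sketch that decodes one multi-byte character into its wide character value. The name sketch_mb_to_wc is hypothetical and is not part of libdico; the sketch also skips the stricter checks (overlong forms, surrogates) that a production decoder would make.

```c
#include <stddef.h>

/* Hypothetical helper, NOT a libdico function: decode the UTF-8
   character starting at mb and store its value in *wc.
   Returns the number of bytes consumed (1 to 4), or 0 if the bytes
   do not form a structurally valid sequence.  Overlong encodings
   and surrogates are not rejected in this sketch. */
static size_t
sketch_mb_to_wc(const char *mb, unsigned *wc)
{
    const unsigned char *p = (const unsigned char *) mb;
    unsigned value;
    size_t len, i;

    if (p[0] < 0x80) {                    /* 0xxxxxxx: one byte */
        value = p[0];
        len = 1;
    } else if ((p[0] & 0xE0) == 0xC0) {   /* 110xxxxx: two bytes */
        value = p[0] & 0x1F;
        len = 2;
    } else if ((p[0] & 0xF0) == 0xE0) {   /* 1110xxxx: three bytes */
        value = p[0] & 0x0F;
        len = 3;
    } else if ((p[0] & 0xF8) == 0xF0) {   /* 11110xxx: four bytes */
        value = p[0] & 0x07;
        len = 4;
    } else
        return 0;                         /* stray continuation or invalid lead byte */

    /* Each continuation byte has the form 10xxxxxx and contributes
       six bits.  A nul byte fails this test, so the loop never reads
       past the end of a nul-terminated string. */
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        value = (value << 6) | (p[i] & 0x3F);
    }

    *wc = value;
    return len;
}
```

For instance, under these assumptions the two bytes 0xC3 0xA9 decode to the wide character 0xE9.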
In the discussion below, a sequence of one or more multi-byte characters is called a multi-byte string. Multi-byte strings terminate with a single ‘nul’ (0) byte.
A sequence of one or more wide characters is called a wide character string. Such strings terminate with a single 0 value.
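The difference between the two string types shows up when measuring length: a multi-byte string has more bytes than characters whenever it contains non-ASCII text, while a wide character string has exactly one element per character. The following sketch assumes well-formed UTF-8 input; both function names are hypothetical and are not libdico functions.

```c
#include <stddef.h>

/* Hypothetical helper, NOT a libdico function: count the characters
   (not bytes) in a nul-terminated multi-byte string.  Assumes the
   input is well-formed UTF-8, so every byte that is not a
   continuation byte (10xxxxxx) starts a new character. */
static size_t
sketch_mb_strlen(const char *s)
{
    size_t n = 0;

    for (; *s; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper, NOT a libdico function: count the elements
   of a wide character string, excluding the terminating 0 value. */
static size_t
sketch_wc_strlen(const unsigned *w)
{
    size_t n = 0;

    while (w[n])
        n++;
    return n;
}
```

A string holding "A", U+00E9 and U+20AC occupies six bytes plus the nul terminator as a multi-byte string, and three unsigned values plus the terminating 0 as a wide character string; both helpers report a length of 3.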