D.10 UTF-8

This section describes functions for handling UTF-8 strings. A UTF-8 character can be represented either as a multibyte character or as a wide character.

A multibyte character is a char * pointing to one or more bytes that represent the UTF-8 character.

A wide character is an unsigned value identifying the character.

In the discussion below, a sequence of one or more multibyte characters is called a multibyte string. Multibyte strings terminate with a single ‘nul’ (0) character.

A sequence of one or more wide characters is called a wide character string. Such strings terminate with a single 0 value.
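
To make the two representations concrete, the following is a minimal sketch in C of how a multibyte string could be converted to a wide character string. The functions sketch_mb_to_wc and sketch_mbs_to_wcs are hypothetical names invented for this illustration and are not part of libdico. The sketch also assumes that a wide character is stored as a plain unsigned holding the character's code point, and it deliberately skips some validation (overlong encodings and surrogate values are not rejected).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT a libdico function.
   Decode the single UTF-8 multibyte character starting at S and store
   its value in *WC.  Return the number of bytes consumed, or 0 if the
   byte sequence is malformed or truncated.  Overlong encodings and
   surrogate values are not checked in this sketch. */
size_t
sketch_mb_to_wc(const char *s, unsigned *wc)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t len, i;
    unsigned c;

    assert(s != NULL && wc != NULL);

    if (p[0] < 0x80) {                   /* 0xxxxxxx: one byte */
        c = p[0];
        len = 1;
    } else if ((p[0] & 0xE0) == 0xC0) {  /* 110xxxxx: two bytes */
        c = p[0] & 0x1F;
        len = 2;
    } else if ((p[0] & 0xF0) == 0xE0) {  /* 1110xxxx: three bytes */
        c = p[0] & 0x0F;
        len = 3;
    } else if ((p[0] & 0xF8) == 0xF0) {  /* 11110xxx: four bytes */
        c = p[0] & 0x07;
        len = 4;
    } else
        return 0;                        /* stray continuation or invalid lead byte */

    for (i = 1; i < len; i++) {
        /* Each following byte must look like 10xxxxxx.  A terminating
           nul fails this test too, so we never read past the string. */
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (p[i] & 0x3F);
    }
    *wc = c;
    return len;
}

/* Hypothetical helper, NOT a libdico function.
   Convert the nul-terminated multibyte string S into a 0-terminated
   wide character string in OUT, which has room for OUTSIZE elements
   including the terminator.  Return the number of wide characters
   stored (terminator excluded), or (size_t)-1 on malformed input or
   insufficient space. */
size_t
sketch_mbs_to_wcs(const char *s, unsigned *out, size_t outsize)
{
    size_t n = 0;

    assert(s != NULL && out != NULL);

    while (*s) {
        size_t len;

        if (n + 1 >= outsize)            /* keep room for the final 0 */
            return (size_t) -1;
        len = sketch_mb_to_wc(s, &out[n]);
        if (len == 0)
            return (size_t) -1;
        s += len;
        n++;
    }
    if (n >= outsize)
        return (size_t) -1;
    out[n] = 0;                          /* wide strings end with a single 0 value */
    return n;
}
```

For instance, given the six-byte multibyte string "h\xC3\xA9\xE2\x82\xAC" and an output array of eight elements, sketch_mbs_to_wcs stores the four values 0x68, 0xE9, 0x20AC and 0, and returns 3: three wide characters followed by the terminating 0.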