D.10.2 Iterating over UTF-8 strings

struct: utf8_iterator

A data type for iterating over a string of UTF-8 characters. Defined as:

struct utf8_iterator {
    char *string;
    char *curptr;
    unsigned curwidth;
};

When iterating over characters in string, curptr points to the current character, and curwidth holds its length in bytes.
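To illustrate what curwidth represents, the following sketch derives the byte length of a UTF-8 character from its lead byte. The helper name demo_utf8_width is hypothetical and is not part of the library; it only classifies the lead byte and does not validate continuation bytes or reject overlong encodings.

```c
/* Hypothetical helper, not part of the library: return the length in
   bytes of the UTF-8 sequence introduced by lead byte C, or 0 if C
   cannot start a sequence. */
static unsigned
demo_utf8_width (unsigned char c)
{
    if (c < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((c & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((c & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((c & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 0;                   /* 10xxxxxx (continuation) or 11111xxx */
}
```

For instance, for the two-byte character U+00E9 (bytes 0xC3 0xA9), a value of 2 would be the expected curwidth.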

Function: int utf8_iter_isascii (struct utf8_iterator itr)

Returns ‘true’ if itr points to an ASCII character.

Function: int utf8_iter_end_p (struct utf8_iterator *itr)

Returns ‘true’ if itr has reached the end of the input string.

Function: int utf8_iter_first (struct utf8_iterator *itr, unsigned char *str)

Initializes itr for iterating over the string str. On success, positions itr.curptr at the first character of the input string, sets itr.curwidth to the length of that character in bytes, and returns ‘0’. If str is an empty string, returns ‘1’.

Function: int utf8_iter_next (struct utf8_iterator *itr)

Advances itr.curptr to the next character of the input string and sets itr.curwidth to the length of that character in bytes.
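The functions described in this section are typically combined into a single loop. The sketch below shows one possible shape of such a loop. To keep it self-contained, it supplies simplified stand-in definitions of the four functions; these are assumptions made for illustration and are not the library's code. In particular, the return value of the stand-in utf8_iter_next is a guess, so the loop ignores it and relies on utf8_iter_end_p instead. The names demo_width and demo_count are hypothetical.

```c
#include <stddef.h>

struct utf8_iterator {
    char *string;
    char *curptr;
    unsigned curwidth;
};

/* Stand-in: sequence length by lead byte; anything else counts as 1. */
static unsigned
demo_width (unsigned char c)
{
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Stand-in: true if the current byte is in the ASCII range. */
static int
utf8_iter_isascii (struct utf8_iterator itr)
{
    return (unsigned char) *itr.curptr < 0x80;
}

/* Stand-in: true once curptr rests on the terminating null byte. */
static int
utf8_iter_end_p (struct utf8_iterator *itr)
{
    return *itr->curptr == 0;
}

/* Stand-in: returns 1 for an empty string and 0 otherwise. */
static int
utf8_iter_first (struct utf8_iterator *itr, unsigned char *str)
{
    itr->string = (char *) str;
    itr->curptr = (char *) str;
    itr->curwidth = *str ? demo_width (*str) : 0;
    return *str == 0;
}

/* Stand-in: the return value (1 at end of string) is an assumption. */
static int
utf8_iter_next (struct utf8_iterator *itr)
{
    unsigned i;

    /* Advance one byte at a time so that a truncated sequence
       never moves curptr past the terminating null byte. */
    for (i = 0; i < itr->curwidth && *itr->curptr; i++)
        itr->curptr++;
    itr->curwidth = *itr->curptr
        ? demo_width ((unsigned char) *itr->curptr) : 0;
    return *itr->curptr == 0;
}

/* Hypothetical example: count the characters in TEXT and store the
   number of ASCII characters among them in *NASCII. */
static size_t
demo_count (const char *text, size_t *nascii)
{
    struct utf8_iterator itr;
    size_t n = 0;

    *nascii = 0;
    if (utf8_iter_first (&itr, (unsigned char *) text))
        return 0;               /* empty string */
    while (!utf8_iter_end_p (&itr)) {
        if (utf8_iter_isascii (itr))
            ++*nascii;
        n++;
        utf8_iter_next (&itr);
    }
    return n;
}
```

With these stand-ins, calling demo_count on the string "h\xC3\xA9llo" walks five characters, four of which are ASCII. When linking against the real library, only the loop in demo_count would be kept and the stand-in definitions dropped.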