This version of this document is no longer maintained. For the latest documentation, see http://www.qnx.com/developers/docs.

utf8len()

Count the bytes in a UTF-8 character

Synopsis:

#include <utf8.h>

int utf8len( const char *s,
             size_t n );

Library:

ph

Description:

The utf8len() function counts the number of bytes in the UTF-8 character pointed to by s. If n is nonzero, the function examines at most n bytes.

This function is similar to mblen(), except that:

utf8len() isn't affected by the current locale.
The s argument isn't allowed to be NULL.
You can pass 0 for n if you know that s points to a null-terminated string (i.e. passing 0 is equivalent to, but more efficient than, passing strlen(s)).
utf8len() returns -1 if s points to an invalid byte sequence. If n is nonzero and the n bytes pointed to by s look like an incomplete but potentially valid character, the function returns the negative of the total length of the complete character (a value in the range from -2 to -UTF8_LEN_MAX).
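
The return convention described above can be illustrated with a small portable stand-in. This is only a sketch, not the Photon library's implementation: the name sketch_utf8len() and the constant SKETCH_UTF8_MAX are hypothetical, the sketch assumes the modern 4-byte UTF-8 limit (the real UTF8_LEN_MAX may differ), and it does not reject every ill-formed sequence that a full validator would (for example, some overlong 3-byte forms).

```c
#include <stddef.h>

/* Assumption: modern UTF-8 tops out at 4 bytes; the library's
 * UTF8_LEN_MAX may be larger. */
#define SKETCH_UTF8_MAX 4

/* Hypothetical stand-in that mimics the documented return values:
 *   0   s points to the null character
 *   > 0 length of a complete character
 *   -1  invalid byte sequence
 *   -k  the first n bytes could start a k-byte character (n != 0 only)
 * As documented, s must not be NULL; n == 0 means "null-terminated". */
int sketch_utf8len(const char *s, size_t n)
{
    const unsigned char *p = (const unsigned char *)s;
    unsigned char lead = p[0];
    int total;
    int i;

    if (lead == 0x00)
        return 0;                      /* null character */
    if (lead < 0x80)
        return 1;                      /* plain ASCII */

    if (lead >= 0xC2 && lead <= 0xDF)
        total = 2;
    else if (lead >= 0xE0 && lead <= 0xEF)
        total = 3;
    else if (lead >= 0xF0 && lead <= 0xF4)
        total = SKETCH_UTF8_MAX;
    else
        return -1;                     /* stray continuation or bad lead byte */

    for (i = 1; i < total; i++) {
        if (n != 0 && (size_t)i >= n)
            return -total;             /* out of bytes: incomplete but plausible */
        if ((p[i] & 0xC0) != 0x80)
            return -1;                 /* not a continuation byte (includes NUL) */
    }
    return total;
}
```

With this sketch, the three-byte sequence E2 82 AC yields 3 when n is 0 or at least 3, and -3 when n is 1 or 2, which mirrors the "negative total length" rule. When n is 0 the loop never reads past the terminating null byte, because a null byte is not a continuation byte and ends the scan with -1.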

Returns:

0: s points to the null character.
> 0: The number of bytes that make up the multibyte character (if the next n or fewer bytes form a valid multibyte character).
-1: The n-byte sequence that s points to isn't a valid UTF-8-encoded character or the valid beginning of one.
Other negative value: The n bytes pointed to by s could be the initial bytes of a valid UTF-8 sequence; the value is the negative of that character's total length in bytes.
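
A caller typically branches on these four cases. The helpers below are a minimal sketch of that branching; they operate only on the integer result, so they make no assumptions about the library beyond the return values listed above. The names utf8len_outcome, classify_utf8len() and utf8len_bytes_missing() are hypothetical and are not part of the Photon library.

```c
#include <stddef.h>

/* Hypothetical labels for the four documented result classes. */
enum utf8len_outcome {
    UTF8LEN_END_OF_STRING,   /* result == 0 */
    UTF8LEN_COMPLETE_CHAR,   /* result > 0  */
    UTF8LEN_INVALID,         /* result == -1 */
    UTF8LEN_NEED_MORE_BYTES  /* result < -1 */
};

/* Hypothetical helper: map a utf8len() result to one of the classes. */
enum utf8len_outcome classify_utf8len(int result)
{
    if (result == 0)
        return UTF8LEN_END_OF_STRING;
    if (result > 0)
        return UTF8LEN_COMPLETE_CHAR;
    if (result == -1)
        return UTF8LEN_INVALID;
    return UTF8LEN_NEED_MORE_BYTES;
}

/* Hypothetical helper: given a result and the number of bytes already
 * supplied (the n that was passed), report how many more bytes are
 * needed to complete the character. Returns 0 unless the result is one
 * of the "other negative" values. */
size_t utf8len_bytes_missing(int result, size_t have)
{
    size_t total;

    if (result >= -1)
        return 0;
    total = (size_t)(-result);       /* negative of the total length */
    return total > have ? total - have : 0;
}
```

For example, if a call with n set to 1 returns -3, utf8len_bytes_missing(-3, 1) reports that 2 more bytes are needed before the character can be decoded, which can be useful when reading UTF-8 from a stream in fixed-size chunks.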

Classification:

Photon