Character Sets


A character set defines the mapping between numeric codes and characters. A numeric code may use one or more bytes to represent a character.

Execution character sets

An execution character set is the set of characters available in an execution environment. The QNX OS implementation defines the execution character sets.

Execution character sets include:
  • Single-byte character set (type char) — Uses one byte to store a character.
  • Multibyte character set (type char) — Encodes characters as UTF-8; uses one or more bytes to represent complex characters.
  • Wide character set (type wchar_t) — Encodes characters as UTF-32.
  • 16-bit character set (char16_t) — Stores Unicode encoded as UTF-16.
  • 32-bit character set (char32_t) — Stores Unicode encoded as UTF-32.
Note:
QNX advises against using wide and multibyte character sets of type wchar_t as they are in an experimental state. QNX OS ships the International Components for Unicode (ICU) libraries (libicu*) that you can use instead. For more information about ICU, see https://icu.unicode.org/home.
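
To make the list of execution character sets above more concrete, the following minimal sketch stores one Unicode character (U+1F600) in three of the listed types and counts the code units each one needs. The array and function names are hypothetical, chosen only for illustration, and the sketch assumes a compiler that encodes u"" and U"" literals as UTF-16 and UTF-32.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* U+1F600 written out in three of the encodings listed above.
 * All names here are illustrative, not part of any QNX API. */
static const char     smile_utf8[]  = "\xF0\x9F\x98\x80"; /* multibyte (UTF-8) */
static const char16_t smile_utf16[] = u"\U0001F600";      /* UTF-16 surrogate pair */
static const char32_t smile_utf32[] = U"\U0001F600";      /* one UTF-32 code unit */

#ifdef __STDC_UTF_16__
/* Compile-time check of the assumption that u"" literals are UTF-16. */
static_assert(sizeof smile_utf16 / sizeof smile_utf16[0] - 1 == 2,
              "expected a surrogate pair for U+1F600");
#endif

/* Code units in each array, excluding the terminating null. */
size_t utf8_units(void)  { return sizeof smile_utf8  / sizeof smile_utf8[0]  - 1; }
size_t utf16_units(void) { return sizeof smile_utf16 / sizeof smile_utf16[0] - 1; }
size_t utf32_units(void) { return sizeof smile_utf32 / sizeof smile_utf32[0] - 1; }
```

The same character takes four char units as UTF-8, two char16_t units as UTF-16, and one char32_t unit as UTF-32, so buffer sizes depend on the chosen type.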

To understand how conversions between multibyte characters and characters of type char16_t and char32_t are handled, see the functions c16rtomb() and mbrtoc16(), which convert between UTF-8 and UTF-16 strings, and c32rtomb() and mbrtoc32(), which convert between UTF-8 and UTF-32 strings.
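
As a rough illustration of the conversion described above, the following sketch wraps mbrtoc16() in a loop that converts a whole null-terminated multibyte string. The helper name mb_to_c16() and its error convention are assumptions made for this example and are not part of the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: convert a null-terminated multibyte string to
 * char16_t code units with mbrtoc16(). Returns the number of units
 * written (excluding the terminating null), or (size_t)-1 if the input
 * is invalid or dst is too small.
 */
size_t mb_to_c16(const char *src, char16_t *dst, size_t dst_len)
{
    assert(src != NULL && dst != NULL);

    mbstate_t state;
    memset(&state, 0, sizeof state);      /* initial conversion state */

    size_t remaining = strlen(src) + 1;   /* include the terminating null */
    size_t out = 0;

    for (;;) {
        if (out >= dst_len)
            return (size_t)-1;            /* dst is full */

        size_t rc = mbrtoc16(&dst[out], src, remaining, &state);

        if (rc == 0)
            return out;                   /* terminating null was stored */
        if (rc == (size_t)-1 || rc == (size_t)-2)
            return (size_t)-1;            /* invalid or incomplete sequence */
        if (rc != (size_t)-3) {           /* -3: second surrogate, no input used */
            src += rc;
            remaining -= rc;
        }
        out++;
    }
}
```

For input outside the ASCII range, the result depends on the multibyte encoding in effect. This page states that the multibyte character set is UTF-8; on other systems you might first need to select a UTF-8 locale with setlocale().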

Alphabetic escape sequences

Alphabetic escape sequences are strings in the execution character set that represent an action rather than plain characters. These actions include backspace (\b), horizontal tab (\t), vertical tab (\v), new line (\n), and so on. For more information, see https://en.cppreference.com/w/cpp/language/escape.
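
The following short sketch shows that each alphabetic escape sequence written in source code becomes a single character in the execution character set. The function name count_alphabetic_escapes() is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the characters in text that correspond to
 * the alphabetic escape sequences \a \b \f \n \r \t \v.
 */
size_t count_alphabetic_escapes(const char *text)
{
    /* Each escape sequence below compiles to exactly one char. */
    static const char escapes[] = "\a\b\f\n\r\t\v";
    size_t count = 0;

    assert(text != NULL);

    for (; *text != '\0'; text++) {
        if (strchr(escapes, *text) != NULL)
            count++;
    }
    return count;
}
```

Note that "\n" is one character (new line), whereas "\\n" is two plain characters (a backslash followed by the letter n), so the helper counts the first and ignores the second.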

Environment macros

When types wchar_t, char16_t, and char32_t follow the Unicode standard, the compiler or C library defines the following environment macros for the implementation. These macros affect the handling of environment character sets:
__STDC_ISO_10646__
Defined when type wchar_t can hold the short identifier of a Unicode character and mbtowc() and mbrtowc() convert to Unicode. The <platform.h> header file defines this macro with the value 200009L.
__STDC_MB_MIGHT_NEQ_WC__
Indicates that a member of the basic character set (i.e., the single-byte character set) might not have the same value when represented as type wchar_t. Neither the qcc compiler nor the C library defines this macro.
__STDC_UTF_16__
Indicates that type char16_t is UTF-16 encoded. This relates to the conversion behavior of mbrtoc16(). The qcc compiler defines this macro with the value of 1.
__STDC_UTF_32__
Indicates that type char32_t is UTF-32 encoded. This relates to the conversion behavior of mbrtoc32(). The qcc compiler defines this macro with the value of 1.
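
One way to see which of the macros above a given toolchain defines is to test them with the preprocessor. The sketch below is only an illustration: the struct charset_macros type and the query_charset_macros() function are hypothetical names, and the sketch includes only standard headers, assuming that the compiler or those headers make the macros visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical report type; not part of the C library. */
struct charset_macros {
    long iso_10646;        /* value of __STDC_ISO_10646__, or 0 if undefined */
    bool mb_might_neq_wc;  /* true if __STDC_MB_MIGHT_NEQ_WC__ is defined */
    bool utf16;            /* true if __STDC_UTF_16__ is defined */
    bool utf32;            /* true if __STDC_UTF_32__ is defined */
};

/* Record which environment macros are visible at compile time. */
void query_charset_macros(struct charset_macros *out)
{
    assert(out != NULL);

#ifdef __STDC_ISO_10646__
    out->iso_10646 = __STDC_ISO_10646__;
#else
    out->iso_10646 = 0;
#endif

#ifdef __STDC_MB_MIGHT_NEQ_WC__
    out->mb_might_neq_wc = true;
#else
    out->mb_might_neq_wc = false;
#endif

#ifdef __STDC_UTF_16__
    out->utf16 = true;
#else
    out->utf16 = false;
#endif

#ifdef __STDC_UTF_32__
    out->utf32 = true;
#else
    out->utf32 = false;
#endif
}
```

Based on the descriptions above, a build with qcc would be expected to report utf16 and utf32 as true and mb_might_neq_wc as false; other toolchains may differ.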