10.1.3 Encoding

On some targets, strings are represented internally as UTF-16, which means that Unicode codepoints outside the Basic Multilingual Plane (BMP), that is, those above U+FFFF, are stored as surrogate pairs of two 16-bit code units. The compile-time define target.utf16 is set when the target uses UTF-16 internally.
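The effect of surrogate pairs can be observed with conditional compilation. The following is a minimal sketch, assuming Haxe 4 or later (for the \u{...} escape and the target.utf16 define); the class name Main and the chosen codepoint are illustrative only.

```haxe
class Main {
    static function main() {
        // U+1F600 lies outside the BMP.
        var s = "\u{1F600}";

        #if target.utf16
        // UTF-16 targets store this codepoint as a surrogate pair,
        // so length counts two code units.
        trace(s.length);        // 2
        trace(s.charCodeAt(0)); // 55357 (0xD83D, the high surrogate)
        trace(s.charCodeAt(1)); // 56832 (0xDE00, the low surrogate)
        #else
        // Other targets use a different internal encoding; the
        // reported length depends on the target.
        trace(s.length);
        #end
    }
}
```

Code that indexes into strings by position should therefore not assume that one index corresponds to one codepoint on UTF-16 targets.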

Null-bytes in strings

Some Haxe targets disallow null-bytes (Unicode codepoint 0) in strings. Additionally, some Haxe core APIs assume a null-byte terminates strings. To consistently deal with binary data, including null-bytes, use the haxe.io.Bytes API.
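As a rough illustration of the haxe.io.Bytes approach, the sketch below stores and reads back data containing null-bytes without going through String. The class name Main and the byte values are assumptions made for the example.

```haxe
import haxe.io.Bytes;

class Main {
    static function main() {
        // Allocate a 4-byte buffer and fill it, including null-bytes.
        var data = Bytes.alloc(4);
        data.set(0, 0x48); // 'H'
        data.set(1, 0x00); // null-byte
        data.set(2, 0x69); // 'i'
        data.set(3, 0x00); // null-byte

        trace(data.length);  // 4
        trace(data.get(1));  // 0
        trace(data.toHex()); // 48006900
    }
}
```

Because the buffer is addressed by byte index, the null-bytes are preserved regardless of how the target represents strings.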

Target details
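| Target | target.unicode | target.utf16 | Internal encoding | Null-byte allowed |
|---|---|---|---|---|
| Flash | yes | yes | UTF-16 | no |
| JavaScript | yes | yes | UTF-16 | yes (except in some old browsers) |
| C++ | yes | yes | ASCII or UTF-16 (if needed) | yes |
| Java | yes | yes | UTF-16 | yes |
| JVM | yes | yes | UTF-16 | yes |
| C# | yes | yes | UTF-16 | yes |
| Python | yes | no | Latin-1, UCS-2, or UCS-4 (see PEP 393) | yes |
| Lua | yes | no | UTF-8 | yes |
| PHP | yes | no | binary | yes |
| Eval | yes | no | UTF-8 | yes |
| Neko | no | no | binary | yes |
| HashLink | yes | yes | UTF-16 | no |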