On some targets, the internal string representation is UTF-16, which means that Unicode codepoints outside the Basic Multilingual Plane (BMP) are represented using surrogate pairs. The compile-time define target.utf16 is set when the target uses UTF-16 internally.
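As a minimal sketch of how this define might be used, the following program branches on target.utf16 with conditional compilation and reports the length of a string holding a single non-BMP codepoint. The class name Main and the choice of U+1F600 are illustrative assumptions; the exact length reported on targets without target.utf16 depends on that target's internal encoding.

```haxe
class Main {
	static function main() {
		// U+1F600 lies outside the BMP (illustrative choice of codepoint).
		var s = "\u{1F600}";
		#if target.utf16
		// On UTF-16 targets the codepoint is stored as a surrogate pair,
		// so length is expected to count two code units.
		trace("UTF-16 target, length = " + s.length);
		#else
		// On other targets the result depends on the internal encoding.
		trace("non-UTF-16 target, length = " + s.length);
		#end
	}
}
```

Code that indexes into strings or slices them by position can use the same check to avoid splitting a surrogate pair.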
Some Haxe targets disallow null-bytes (Unicode codepoint 0) in strings. Additionally, some Haxe core APIs assume that a null-byte terminates a string. To handle binary data consistently, including null-bytes, use the haxe.io.Bytes API.
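The following sketch shows one way binary data containing null-bytes could be held in haxe.io.Bytes instead of a String. The class name Main and the byte values are illustrative assumptions.

```haxe
import haxe.io.Bytes;

class Main {
	static function main() {
		// Allocate a 4-byte buffer and fill it explicitly,
		// including two null-bytes.
		var data = Bytes.alloc(4);
		data.set(0, 0x48); // 'H'
		data.set(1, 0x00); // null-byte
		data.set(2, 0x69); // 'i'
		data.set(3, 0x00); // null-byte

		trace(data.length);  // 4
		trace(data.get(1));  // 0
		trace(data.toHex()); // 48006900
	}
}
```

Because the buffer stores raw bytes rather than characters, its length and contents do not depend on the target's string encoding or on whether that target permits null-bytes in strings.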
Target | target.unicode | target.utf16 | Internal encoding | Null-byte allowed |
---|---|---|---|---|
Flash | yes | yes | UTF-16 | no |
JavaScript | yes | yes | UTF-16 | yes (except in some old browsers) |
C++ | yes | yes | ASCII or UTF-16 (if needed) | yes |
Java | yes | yes | UTF-16 | yes |
JVM | yes | yes | UTF-16 | yes |
C# | yes | yes | UTF-16 | yes |
Python | yes | no | Latin-1, UCS-2, or UCS-4 (see PEP 393) | yes |
Lua | yes | no | UTF-8 | yes |
PHP | yes | no | binary | yes |
Eval | yes | no | UTF-8 | yes |
Neko | no | no | binary | yes |
HashLink | yes | yes | UTF-16 | no |