On some targets, the internal string representation is UTF-16, which means that Unicode code points outside the Basic Multilingual Plane (BMP) are represented using surrogate pairs, that is, two 16-bit code units each. The compile-time define `target.utf16` is set when the target uses UTF-16 internally.
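
As a rough illustration of the surrogate-pair behavior, the sketch below inspects a string holding a single non-BMP code point. The `codeUnits` helper and the `Main` class are hypothetical names introduced for this example. The values in the comments assume a target where `target.utf16` is set; other targets may report different lengths.

```haxe
class Main {
    // Hypothetical helper: collects the value returned by charCodeAt for every index.
    static function codeUnits(s:String):Array<Int> {
        return [for (i in 0...s.length) s.charCodeAt(i)];
    }

    static function main() {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
        var clef = "\u{1D11E}";

        #if target.utf16
        // On UTF-16 targets the code point is stored as a surrogate pair,
        // so String.length counts two code units.
        var units = codeUnits(clef);
        trace(units.length);                // 2
        trace(StringTools.hex(units[0], 4)); // D834 (high surrogate)
        trace(StringTools.hex(units[1], 4)); // DD1E (low surrogate)
        #else
        // Assumption: the result here depends on the target's internal encoding.
        trace(clef.length);
        #end

        // UnicodeString is intended to count code points rather than code units
        // on targets where target.unicode is set.
        var u:UnicodeString = clef;
        trace(u.length);
    }
}
```

Code that needs to index or count by code point, independent of the target's internal encoding, is probably better served by `UnicodeString` than by branching on `target.utf16` by hand.
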
Some Haxe targets disallow null bytes (Unicode code point 0) in strings. Additionally, some Haxe core APIs assume that a null byte terminates a string. To deal consistently with binary data, including null bytes, use the `haxe.io.Bytes` API.
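
A minimal sketch of keeping data with an embedded null byte in `haxe.io.Bytes` instead of a `String`. The `buildPayload` helper and the payload contents are made up for illustration.

```haxe
import haxe.io.Bytes;

class Main {
    // Hypothetical helper: builds a three-byte payload with a null byte in the middle.
    static function buildPayload():Bytes {
        var data = Bytes.alloc(3);
        data.set(0, 0x41); // 'A'
        data.set(1, 0x00); // null byte, stored like any other byte
        data.set(2, 0x42); // 'B'
        return data;
    }

    static function main() {
        var data = buildPayload();
        trace(data.length);  // 3
        trace(data.get(1));  // 0
        trace(data.toHex()); // 410042
    }
}
```

Because `Bytes` stores raw byte values, the null byte does not depend on how the target represents strings.
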
| Target | target.unicode | target.utf16 | Internal encoding | Null-byte allowed |
|---|---|---|---|---|
| Flash | yes | yes | UTF-16 | no |
| JavaScript | yes | yes | UTF-16 | yes (except in some old browsers) |
| C++ | yes | yes | ASCII or UTF-16 (if needed) | yes |
| Java | yes | yes | UTF-16 | yes |
| JVM | yes | yes | UTF-16 | yes |
| C# | yes | yes | UTF-16 | yes |
| Python | yes | no | Latin-1, UCS-2, or UCS-4 (see PEP 393) | yes |
| Lua | yes | no | UTF-8 | yes |
| PHP | yes | no | binary | yes |
| Eval | yes | no | UTF-8 | yes |
| Neko | no | no | binary | yes |
| HashLink | yes | yes | UTF-16 | no |