Using Regular Expressions
Haxe has built-in support for regular expressions. They can be used to verify the format of a string or to extract regular data from a given text. A regular expression literal starts with ~/ and ends with a single /, optionally followed by flags:

    var r : EReg = ~/world/;
    var str = "hello world";
    trace(r.match(str)); // true : 'world' was found in the string
    trace(r.match("hello !")); // false
You can use standard regular expression patterns, including (but not limited to):
- . : any character
- * : repeat zero or more
- + : repeat one or more
- ? : optional (zero or one)
- [A-Z0-9] : character ranges
- [^\r\n\t] : character not in range
- (...) : parentheses to match groups of characters
- ^ : beginning of string/line
- $ : end of string/line
- | : "OR" statement
For example, the following regular expression matches a typical email address:
~/[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z][A-Z][A-Z]?/i;
Note that the i at the end of the regular expression is a flag that enables case-insensitive matching.
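As a minimal sketch of how the email pattern above might be used, the class below wraps it in a helper. The names EmailExample and containsEmail are hypothetical and only serve the illustration; the pattern itself is copied unchanged from the text.

```haxe
class EmailExample {
    // The pattern from the text above, unchanged.
    static var emailPattern = ~/[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z][A-Z][A-Z]?/i;

    // Hypothetical helper: true if the text contains an email-like substring.
    public static function containsEmail(text:String):Bool {
        return emailPattern.match(text);
    }

    static function main() {
        trace(containsEmail("Contact: john.doe@example.com for details")); // true
        trace(containsEmail("JOHN@EXAMPLE.ORG")); // true, thanks to the i flag
        trace(containsEmail("no address here")); // false
    }
}
```

Because the pattern is not anchored with ^ and $, it reports a match when an address appears anywhere in the text, not only when the whole string is an address.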
The possible flags are:
- i : case-insensitive matching
- g : global replace or split, see below
- m : multiline matching, ^ and $ match the beginning and end of each line rather than only the beginning and end of the string
- s : the dot . also matches newlines (Haxe/Neko only)
- u : use UTF-8 matching (Haxe/Neko only)
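The effect of the i and m flags can be sketched as follows. This assumes the usual multiline semantics, where m lets ^ match right after a newline; the class name FlagsExample is hypothetical.

```haxe
class FlagsExample {
    static function main() {
        var text = "first line\nSecond line";

        // No flag: ^ only matches at the very start of the string.
        var noFlag = ~/^Second/;
        trace(noFlag.match(text)); // false

        // m: ^ also matches at the start of each line.
        var multiline = ~/^Second/m;
        trace(multiline.match(text)); // true

        // i: letter case is ignored.
        var ignoreCase = ~/^FIRST/i;
        trace(ignoreCase.match(text)); // true

        // Flags can be combined after the closing slash.
        var both = ~/^second/im;
        trace(both.match(text)); // true
    }
}
```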
Groups
You can extract information by using groups:

    var str = "Nicolas is 26 years old";
    var r = ~/([A-Za-z]+) is ([0-9]+) years old/;
    r.match(str);
    trace(r.matched(1)); // "Nicolas"
    trace(r.matched(2)); // "26"
r.matched(0) always returns the whole matched substring, and r.matchedPos() returns the position of this substring in the original string:

    var str = "abcdeeeeefghi";
    var r = ~/e+/; // e+ rather than e*, which would match the empty string at position 0
    r.match(str);
    trace(r.matched(0)); // "eeeee"
    trace(r.matchedPos()); // { pos : 4, len : 5 }
Replace
A regular expression can also be used to replace part of a string:

    var str = "aaabcbcbcbz";
    var r = ~/b[^c]/g; // g : replace all instances
    trace(r.replace(str,"xx")); // "aaabcbcbcxx"
You can use $X, where X is the group number, to reuse a matched group in the replacement:

    var str = "{hello} {0} {again}";
    var r = ~/{([a-z]+)}/g;
    trace(r.replace(str,"*$1*")); // "*hello* {0} *again*"
Split
A regular expression can also be used to split a string into several substrings. In that case, the delimiter used to split is not a constant string but a regular expression:

    var str = "XaaaYababZbbbW";
    var r = ~/[ab]+/g;
    trace(r.split(str)); // ["X","Y","Z","W"]
Implementation Details
Regular expressions are implemented as follows:
- In JavaScript, the browser provides the implementation through the RegExp object.
- In Neko, the PCRE library is used.
- In Flash 9, the native implementation is used.
- In Flash 6/8, the implementation is not yet available, but it will be a pure Haxe version (hence very slow since it is not native, but compatible).