
Regular Expression (RegExp) object

With the RegExp object, regular expressions can be used in JavaScript.
Regular expressions provide a flexible way to identify characters, words and series of characters. The RegExp object can be used to find strings that can be described with regular expression rules.

Syntax:

Creating a RegExp object:
var re = new RegExp("pattern"[, "flags"]);
var re = /pattern/flags;
pattern Required. The regular expression.
flags Optional. Can be a combination of the following values:

Pattern flags:

g - global matches. For example, /ab/g matches all occurrences of the 'ab' text, while /ab/ matches only the first occurrence of the 'ab' text.
i - case-insensitive matches. For example, /aB/i matches 'ab', 'aB', 'Ab' and 'AB'.
m - multiline matches. The ^ and $ characters also match at the beginning and end of each line, not only at the beginning and end of the input string.
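The effect of the three flags can be sketched with the String object's match method. This is only an illustration: the sample string and variable names are assumptions, and console.log is used instead of document.write so that the snippet also runs outside a browser.

```javascript
var text = "ab AB\nab";

// Without g, match returns only the first occurrence (with its position).
var first = text.match(/ab/);
console.log(first[0], first.index);   // ab 0

// With g, match returns every occurrence.
var all = text.match(/ab/g);
console.log(all.join(","));           // ab,ab

// Adding i makes the search case-insensitive.
var anyCase = text.match(/ab/gi);
console.log(anyCase.join(","));       // ab,AB,ab

// With m, ^ also matches right after a line break.
var lineStarts = text.match(/^ab/gm);
console.log(lineStarts.length);       // 2

// Without m, ^ matches only at the beginning of the input string.
var inputStart = text.match(/^ab/g);
console.log(inputStart.length);       // 1
```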
The following RegExp object pairs are equivalent:
var re1 = new RegExp("ab");
var re2 = /ab/;
var re1 = new RegExp("ab", "g");
var re2 = /ab/g;
var re1 = new RegExp("ab", "ig");
var re2 = /ab/ig;
Note that in a string literal the '\' character has special meaning. Therefore, when the pattern is passed to the RegExp constructor as a string, '\\' must be written for every '\' character of the pattern.
var re1 = new RegExp("\\d*");
var re2 = /\d*/ ;
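What happens when the second backslash is forgotten can be sketched as follows. The variable names and test strings are illustrative assumptions; console.log is used so that the snippet also runs outside a browser.

```javascript
// In a string literal "\\d" stands for the two characters \ and d,
// so the RegExp constructor receives the pattern \d+.
var fromString = new RegExp("\\d+");
var fromLiteral = /\d+/;
console.log(fromString.source === fromLiteral.source); // true

// With a single backslash the string literal drops it: "\d+" is just "d+".
var accidental = new RegExp("\d+");
console.log(accidental.source);       // d+
console.log(accidental.test("123"));  // false
console.log(accidental.test("add"));  // true
```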

Some helpful regular expression patterns:

Simple character chains:
/abc/ - Matches the first occurrence of the 'abc' string.
/12/ - Matches the first occurrence of the '12' string.
Unlimited repetition of a character:
/ab*c/ - Matches the first occurrence of the string that begins with 'a', followed by any number of 'b' and ends with 'c'. Possible strings: 'ac', 'abc', 'abbc', 'abbbc', ...
/ab+c/ - Matches the first occurrence of the string that begins with 'a', followed by at least one 'b' and ends with 'c'. Possible strings: 'abc', 'abbc', 'abbbc', ...
/ab{2,}c/ - Matches the first occurrence of the string that begins with 'a', followed by two or more 'b' and ends with 'c'. Possible strings: 'abbc', 'abbbc', 'abbbbc', ...
Unlimited repetition of a text:
/(ab)*c/ - Matches the first occurrence of the string that begins with zero or more 'ab', followed by 'c'. Possible strings: 'c', 'abc', 'ababc', 'abababc', ...
/(ab)+c/ - Matches the first occurrence of the string that begins with one or more 'ab', followed by 'c'. Possible strings: 'abc', 'ababc', 'abababc', ...
/(ab){2,}c/ - Matches the first occurrence of the string that begins with two or more 'ab', followed by 'c'. Possible strings: 'ababc', 'abababc', 'ababababc', ...
Limited repetition of a character:
/ab?c/ - Matches the first occurrence of the string that begins with 'a', followed by at most one 'b' and ends with 'c'. Possible strings: 'ac' 'abc'.
/ab{1,2}c/ - Matches the first occurrence of the string that begins with 'a', followed by one or two 'b' and ends with 'c'. Possible strings: 'abc' 'abbc'.
/ab{2}c/ - Matches the first occurrence of the 'abbc' string.
Limited repetition of a text:
/(ab)?c/ - Matches the first occurrence of the string that begins with at most one 'ab', followed by 'c'. Possible strings: 'c' 'abc'.
/(ab){1,2}c/ - Matches the first occurrence of the string that begins with one or two 'ab', followed by 'c'. Possible strings: 'abc' 'ababc'.
/(ab){2}c/ - Matches the first occurrence of the 'ababc' string.
Match whole pattern only:
/^whole$/ - Checks if the analyzed string is 'whole'.
/^ca(t|r)$/ - Checks if the analyzed string is 'cat' or 'car'.
Character sets:
/[abc]/ - Matches the first occurrence of the 'a', 'b' or 'c' character.
/[ab]{1,2}/ - Matches the first occurrence of the 'a', 'b', 'aa', 'ab', 'ba' or 'bb' string.
/[^abc]/ - Matches the first occurrence of a character that is not 'a', 'b' or 'c'.
Numbers:
/\d/ - Matches the first occurrence of any digit.
/\d{1,2}/ - Matches the first occurrence of one or two digits.
/^(\d|[1-9]\d)$/ - Checks if the analyzed string is an integer between 0 and 99. Similar to the previous regular expression, but '00', '01', '02', ..., '09' are not allowed. The ^ and $ characters are needed here: alternatives are tried from left to right, so without them the first alternative (\d) always succeeds and only a single digit is matched.
/^(\+|-)?(\d|[1-9]\d)$/ - Checks if the analyzed string is an integer between 0 and 99 with or without sign. Note that the '\' character needs to be used before the '+' character, because the '+' character has special meaning in regular expressions.
Whitespaces:
/\sabc/ - Matches the first occurrence of the string that begins with a whitespace, followed by 'abc'.
/\sabc\s/ - Matches the first occurrence of the string that begins with a whitespace, followed by 'abc' and ends with a whitespace.
/\s\S+\s/ - Matches the first occurrence of the string that begins with a whitespace, followed by at least one non-whitespace character and ends with a whitespace.
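Some of the patterns above can be tried out with the test and exec methods. The following is a minimal sketch with arbitrarily chosen input strings. The last two lines illustrate that alternatives are tried from left to right, so their order matters when the pattern is not anchored with ^ and $.

```javascript
// Quantifiers: * allows zero 'b', + needs at least one, {2,} at least two.
console.log(/ab*c/.test("ac"));        // true
console.log(/ab+c/.test("ac"));        // false
console.log(/ab{2,}c/.test("abc"));    // false
console.log(/ab{2,}c/.test("abbbc"));  // true

// Grouping repeats the whole 'ab' text instead of only the 'b' character.
console.log(/(ab){2}c/.exec("xxababc")[0]);  // ababc

// ^ and $ force the pattern to cover the whole string.
console.log(/^ca(t|r)$/.test("cat"));   // true
console.log(/^ca(t|r)$/.test("cart"));  // false

// Alternatives are tried left to right.
console.log(/\d|[1-9]\d/.exec("42")[0]);  // 4
console.log(/[1-9]\d|\d/.exec("42")[0]);  // 42
```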

Special Characters:

Several special characters can be used in a regular expression. With these characters various patterns can be specified.
The following table lists the supported special characters.
Character Description
\ The backslash character has two different meanings, depending on the character that follows.
  • For an alphanumeric character, it indicates that it is a special character.
    For example, /d/ - matches a 'd' character, but /\d/ - matches a digit character.
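    Some of the alphanumeric characters have no meaning together with the backslash character. Avoid using the backslash character with these characters: depending on the JavaScript engine and mode, it either causes an error or simply matches the character itself.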
  • For a non-alphanumeric character, it indicates that it is a normal character.
    It can be useful if you need a pattern matching for a special character.
    For example, /a*/ - matches any number of 'a' characters, but /a\*/ - matches only the 'a*' string.
^ Indicates that the match must start at the beginning of the input string. If the multiline flag is specified, the match can also start after a \r or \n character. For example, /^m/ has no match in 'woman', but has a match in 'man'.
$ Indicates that the match must end at the end of the input string. If the multiline flag is specified, the match can end at a \r or \n character. For example, /m$/ has no match in 'format', but has a match in 'form'.
* Matches the preceding item zero or more times. Equivalent to {0,}. For example, pattern: /b*a/g - input: 'hubba and bubba'.
+ Matches the preceding item at least one time. Equivalent to {1,}. For example, pattern: /b+a/g - input: 'hubba and bubba'.
? Matches the preceding item zero or one time. Equivalent to {0,1}. For example, pattern: /b?a/g - input: 'hubba and bubba'.
. (dot) Matches any single character except the newline characters (\n, \r, \u2028 or \u2029). Use the [\s\S] pattern to match any character including newline characters. For example, pattern: /.s/g - input: 'sail the seas'.
(pattern) Matches pattern and stores the match (also called capturing parentheses, a captured subexpression or a parenthesized subexpression). The stored parts of a match can be retrieved through the $1, $2, ... , $9 properties.
(?:pattern) Matches pattern, but does not store the match (also called non-capturing parentheses or a non-captured subexpression).
pattern1(?=pattern2) Matches pattern1 only when pattern1 is followed by pattern2. For example, /a(?=b)/g matches 'a' letter only if the 'a' letter is followed by a 'b' letter. Pattern: /a(?=b)/g - input: 'abracadabra'.
pattern1(?!pattern2) Matches pattern1 only when pattern1 is not followed by pattern2. For example, /a(?!b)/g matches 'a' letter only if the 'a' letter is not followed by a 'b' letter. Pattern: /a(?!b)/g - input: 'abracadabra'.
pattern1|pattern2 Matches pattern1 or pattern2. For example, pattern: /(a|r)/g - input: 'agreed'.
{n} Where n is a nonnegative integer. Matches the preceding item exactly n times. For example, pattern: /9{2}/g - input: '79799799979999'.
{n,} Where n is a nonnegative integer. Matches the preceding item at least n times. For example, pattern: /9{2,}/g - input: '79799799979999'.
{n,m} Where n and m are nonnegative integers and n <= m. Matches the preceding item at least n and at most m times. For example, pattern: /9{2,3}/g - input: '79799799979999'.
[xyz] A character set. Matches any one of the enclosed characters. For example, pattern: /[adr]/g - input: 'agreed'.
[^xyz] A negated character set. Matches any character that is not enclosed. For example, pattern: /[^adr]/g - input: 'agreed'.
[a-z] A range of characters. Matches any character in the specified range. For example, pattern: /[a-d]/g - input: 'agreed'.
[^a-z] A negated range of characters. Matches any character that is not in the specified range. For example, pattern: /[^a-d]/g - input: 'agreed'.
\b Matches a word boundary, not a backspace! The [\b] pattern matches a backspace. A word boundary can be at the beginning or end of a word; it is the position between a word character and a non-word character (see the \w and \W patterns). For example, pattern: /\ba/g - input: 'an abracadabra'.
\B Matches a non-word boundary. For example, pattern: /\Ba/g - input: 'an abracadabra'.
\cx Where x is a character (letter) from A-Z or a-z. Matches the control character. For example, \cI matches Control-I (tab).
\d Matches a digit. Equivalent to [0-9]. For example, pattern: /\d/g - input: 'a1b23c'.
\D Matches a non-digit. Equivalent to [^0-9]. For example, pattern: /\D/g - input: 'a1b23c'.
\f Matches a form-feed character. Equivalent to \x0c. (The form-feed character causes the printer to advance one page length or to the top of the next page.)
\n Matches a newline character. Equivalent to \x0a.
\r Matches a carriage return character. Equivalent to \x0d.
\s Matches any white space character. For example, pattern: /s\sa/g - input: 'Thats all'.
\S Matches any non-white space character. For example, pattern: /\Sa/g - input: 'Thats all'.
\t Matches a tab character. Equivalent to \x09.
\v Matches a vertical tab character. Equivalent to \x0b.
\w Matches an alphanumeric (letter or number) character including underscore. Equivalent to [A-Za-z0-9_]. For example, pattern: /\w+/g - input: '5+23'.
\W Matches a non-word (non-alphanumeric) character. Equivalent to [^A-Za-z0-9_]. For example, pattern: /\W+/g - input: '5+23'.
\n n is a positive integer. If the entire regular pattern contains at least n captured subexpressions before (see (pattern)), then the \n is a back reference to the nth subexpression, else n must be an octal code (see \ooo). For example, pattern: /(\d+)\+\1/g - input: '23+12, 23+23, 12+12+23'.
\0 Matches a NULL character.
\ooo Matches a character specified by its octal code ooo. For example (the code of the '+' character is \053), pattern: /\053/g - input: '23+12'.
\xhh Matches a character specified by its hexadecimal code hh. For example (the code of the '+' character is \x2B), pattern: /\x2B/g - input: '23+12'.
\uhhhh Matches a character specified by its Unicode code hhhh. For example (the code of the '+' character is \u002B), pattern: /\u002B/g - input: '23+12'.
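The table gives patterns and inputs, but not the resulting matches. The sketch below runs a few of those pairs through the String object's match and replace methods. Replacing each match with an uppercase 'A' is just one illustrative way to make the matched positions visible, and console.log is used so that the snippet also runs outside a browser.

```javascript
var input = "hubba and bubba";

// * allows zero 'b', so the lone 'a' of "and" matches too.
var star = input.match(/b*a/g).join(",");
console.log(star);      // bba,a,bba
// + requires at least one 'b'.
var plus = input.match(/b+a/g).join(",");
console.log(plus);      // bba,bba
// ? allows at most one 'b'.
var optional = input.match(/b?a/g).join(",");
console.log(optional);  // ba,a,ba

// Lookahead checks what follows without making it part of the match.
console.log("abracadabra".replace(/a(?=b)/g, "A"));  // AbracadAbra
console.log("abracadabra".replace(/a(?!b)/g, "A"));  // abrAcAdabrA

// \b matches only at a word boundary, \B only inside a word.
console.log("an abracadabra".replace(/\ba/g, "A"));  // An Abracadabra
console.log("an abracadabra".replace(/\Ba/g, "A"));  // an abrAcAdAbrA

// \1 must repeat exactly what the first group captured.
var repeated = "23+12, 23+23, 12+12+23".match(/(\d+)\+\1/g).join(",");
console.log(repeated);  // 23+23,12+12
```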

Members:

Instances of the RegExp object inherit from the RegExp.prototype object, while the RegExp constructor itself inherits from the Function.prototype object. The following lists only contain the members of the RegExp and RegExp.prototype objects.

Properties:

Property Description
$1...$9
These properties are filled by the exec and test methods and contain the matching substrings of captured subexpressions (if any).
The $1...$9 properties are static: they cannot be accessed from an instance of the RegExp object, only as RegExp.$1, ..., RegExp.$9. They contain the result of only the last exec or test call. Therefore, when several instances of the RegExp object are used simultaneously, the $1...$9 properties are not reliable. In that case, avoid the test method and use the exec method instead; the array returned by the exec method also contains the matching substrings of captured subexpressions.
See Examples 7 and 8 for details.
To read more about the array returned by the exec method, see the 'Additional members of an array returned by the exec method' section.
global*
Returns a Boolean value that indicates whether the global matches flag is specified or not.
ignoreCase*
Returns a Boolean value that indicates whether the case-insensitive matches flag is specified or not.
index
This property is filled by the exec and test methods and contains the zero-based start position of the match in the input string. The index property is static: it cannot be accessed from an instance of the RegExp object, only as RegExp.index. The RegExp.index property contains information only about the last exec or test call. For a better and cross-browser solution, use the index property of the array returned by the exec method.
See Examples 3 and 4 for details.
To read more about the returned array, see the 'Additional members of an array returned by the exec method' section.
input
This property is filled by the exec or the test method and contains the input string of the last search. The input property is static: it cannot be accessed from an instance of the RegExp object, only as RegExp.input. The RegExp.input property contains information only about the last exec or test call. For a better and cross-browser solution, use the input property of the array returned by the exec method. To read more about the returned array, see the 'Additional members of an array returned by the exec method' section.
lastIndex*
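This property is set by the exec or the test method and contains the zero-based position immediately after the end of the match in the input string, which is where the next search starts. In standards-compliant browsers it is updated only when the global flag is specified, and it is reset to 0 when no further match is found. This property can be accessed from an instance of the RegExp object, unlike the index property. The static RegExp.lastIndex property is only supported by Internet Explorer.
See Examples 3 and 4 for details.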
multiline*
Returns a Boolean value that indicates whether the multiline matches flag is specified or not.
source*
Returns the pattern specified for the regular expression.
prototype
Returns a reference to the RegExp.prototype object. Properties and methods added to the RegExp.prototype object can be used with every instance of the RegExp object, like any predefined property or method. The prototype property is static: it cannot be accessed from an instance of the RegExp object, only as RegExp.prototype.

(*) - The property is inherited from the RegExp.prototype.
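How the instance properties behave during repeated searches can be sketched as follows. The pattern, the sample string and the variable names are assumptions chosen for illustration, and the behavior shown is that of standards-compliant engines.

```javascript
var digits = /\d+/g;
var sample = "a1b23c";
var positions = [];

// Each successful test call moves lastIndex to just after the match.
while (digits.test(sample)) {
    positions.push(digits.lastIndex);
}
console.log(positions.join(","));  // 2,5
// After the failed final call, lastIndex is reset to 0.
console.log(digits.lastIndex);     // 0

// The flag and source properties describe how the object was created.
console.log(digits.global, digits.ignoreCase, digits.multiline);  // true false false
console.log(digits.source);        // \d+

// Without the g flag, lastIndex stays 0 and every search starts over.
var once = /\d+/;
once.test(sample);
console.log(once.lastIndex);       // 0
```

Because lastIndex persists between calls, reusing one global RegExp object on different input strings can skip matches; resetting lastIndex to 0 before each new string avoids that.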

Methods:

Method Description
compile (newPattern, [flags])*
Replaces the pattern of the RegExp object with the given newPattern, and compiles it into an internal format for faster execution.
newPattern Required. String that specifies the pattern to compile.
flags Optional. String that can be a combination of the modifier flags ('g' (global), 'i' (ignorecase), 'm' (multiline)).
See Example 4 for details.
exec (string)*
Executes a regular expression search within a string, and returns the results in an array, or null.
The returned array contains details about the match. To read more about the returned array, see the 'Additional members of an array returned by the exec method' section.
string Required. Specifies the input string.
test (string)*
Executes a regular expression search within a string, and returns a Boolean value that indicates whether a match was found. You can use the static RegExp.$n and RegExp.index properties and the non-static lastIndex property of the RegExp instance to get details about the match.
string Required. Specifies the input string.
toSource ( )*
Returns a string representing the source code of the current regular expression.
toString ( )*
Returns a string representing the value of the current regular expression.
When a regular expression needs to be converted to a string, the JavaScript interpreter automatically calls its toString method.

(*) - The method is inherited from the RegExp.prototype.
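The return values of the test, exec and toString methods can be sketched as follows. The pattern and the input strings are illustrative assumptions; console.log is used so that the snippet also runs outside a browser.

```javascript
var personRe = /(wo)?man/i;

// test returns only a Boolean value.
var found = personRe.test("Woman");
console.log(found);                  // true

// exec returns null when there is no match.
var nothing = personRe.exec("child");
console.log(nothing);                // null

// toString returns the pattern together with its delimiters and flags.
console.log(personRe.toString());    // /(wo)?man/i

// String concatenation calls toString automatically.
var label = "pattern: " + personRe;
console.log(label);                  // pattern: /(wo)?man/i
```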

Working with regular expressions:

Example 1:

Find the first occurrence of a compound word:
var str = "one cannot hear oneself speak here";
var re = /one\S+/;
var match = re.exec(str);
if (match) {
    document.write (match[0]);
}
The output is 'oneself', because the pattern requires 'one' to be followed by at least one non-whitespace character, so the standalone word 'one' at the beginning of the string does not match.

Example 2:

Find the first occurrence of two consecutive digits:
var str = "3 is to 6 as 6 is to 12";
var re = /\d{2}/;
var match = re.exec(str);
if (match) {
    document.write (match[0]);
}
The output is '12', because the RegExp pattern matches two digits.

Example 3:

Search for words and display the start and end positions of the matches:
var str = "The apple is a tree.";
var re = /\w+/g;
var match;
while ((match = re.exec(str)) != null) {
    document.write (match.index + "-" + re.lastIndex + " " + match[0]);
    document.write ("<br />");
}
The output:
0-3 The
4-9 apple
10-12 is
13-14 a
15-19 tree

Example 4:

Implementing the previous example with the compile method:
var str = "The apple is a tree.";
var re = new RegExp();
// compile is a legacy method; new RegExp("\\w+", "g") creates an equivalent object
re.compile ("\\w+", "g");
var match;
while ((match = re.exec(str)) != null) {
    document.write (match.index + "-" + re.lastIndex + " " + match[0]);
    document.write ("<br />");
}
The output:
0-3 The
4-9 apple
10-12 is
13-14 a
15-19 tree

Example 5:

Replacing some content with the String object's replace method:
var str = "man and woman are persons";
var re = /(wo)?man/g;
var newStr = str.replace (re, "person");
document.write (newStr);
The output:
person and person are persons

Example 6:

Converting a date to another format with the use of the String object's replace method and captured subexpressions.
var re = /(\d{2})\W(\d{2})\W(\d{4})/;
var str = "06/12/2009";
var newStr = str.replace (re, "$3-$2-$1");
document.write (newStr);
The output is 2009-12-06, because
$1 identifies the substring matched by the first parenthesized subexpression (\d{2}),
$2 identifies the substring matched by the second parenthesized subexpression (\d{2}),
$3 identifies the substring matched by the third parenthesized subexpression (\d{4}).

Example 7:

Implementing the previous example with the exec method, and with the elements of the returned array:
var re = /(\d{2})\W(\d{2})\W(\d{4})/;
var str = "06/12/2009";
var match = re.exec(str);
if (match) {
    var newStr = match[3] + "-" + match[2] + "-" + match[1];
    document.write (newStr);
}
The output is 2009-12-06.

Example 8:

Implementing the previous example with the exec method, and with the $1...$9 properties:
var re = /(\d{2})\W(\d{2})\W(\d{4})/;
var str = "06/12/2009";
var match = re.exec(str);
if (match) {
    var newStr = RegExp.$3 + "-" + RegExp.$2 + "-" + RegExp.$1;
    document.write (newStr);
}
The output is 2009-12-06.

Additional members of an array returned by the exec method:

The exec method returns an instance of a standard Array object, extended with properties shown in the following table.

Properties:

Property Description
index
Returns the zero-based start position of the match in the input string.
input
Returns the input string.
lastIndex
Returns the zero-based end position of the match in the input string. This property is only supported by Internet Explorer, use the lastIndex property of the RegExp object instead.
The first element of the returned Array object is the matching substring in the input string. If the pattern of the regular expression contains parenthesized subexpressions, the remaining array elements contain the substrings matched by those subexpressions, in order. See Example 7 for details.
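The contents of the returned array can be inspected as in the following sketch. The input string and variable names are assumptions chosen for illustration, and console.log is used so that the snippet also runs outside a browser.

```javascript
var dateRe = /(\d{2})\W(\d{2})\W(\d{4})/;
var m = dateRe.exec("Due: 06/12/2009");

// Element 0 is the whole match, elements 1..3 are the captured subexpressions.
console.log(m[0]);                // 06/12/2009
console.log(m[1], m[2], m[3]);    // 06 12 2009
console.log(m.length);            // 4

// index and input are extra properties on the same array.
console.log(m.index);             // 5
console.log(m.input);             // Due: 06/12/2009

// The result is a regular Array instance.
console.log(Array.isArray(m));    // true
```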
