Programmer's Guide to Regular Expressions

Credits: This tutorial was written by David Andersson (Liorean). Please see the footnote for more information on the author.

What is a regular expression?

Regular expressions are a form of pattern matching that you can apply to textual content. Take, for example, the DOS wildcards ? and *, which you can use when you're searching for a file. They work like a very limited subset of regular expressions (RegExp for short). For instance, if you want to find all files beginning with "fn", followed by 1 to 4 arbitrary characters, and ending with "ht.txt", you can't do that with the usual DOS wildcards. RegExp, on the other hand, can handle that and much more complicated patterns.
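As a minimal sketch, the file name example above could be written in JavaScript like this. The helper name matchesFnPattern and the sample file names are made up for illustration, and "arbitrary characters" is read here as any character except a line break.

```javascript
// Hypothetical helper: "fn", then 1 to 4 arbitrary characters, then "ht.txt".
function matchesFnPattern(fileName) {
  // ^ and $ tie the pattern to the whole name; \. matches a literal dot.
  return /^fn.{1,4}ht\.txt$/.test(fileName);
}

console.log(matchesFnPattern("fnabht.txt"));    // true  (2 characters in between)
console.log(matchesFnPattern("fnht.txt"));      // false (0 characters in between)
console.log(matchesFnPattern("fnabcdeht.txt")); // false (5 characters in between)
```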

Regular expressions are, in short, a way to handle data effectively: they let you search and replace strings and provide extended string handling. A single regular expression can often do work that the built-in string methods and properties could only do inside a more complicated function or loop.
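To illustrate the point about loops, here is a small sketch that collapses runs of spaces, tabs, and line breaks into single spaces, once with a hand-written loop and once with a single replace call. Both function names and the sample string are invented for this example.

```javascript
// Loop-based version: walk the string one character at a time.
function collapseWithLoop(text) {
  var result = "";
  var inSpace = false;
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (ch === " " || ch === "\t" || ch === "\n") {
      // Emit one space for the whole run of whitespace.
      if (!inSpace) { result += " "; }
      inSpace = true;
    } else {
      result += ch;
      inSpace = false;
    }
  }
  return result;
}

// RegExp version: one replace call does the same job.
function collapseWithRegExp(text) {
  return text.replace(/[ \t\n]+/g, " ");
}

var sample = "one  two\t\tthree\n four";
console.log(collapseWithLoop(sample));   // one two three four
console.log(collapseWithRegExp(sample)); // one two three four
```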

RegExp Syntax

There are two ways of defining regular expressions in JavaScript: one through an object constructor and one through a literal. The constructor builds the expression at runtime from a string, so the pattern can change while the script runs, whereas a literal is compiled when the script is loaded and provides better performance. The literal is the best choice for regular expressions known in advance, while the constructor is better for dynamically constructed regular expressions such as those from user input. In almost all cases you can define a regular expression either way, and it will be handled in exactly the same way no matter how you declare it.
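A short sketch of that equivalence, using made-up variable names and sample text. One practical difference worth noting: because the constructor takes an ordinary string, backslashes in the pattern have to be doubled there.

```javascript
// The same pattern declared both ways.
var literal = /mac/i;
var constructed = new RegExp("mac", "i");

console.log(literal.source === constructed.source);        // true
console.log(literal.ignoreCase && constructed.ignoreCase); // true

// Both give the same answer on the same input.
var sampleText = "Apple Macintosh";
console.log(literal.test(sampleText), constructed.test(sampleText)); // true true

// In the constructor the pattern is a string, so \d must be written "\\d".
var digitsLiteral = /\d+/;
var digitsConstructed = new RegExp("\\d+");
console.log(digitsLiteral.source === digitsConstructed.source); // true
```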

Declaration

Here are the ways to declare a regular expression in JavaScript. While other languages such as PHP or VBScript use other delimiters, in JavaScript you use the forward slash (/) to delimit RegExp literals.

RegExp Literal
Syntax: /pattern/flags;
Example: var re = /mac/i;

RegExp Object Constructor
Syntax: new RegExp("pattern","flags");
Example: var re = new RegExp(window.prompt("Please input a regex.","yes|yeah"),"g");
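The constructor example above could be exercised roughly as follows. This is only a sketch: the variable userPattern stands in for the value returned by window.prompt so that the code also runs outside a browser, and tryCompile is a hypothetical helper showing one way to cope with an invalid pattern typed in by a user.

```javascript
// Stand-in for window.prompt(...); in a browser this would come from the user.
var userPattern = "yes|yeah";

var re = new RegExp(userPattern, "g");
var answers = "yes, yeah, no, yes".match(re);
console.log(answers); // [ 'yes', 'yeah', 'yes' ]

// An invalid pattern makes the constructor throw, so user input is
// worth wrapping in try/catch. Returns null when the pattern is invalid.
function tryCompile(pattern, flags) {
  try {
    return new RegExp(pattern, flags);
  } catch (e) {
    return null;
  }
}

console.log(tryCompile("(", "g") === null);        // true (unbalanced parenthesis)
console.log(tryCompile(userPattern, "g") !== null); // true
```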

Flags

This guide covers three flags that you may use on a RegExp. The multiline flag is supported only in JavaScript 1.5+, but the other two are supported in pretty much every browser that can handle RegExp (JavaScript 1.2+). These flags can be used in any order or combination, and are an integral part of the RegExp.

g (Global Search)
The global search flag makes the RegExp search for the pattern throughout the whole string instead of stopping at the first match. With the string match method, for example, this produces an array of all occurrences that match the given pattern.

i (Ignore Case)
The ignore case flag makes a regular expression case insensitive. For international coders, note that this might not work on extended characters.

m (Multiline Input)
This flag makes the beginning-of-input (^) and end-of-input ($) anchors also match at the beginning and end of each line, respectively. JavaScript 1.5+ only.
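The effect of the three flags can be sketched with the string match method. The sample text and the variable name flagText are invented for this example.

```javascript
var flagText = "Mac one\nmac two\nMAC three";

// g: match returns an array of every occurrence, not just the first.
console.log(flagText.match(/mac/g));    // [ 'mac' ]

// g + i: case is ignored, so all three spellings are found.
console.log(flagText.match(/mac/gi));   // [ 'Mac', 'mac', 'MAC' ]

// Without m, ^ only matches at the very start of the input.
console.log(flagText.match(/^mac/gi));  // [ 'Mac' ]

// With m, ^ also matches at the start of each line.
console.log(flagText.match(/^mac/gim)); // [ 'Mac', 'mac', 'MAC' ]

// The order of the flags does not matter.
console.log(flagText.match(/^mac/mig).length); // 3
```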