
Introductory Guide to Regular Expressions

Credits: This tutorial was written by Karen Gayda. It was modified by JavaScriptKit.com for structure, with additional content and examples added. Please see the footnote for more information on the author.

Introduction

Validating user input is the bane of every software developer’s existence. When you are developing cross-browser web applications, this task becomes even less enjoyable because JavaScript lacks useful built-in validation functions. Fortunately, JavaScript 1.2+ supports regular expressions. In this article I will present a brief tutorial on the basics of regular expressions and then give some examples of how they can be used to simplify data validation.

Regular Expressions and Patterns

Regular expressions are very powerful tools for performing pattern matches. Perl programmers and UNIX shell programmers have enjoyed the benefits of regular expressions for years. Once you master the pattern language, most validation tasks become trivial. With regular expressions, complex tasks that once required lengthy procedures can be performed in just a few lines of code.

So how are regular expressions implemented in JavaScript? There are two ways:

1) Using literal syntax.
2) Using the RegExp() constructor, when you need to construct the regular expression dynamically.

The literal syntax looks something like:

var RegularExpression = /pattern/;

while the RegExp() constructor syntax looks like:

var RegularExpression = new RegExp("pattern");

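The RegExp() constructor lets you build the search pattern dynamically from a string, which is useful when the pattern is not known ahead of time. Because the pattern is passed as a string, any backslash in it must be doubled (for example, "\\d" to produce \d).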
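To make the difference between the two forms concrete, here is a minimal sketch. The helper name makeDigitPattern and the sample value "90210" are made up for illustration and are not part of the original tutorial; the sketch only assumes the standard RegExp() constructor and the string search() method.

```javascript
// Literal syntax: the pattern is fixed when the script is written.
var literalRe = /^\d{5}$/;

// Constructor syntax: the pattern is assembled from a string at run time.
// makeDigitPattern is a hypothetical helper used only for this illustration.
function makeDigitPattern(count) {
	// Inside a string literal the backslash must be doubled: "\\d" yields \d.
	return new RegExp("^\\d{" + count + "}$");
}

var fiveDigits = makeDigitPattern(5); // equivalent to /^\d{5}$/
var fourDigits = makeDigitPattern(4); // equivalent to /^\d{4}$/

// search() returns -1 when the pattern is not found.
var literalMatch = "90210".search(literalRe) != -1;    // true
var dynamicMatch = "90210".search(fiveDigits) != -1;   // true
var wrongLength  = "90210".search(fourDigits) != -1;   // false
```

The literal form is usually the easier one to read; the constructor form is worth the extra escaping only when part of the pattern (here, the digit count) is decided while the script runs.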

To use regular expressions to validate a string, you need to define a pattern that represents the search criteria, then use a relevant string method to carry out the action (i.e., search, replace, etc.). Patterns are defined using literal characters and metacharacters. For example, the following regular expression determines whether a string contains a valid 5-digit US postal code (for the sake of simplicity, other possibilities are not considered):

<script>
function checkpostal(){
	var re5digit=/^\d{5}$/; //regular expression matching exactly 5 digits
	if (document.myform.myinput.value.search(re5digit)==-1) //search() returns -1 when the match fails
		alert("Please enter a valid 5 digit number inside form");
}
</script>

<form name="myform">
<input type="text" name="myinput" size=15>
<input type="button" onClick="checkpostal()" value="check">

</form>

Let's deconstruct the regular expression used, which checks that a string contains a valid 5-digit number, and ONLY a 5-digit number:

var re5digit=/^\d{5}$/
  • ^ indicates the beginning of the string. Using a ^ metacharacter requires that the match start at the beginning.
  • \d indicates a digit character and the {5} following it means that there must be 5 consecutive digit characters.
  • $ indicates the end of the string. Using a $ metacharacter requires that the match end at the end of the string.

Translated to English, this pattern states: "Starting at the beginning of the string there must be nothing other than 5 digits. There must also be nothing following those 5 digits."
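The effect of the ^ and $ anchors is easiest to see by comparing the pattern with an unanchored version. The following is a small sketch under stated assumptions: the unanchored pattern, the helper matches(), and the sample strings are invented for illustration and do not appear in the original example.

```javascript
var re5digit = /^\d{5}$/;  // the pattern from the article
var unanchored = /\d{5}/;  // same pattern without ^ and $, for comparison only

// matches() is a hypothetical helper: true when search() finds the pattern.
function matches(re, text) {
	return text.search(re) != -1;
}

var exact       = matches(re5digit, "90210");       // true: exactly 5 digits
var tooLong     = matches(re5digit, "902101");      // false: $ rejects the 6th digit
var tooShort    = matches(re5digit, "9021");        // false: only 4 digits
var withPrefix  = matches(re5digit, "zip 90210");   // false: ^ rejects the leading text
var loosePrefix = matches(unanchored, "zip 90210"); // true: 5 digits appear somewhere
var looseLong   = matches(unanchored, "902101");    // true: the first 5 digits match
```

Without the anchors, the pattern only asks whether five consecutive digits occur anywhere in the string, which is why validation patterns normally include both ^ and $.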

Now that you've got a taste of what regular expressions are all about, let's look formally at their syntax, so you can create complex expressions that validate virtually anything you want.