
Introduction to Regular Expressions in JavaScript


In this tutorial, you will learn all you need to get started with regular expressions in JavaScript. You will learn how to create new expressions, how to use them and how to test them. You will also learn how to create simple and complex patterns and about special symbols and characters.

Introduction

The first thing we should clarify is what regular expressions are. Regular expressions are a way to describe patterns, or rules if you want. You can then use these patterns on strings to check if those strings contain, or match, those patterns. One good thing about regular expressions is that you can use them in many programming languages.

Regular expressions are not just another JavaScript feature. They are basically a small language of their own, one that is largely independent of the language you use them in. Another good thing is that regular expressions can be incredibly useful. They can help you do a lot with strings with very little code.

The bad thing is that regular expressions often look weird, even scary. This is especially true of more complex patterns. This is also one reason many programmers are not really excited to learn about them. That is a mistake. Regular expressions can be really powerful and save you a lot of code. I hope this tutorial will help you overcome this hesitation.

How to create Regular expressions

If you want to create a regular expression in JavaScript, or describe some pattern, there are two ways to do it.

Regular expression constructor

The first one is by using the regular expression constructor. This is a fancy name for the RegExp constructor function. This constructor accepts two parameters. The first parameter is the pattern you want to describe. You should always provide it. Technically you can omit it, but then you get an empty pattern that matches any string. After all, why create a regular expression without any pattern?

The second parameter is a string with flags. Don’t worry, you will learn about flags soon. This parameter is optional. One thing you should remember about flags is that you can’t add them or remove them later, after creating the regular expression. So, if you want to use any flag, make sure to add it when you create the regular expression.

// Regular expression constructor syntax
new RegExp(pattern[, flags])


// Create regular expression
// with Regular expression constructor
// without any flags
const myPattern = new RegExp('[a-z]')


// Create regular expression
// with Regular expression constructor
// with one flag
const myPattern = new RegExp('[a-z]', 'g')
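
Since flags are fixed once the regular expression exists, one way to "change" them is to build a new expression from the old one. The following is a minimal sketch that relies on the standard source and flags properties. The variable names are only illustrative.

```javascript
// Original pattern without any flags
const basePattern = new RegExp('[a-z]')

// The 'global' property reports whether the 'g' flag is set
console.log(basePattern.global)
// false

// Build a new expression that reuses the pattern and adds the 'g' flag
const globalPattern = new RegExp(basePattern.source, basePattern.flags + 'g')

console.log(globalPattern.source)
// '[a-z]'

console.log(globalPattern.flags)
// 'g'

console.log(globalPattern.global)
// true
```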

Regular expression literal

The second way to create a regular expression is by using a regular expression literal. Just like the constructor, the literal is made of two parts. The first one is the pattern you want to describe. This pattern is wrapped in forward slashes (/pattern/). The second part is the flags, which follow the closing slash. Flags are optional.

// Regular expression literal syntax
/pattern/flags


// Create regular expression
// with regular expression literal
// without any flags
const myPattern = /[a-z]/


// Create regular expression
// with regular expression literal
// with one flag
const myPattern = /[a-z]/g

Note: The regular expression literal uses forward slashes to enclose the pattern you want to describe. If you want to add one or more forward slashes as a part of the pattern, you have to escape them with a backslash (\), i.e. \/.
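
To make the escaping rule concrete, here is a small sketch that compares the literal with the constructor. The path '/docs/intro' is just an example value. The last part shows a related detail of the constructor: because the pattern is passed as a string, backslashes have to be doubled.

```javascript
// In a literal, forward slashes inside the pattern must be escaped
const literalPath = /\/docs\/intro/

// In the constructor, the pattern is a string, so slashes need no escaping
const constructorPath = new RegExp('/docs/intro')

console.log(literalPath.test('/docs/intro'))
// true

console.log(constructorPath.test('/docs/intro'))
// true

// Backslashes in the constructor string must be doubled,
// because the string itself consumes one level of escaping
const digitFromString = new RegExp('\\d')

console.log(digitFromString.test('7'))
// true
```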

Regular expression constructor or literal

The constructor and the literal are similar, but there is one important difference. The regular expression constructor is compiled at runtime, when the call is evaluated. The regular expression literal is compiled when your script is loaded. This means that the pattern in a literal is fixed in the source code, while the constructor can build a pattern from a string that is known only at runtime, such as user input.

So, if you need, or might need, to build the pattern on the fly, create the regular expression with the constructor, not the literal. On the other hand, if the pattern is known in advance and doesn’t change, use the literal.
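
As an illustration of building a pattern on the fly, here is a minimal sketch. Both helper functions, escapeForPattern() and createWordPattern(), are hypothetical names made up for this example. The escaping step is an assumption about what you usually want with user input: to match the text literally, not to treat it as a pattern.

```javascript
// Hypothetical helper: escape characters that have a special meaning
// in regular expressions, so the text is matched literally
function escapeForPattern(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Hypothetical helper: build a case-insensitive pattern at runtime
function createWordPattern(word) {
  return new RegExp(escapeForPattern(word), 'i')
}

// Pretend this value came from a user
const userInput = 'c++'
const dynamicPattern = createWordPattern(userInput)

console.log(dynamicPattern.source)
// 'c\+\+'

console.log(dynamicPattern.test('I like C++ and JavaScript.'))
// true

console.log(dynamicPattern.test('I like C and JavaScript.'))
// false
```

A literal can't do this, because the text between the slashes has to be written in the source code.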

How to use regular expressions with RegExp methods

Before we get to how to create patterns, let’s quickly discuss how to use them. This way, we will be able to use these methods later to test the patterns we create.

test()

There are a couple of methods you can use when you work with regular expressions. One of the simplest is test(). You pass the text you want to test as an argument when you use this method. When used, this method returns a Boolean, true if the string contains a match of your pattern or false if it doesn’t.

// test() syntax
// /somePattern/.test('Some text to test')


// Passing a string
// When test() doesn't find any match
/code/.test('There was a cat and dog in the house.')
// false


// Using a variable
// Create text for testing
const myString = 'The world of code.'

// Create pattern
const myPattern = /code/

// Test the text given a pattern
// When test() finds a match
myPattern.test(myString)
// true
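
Because test() returns a Boolean, it fits naturally into conditions and callbacks. The sketch below uses it to filter an array. The file names are made-up sample data.

```javascript
// Sample data for the example
const fileNames = ['index.js', 'style.css', 'app.js', 'README.md']

// Simple pattern: the text 'js' anywhere in the string
const jsFilePattern = /js/

// Keep only the names for which test() returns true
const jsFiles = fileNames.filter((name) => jsFilePattern.test(name))

console.log(jsFiles)
// [ 'index.js', 'app.js' ]
```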

exec()

Another method you can use is exec(). If there is a match, the exec() method returns an array. This array contains the matched text, the index at which the match was found, the input (the text you’ve been testing), and any groups. If there is no match, the exec() method returns null.

One thing to remember: each call to exec() returns information about a single match. Without the g flag, that is always the first match in the text, so a single call is not enough if you want to get multiple matches.

// exec() syntax
// /somePattern/.exec('Some text to test')


// Create some string for testing
const myString = 'The world of code is not full of code.'

// Describe pattern
const myPattern = /code/

// Use exec() to test the text
// When exec() finds a match
myPattern.exec(myString)
// [
//   'code',
//   index: 13,
//   input: 'The world of code is not full of code.',
//   groups: undefined
// ]


// Describe another pattern
const myPatternTwo = /JavaScript/

// Use exec() to test the text again with new pattern
// When exec() doesn't find any match
myPatternTwo.exec(myString)
// null
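
If you do need more than one match from exec(), one option is to add the g flag (covered in the Flags section) and call exec() repeatedly. The following is a sketch of that loop. The variable names are illustrative.

```javascript
const loopText = 'The world of code is not full of code.'

// With the 'g' flag, the pattern remembers where the last match ended
const loopPattern = /code/g

const foundIndexes = []
let currentMatch

// Each call continues from loopPattern.lastIndex.
// exec() returns null when there are no more matches.
while ((currentMatch = loopPattern.exec(loopText)) !== null) {
  foundIndexes.push(currentMatch.index)
}

console.log(foundIndexes)
// [ 13, 33 ]
```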

How to use regular expressions with String methods

The test() and exec() methods are not the only ones you can use to test for matches of a pattern in a string. There are also search(), match() and matchAll(). These methods are different: they don’t exist on the RegExp object, but on strings. However, they allow you to use regular expressions.

When you want to use these methods, you have to flip the syntax. You call these methods on strings, not patterns. And, instead of passing the string you want to test as an argument, you pass the pattern.

search()

The first one, search(), searches a string and looks for the given pattern. When it finds a match, it returns the index at which the match begins. If it doesn’t find any match, it returns -1. One thing to remember about search(): it only returns the index of the first match in the text. When it finds the first match, it stops.

// search() syntax
// 'Some text to test'.search(/somePattern/)


// Create some text for testing
const myString = 'The world of code is not full of code.'

// Describe pattern
const myPattern = /code/

// Use search() to search the text for the pattern
// When search() finds a match
myString.search(myPattern)
// 13


// Call search() directly on the string
// When search() doesn't find any match
'Another day in the life.'.search(myPattern)
// -1
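
The sketch below shows search() with a pattern that plain text search can't express, and confirms that the g flag doesn't change the result. The sample sentence is made up for this example.

```javascript
const orderText = 'Order number 4521 was shipped.'

// Find where the first digit appears
console.log(orderText.search(/[0-9]/))
// 13

// The 'g' flag makes no difference, search() still reports the first match
console.log(orderText.search(/[0-9]/g))
// 13

// No digits, so search() returns -1
console.log('No digits here.'.search(/[0-9]/))
// -1
```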

match()

The match() method is the second String method that allows you to use regular expressions. This method works similarly to exec(). If it finds a match, the match() method returns an array with the matched text, the index at which the match was found, the input text and any groups.

Also like exec(), if there is no match, the match() method returns null. When you use match() with a pattern that has the g flag, it returns an array with all matches, but without the index, input and groups details.

// match() syntax
// 'Some text to test'.match(/somePattern/)


// Create some text for testing
const myString = 'The world of code is not full of code.'

// Describe pattern
const myPattern = /code/

// Use match() to find the first match in the text
myString.match(myPattern)
// [
//   'code',
//   index: 13,
//   input: 'The world of code is not full of code.',
//   groups: undefined
// ]

'Another day in the life.'.match(myPattern)
// null


// Use match() to find all matches
// Create some text for testing
const myString = 'The world of code is not full of code.'

// Describe pattern
const myPattern = /code/g // add 'g' flag

// Use match() to find all matches in the text
myString.match(myPattern)
// [ 'code', 'code' ]
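
The difference between the two modes is easiest to see with capturing groups (the parentheses, explained in the Groups and ranges section). This is a small sketch with illustrative variable names.

```javascript
const groupText = 'The world of code is not full of code.'

// Without 'g': the first match plus its capturing groups
const singleMatch = groupText.match(/(world) of (code)/)

console.log(singleMatch[0])
// 'world of code'

console.log(singleMatch[1])
// 'world'

console.log(singleMatch[2])
// 'code'

console.log(singleMatch.index)
// 4

// With 'g': only the full matches, without groups or index
const globalMatch = groupText.match(/(world) of (code)/g)

console.log(globalMatch)
// [ 'world of code' ]
```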

matchAll()

Similarly to match(), the matchAll() method can also return all matches. It requires the g flag in the pattern and throws a TypeError without it. It also works differently. The matchAll() method returns a RegExp String Iterator object. When you want to get all matches from this object, there are a few things you can do.

First, you can use a for...of loop to iterate over the object and return or log each match. You can also use Array.from() to create an array from the content of the object. Or, you can use the spread operator, which achieves the same result as Array.from().

// matchAll() syntax
// 'Some text to test'.matchAll(/somePattern/g)

// Create some text for testing
const myString = 'The world of code is not full of code.'

// Describe pattern
const myPattern = /code/g // Note we are using 'g' flag

// Use matchAll() to find all matches in the text
const matches = myString.matchAll(myPattern)

// Use for...of loop to get all matches
for (const match of matches) {
  console.log(match)
}
// Each match is logged separately:
// [
//   'code',
//   index: 13,
//   input: 'The world of code is not full of code.',
//   groups: undefined
// ]
// [
//   'code',
//   index: 33,
//   input: 'The world of code is not full of code.',
//   groups: undefined
// ]


// Use Array.from() to get all matches
const matchesArray = Array.from(myString.matchAll(myPattern))
// [
//   [
//     'code',
//     index: 13,
//     input: 'The world of code is not full of code.',
//     groups: undefined
//   ],
//   [
//     'code',
//     index: 33,
//     input: 'The world of code is not full of code.',
//     groups: undefined
//   ]
// ]


// Use spread operator to get all matches
const matchesSpread = [...myString.matchAll(myPattern)]
// [
//   [
//     'code',
//     index: 13,
//     input: 'The world of code is not full of code.',
//     groups: undefined
//   ],
//   [
//     'code',
//     index: 33,
//     input: 'The world of code is not full of code.',
//     groups: undefined
//   ]
// ]
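
Where matchAll() really helps is when you want all matches together with their capturing groups. The following is a sketch under that assumption. The 'key=value' text and the variable names are made up for the example. It also checks the TypeError for a pattern without the g flag.

```javascript
const pairsText = 'width=100, height=250'

// Two capturing groups: the key and the value
const pairPattern = /([a-z]+)=([0-9]+)/g

const pairs = []
for (const pairMatch of pairsText.matchAll(pairPattern)) {
  // pairMatch[1] and pairMatch[2] hold the two capturing groups
  pairs.push({ key: pairMatch[1], value: pairMatch[2], index: pairMatch.index })
}

console.log(pairs)
// [
//   { key: 'width', value: '100', index: 0 },
//   { key: 'height', value: '250', index: 11 }
// ]

// Without the 'g' flag, matchAll() throws a TypeError
let threwTypeError = false
try {
  pairsText.matchAll(/[a-z]+/)
} catch (error) {
  threwTypeError = error instanceof TypeError
}

console.log(threwTypeError)
// true
```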

How to create simple patterns

You know how to create regular expressions and how to test them. Let’s take a look at how to create patterns. The easiest way to create regular expressions is by using simple patterns. This means using some specific text as the pattern. Then, you can test whether some string contains that text.

// Create simple pattern
// with regular expression literal
const myPattern = /JavaScript/

// Test a string with the pattern
myPattern.test('One of the most popular languages is also JavaScript.')
// true

// Test a string with the pattern
myPattern.test('What happens if you combine Java with scripting?')
// false
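
Keep in mind that a simple pattern has to appear in the string exactly as written, including letter case and spacing. A short sketch with made-up sentences:

```javascript
const exactPattern = /JavaScript/

// Different letter case, no match
console.log(exactPattern.test('I am learning javascript.'))
// false

// Extra space inside the word, no match
console.log(exactPattern.test('I am learning Java Script.'))
// false

// The exact text appears somewhere in the string
console.log(exactPattern.test('TypeScript builds on JavaScript.'))
// true
```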

How to create complex patterns with special symbols and characters

So far, we used regular expressions made of simple patterns. These patterns can be enough for some simple cases. However, they are not enough when we deal with more complex cases. This is when we have to create more complex patterns, and where special symbols and characters come into play. Let’s take a look at those that are used most often in regular expressions.

Character classes

Character classes are like shortcuts to different types of characters. For example, there are character classes for letters, digits, white space, etc.

/* Character class - Meaning */
. - Matches any single character except for line terminators such as newline.
\d - Matches a single digit (same as [0-9]).
\w - Matches a single alphanumeric word character in Latin alphabet, including underscore (same as [A-Za-z0-9_]).
\s - Matches a single white space character (space, tab, line break, etc.) (roughly the same as [ \t\r\n\v\f], plus other Unicode spaces).
\D - Matches a single character that is not a digit (same as [^0-9]).
\W - Matches a single character that is not a word character in Latin alphabet (same as [^A-Za-z0-9_]).
\S - Matches a single non-white space character (roughly the same as [^ \t\r\n\v\f]).

Examples:

// . - Matches any character except for newline
const myPattern = /./

console.log(myPattern.test(''))
// false

console.log(myPattern.test('word'))
// true

console.log(myPattern.test('9'))
// true


// \d - Matches a single digit
const myPattern = /\d/

console.log(myPattern.test('3'))
// true

console.log(myPattern.test('word'))
// false


// \w - Matches a single alphanumeric word character
const myPattern = /\w/

console.log(myPattern.test(''))
// false

console.log(myPattern.test('word'))
// true

console.log(myPattern.test('9'))
// true


// \s - Matches a single white space character
const myPattern = /\s/

console.log(myPattern.test(''))
// false

console.log(myPattern.test(' '))
// true

console.log(myPattern.test('foo'))
// false


// \D - Matches a single character that is not a digit
const myPattern = /\D/

console.log(myPattern.test('Worm'))
// true

console.log(myPattern.test('1'))
// false


// \W - Matches a single character that is not a word character
const myPattern = /\W/

console.log(myPattern.test('Worm'))
// false

console.log(myPattern.test('1'))
// false

console.log(myPattern.test('*'))
// true

console.log(myPattern.test(' '))
// true


// \S - Matches a single non-white space character
const myPattern = /\S/

console.log(myPattern.test('clap'))
// true

console.log(myPattern.test(''))
// false

console.log(myPattern.test('-'))
// true
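
Character classes become more useful when you combine several of them in one pattern. The sketch below describes "three word characters, one white space character, two digits". The room labels are made-up sample data.

```javascript
// Three word characters, a white space character, then two digits
const roomPattern = /\w\w\w\s\d\d/

console.log(roomPattern.test('Lab 42'))
// true

// A dash is not a white space character
console.log(roomPattern.test('Lab-42'))
// false

// Only one digit
console.log(roomPattern.test('Lab 4'))
// false

// match() shows which part of a longer text matched
console.log('Meet in Lab 42 today'.match(roomPattern)[0])
// 'Lab 42'
```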

Assertions

Another set of special characters are assertions. These symbols allow you to describe patterns such as boundaries, i.e. where words and lines begin and where they end. Assertions also allow you to describe more advanced patterns such as lookahead and lookbehind.

/* Assertion - Meaning */
^ - Matches the beginning of the string (the expression that follows it must be at the start of the test string).
$ - Matches the end of the string (the expression that precedes it must be at the end of the test string).
\b - Matches word boundary. A match at the beginning or ending of a word.
\B - Matches a non-word boundary.
x(?=y) - Lookahead assertion. It matches "x" only if "x" is followed by "y".
x(?!y) - Negative lookahead assertion. It matches "x" only if "x" is not followed by "y".
(?<=y)x - Lookbehind assertion. It matches "x" only if "x" is preceded by "y".
(?<!y)x - Negative lookbehind assertion. It matches "x" only if "x" is not preceded by "y".

Examples:

// ^ - The beginning of the string
const myPattern = /^re/

console.log(myPattern.test('write'))
// false

console.log(myPattern.test('read'))
// true

console.log(myPattern.test('real'))
// true

console.log(myPattern.test('free'))
// false


// $ - The end of the string
const myPattern = /ne$/

console.log(myPattern.test('all is done'))
// true

console.log(myPattern.test('on the phone'))
// true

console.log(myPattern.test('in Rome'))
// false

console.log(myPattern.test('Buy toner'))
// false


// \b - Word boundary
const myPattern = /\bro/

console.log(myPattern.test('road'))
// true

console.log(myPattern.test('steep'))
// false

console.log(myPattern.test('umbro'))
// false

// Or
const myPattern = /\btea\b/

console.log(myPattern.test('tea'))
// true

console.log(myPattern.test('steap'))
// false

console.log(myPattern.test('tear'))
// false


// \B - Non-word boundary
const myPattern = /\Btea\B/

console.log(myPattern.test('tea'))
// false

console.log(myPattern.test('steap'))
// true

console.log(myPattern.test('tear'))
// false


// x(?=y) - Lookahead assertion
const myPattern = /doo(?=dle)/

console.log(myPattern.test('poodle'))
// false

console.log(myPattern.test('doodle'))
// true

console.log(myPattern.test('moodle'))
// false


// x(?!y) - Negative lookahead assertion
const myPattern = /gl(?!u)/

console.log(myPattern.test('glue'))
// false

console.log(myPattern.test('gleam'))
// true


// (?<=y)x - Lookbehind assertion
const myPattern = /(?<=re)a/

console.log(myPattern.test('realm'))
// true

console.log(myPattern.test('read'))
// true

console.log(myPattern.test('rest'))
// false


// (?<!y)x - Negative lookbehind assertion
const myPattern = /(?<!re)a/

console.log(myPattern.test('break'))
// false

console.log(myPattern.test('treat'))
// false

console.log(myPattern.test('take'))
// true
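
An important detail about assertions is that they only check a condition and are not part of the matched text. The sketch below makes this visible with match() instead of test(). The sentences are made-up sample data, and the examples use the + quantifier and the g flag, which are explained in later sections.

```javascript
const priceText = 'The book costs $25 and the pen costs $3.'

// Lookbehind: digits only when preceded by a dollar sign.
// The dollar sign itself is not included in the result.
console.log(priceText.match(/(?<=\$)\d+/g))
// [ '25', '3' ]

// Lookahead: 'dle' must follow, but only 'doo' is matched
console.log('doodle'.match(/doo(?=dle)/)[0])
// 'doo'

// Word boundaries: match 'cat' only as a whole word
console.log('A cat sat in the category.'.match(/\bcat\b/g))
// [ 'cat' ]
```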

Quantifiers

When you want to specify numbers of characters or expressions you want to match you can use quantifiers.

/* Quantifier - Meaning */
* - Matches the preceding expression 0 or more times.
+ - Matches the preceding expression 1 or more times.
? - Preceding expression is optional (i.e. matches 0 or 1 times).
x{n} - The "n" must be 0 or a positive integer. It matches exactly "n" occurrences of the preceding "x".
x{n,} - The "n" must be 0 or a positive integer. It matches at least "n" occurrences of the preceding "x".
x{n,m} - The "n" can be 0 or a positive integer. The "m" is an integer greater than or equal to "n". It matches at least "n" and at most "m" occurrences of the preceding "x". Write it without a space after the comma.

Examples:

// * - Matches preceding expression 0 or more times
const myPattern = /bo*k/

console.log(myPattern.test('b'))
// false

console.log(myPattern.test('bk'))
// true

console.log(myPattern.test('bok'))
// true


// + - Matches preceding expression 1 or more times
const myPattern = /\d+/

console.log(myPattern.test('word'))
// false

console.log(myPattern.test(13))
// true


// ? - Preceding expression is optional, matches 0 or 1 times
const myPattern = /foo?bar/

console.log(myPattern.test('foobar'))
// true

console.log(myPattern.test('fooobar'))
// false


// x{n} - Matches exactly "n" occurrences of the preceding "x"
const myPattern = /bo{2}m/

console.log(myPattern.test('bom'))
// false

console.log(myPattern.test('boom'))
// true

console.log(myPattern.test('booom'))
// false


// x{n,} - Matches at least "n" occurrences of the preceding "x"
const myPattern = /do{2,}r/

console.log(myPattern.test('dor'))
// false

console.log(myPattern.test('door'))
// true

console.log(myPattern.test('dooor'))
// true


// x{n,m} - Matches at least "n" and at most "m" occurrences of the preceding "x".
const myPattern = /zo{1,3}m/

console.log(myPattern.test('zm'))
// false

console.log(myPattern.test('zom'))
// true

console.log(myPattern.test('zoom'))
// true

console.log(myPattern.test('zooom'))
// true

console.log(myPattern.test('zoooom'))
// false
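
Quantifiers are often combined with character classes and assertions to check the shape of a whole string. The following is a sketch of such a check for a year-month-day text. Note the assumption: it only checks the number of digits, not whether the date is valid.

```javascript
// Four digits, a dash, two digits, a dash, two digits, and nothing else
const isoDatePattern = /^\d{4}-\d{2}-\d{2}$/

console.log(isoDatePattern.test('2021-03-15'))
// true

console.log(isoDatePattern.test('21-3-15'))
// false

console.log(isoDatePattern.test('2021-03-155'))
// false

// The shape is right, even though this is not a real date
console.log(isoDatePattern.test('2021-99-99'))
// true

// ? makes the 'u' optional, so both spellings match
const colourPattern = /^colou?r$/

console.log(colourPattern.test('color'))
// true

console.log(colourPattern.test('colour'))
// true
```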

Groups and ranges

Groups and ranges are useful when you want to specify groups of characters, or their ranges.

/* Group or range - Meaning */
[abc] - Matches any single character from the characters inside the brackets.
[^abc] - Matches any single character that is not inside the brackets.
[a-z] - Matches any single character in the range from "a" to "z".
[^a-z] - Matches any single character that is not in the range from "a" to "z".
(x) - Matches x and remembers it so we can use it later.
(?<name>x) - Creates a capturing group that can be referenced via the specified name.
(?:x) - Matches "x" but does not remember the match, so the match can't be extracted from the resulting array of elements.

Examples:

// [abc] - Matches any single character from characters inside the brackets
const myPattern = /[aei]/

console.log(myPattern.test('aei'))
// true (there is a, e, i)

console.log(myPattern.test('form'))
// false (there is not a, e or i)


// [^abc] - Matches anything that is not inside the brackets.
const myPattern = /[^aei]/

console.log(myPattern.test('aei'))
// false (there is no other character than a, e and i)

console.log(myPattern.test('form'))
// true (there are other characters than a, e and i)


// [a-z] - Matches any single character in the range from "a" to "z".
const myPattern = /[b-g]/

console.log(myPattern.test('bcd'))
// true (there are characters in range from 'b' to 'g')

console.log(myPattern.test('jklm'))
// false (there are no characters in range from 'b' to 'g')


// [^a-z] - Matches any single character that is not in the range from "a" to "z".
const myPattern = /[^b-g]/

console.log(myPattern.test('bcd'))
// false (there are no other characters than those in range from 'b' to 'g')

console.log(myPattern.test('jklm'))
// true (there are other characters than those in range from 'b' to 'g')


// (x) - Matches x and remembers it so we can use it later.
const myPattern = /(na)da\1/

console.log(myPattern.test('nadana'))
// true - the \1 remembers and uses the 'na' match from first expression within parentheses.

console.log(myPattern.test('nada'))
// false


// (?<name>x) - Creates a capturing group that can be referenced via the specified name.
const myPattern = /(?<foo>is)/

console.log(myPattern.test('Work is created.'))
// true

console.log(myPattern.test('Just a text'))
// false


// (?:x) - Matches "x" but does not remember the match
const myPattern = /(?:war)/

console.log(myPattern.test('warsawwar'))
// true

console.log(myPattern.test('arsaw'))
// false
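
The test() method only tells you whether there is a match, so it hides what groups actually capture. The sketch below uses match() to read the captured values. The sentence and the group names hours and minutes are made up for this example.

```javascript
// Two named capturing groups
const timePattern = /(?<hours>\d{2}):(?<minutes>\d{2})/
const timeMatch = 'The train leaves at 09:45.'.match(timePattern)

// Named groups are available on the 'groups' property
console.log(timeMatch.groups.hours)
// '09'

console.log(timeMatch.groups.minutes)
// '45'

// They are also available by position
console.log(timeMatch[1])
// '09'

// A non-capturing group takes part in matching, but is not stored
const mixedMatch = 'warsaw'.match(/(?:war)(saw)/)

console.log(mixedMatch.length)
// 2 (the full match and one captured group)

console.log(mixedMatch[1])
// 'saw'
```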

Alternations

Alternation allows you to match at least one of multiple expressions.

/* Alternation - Meaning */
| - Matches the expression before or after the |. Acts like a boolean OR (||).

Examples:

// | - Matches the expression before or after the |
const myPattern = /(black|white) swan/

console.log(myPattern.test('black swan'))
// true

console.log(myPattern.test('white swan'))
// true

console.log(myPattern.test('gray swan'))
// false
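
The parentheses in the example above matter. Without them, the | splits the whole pattern into two alternatives. A short sketch of the difference, with illustrative variable names:

```javascript
// Means: 'black' OR 'white swan'
const ungroupedPattern = /black|white swan/

// Means: 'black swan' OR 'white swan'
const groupedPattern = /(black|white) swan/

// 'black' alone is enough for the ungrouped pattern
console.log(ungroupedPattern.test('black cat'))
// true

console.log(groupedPattern.test('black cat'))
// false

console.log(groupedPattern.test('white swan'))
// true
```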

Flags

Flags are the last type of symbols you can use in regular expressions. Flags offer a simple way to make your patterns more powerful. For example, flags allow you to ignore the case of letters so the pattern can match both upper and lower case, find multiple matches, find matches in multiline text, etc.

/* Flag - Meaning */
g - Search globally, i.e. don't stop after the first match.
i - Ignore case, i.e. match both upper and lower case.
s - Dot-all mode, i.e. the . can also match newline characters.
m - Multi-line mode, i.e. "^" and "$" match the beginning and end of each line, not just of the whole string.

Examples:

// g flag - Search globally
const myPattern = /xyz/g

console.log(myPattern.test('One xyz and one more xyz'))
// true


// i flag - Ignore case
const myPattern = /xyz/i

console.log(myPattern.test('XyZ'))
// true - the case of characters doesn't matter in case-insensitive search.


// s flag - When you use it with ., . can match newline characters
const myPattern = /foo.bar/s

console.log(myPattern.test('foo\nbar'))
// true

console.log(myPattern.test('foo bar'))
// true

console.log(myPattern.test('foobar'))
// false
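
The examples above don't cover the m flag or combining flags, so here is a short sketch of both. It ends with a side effect of the g flag that is easy to miss: with g, test() remembers its position between calls in the lastIndex property. The strings are made-up sample data.

```javascript
const multilineText = 'first line\nsecond line'

// Without 'm', ^ only matches at the start of the whole string
console.log(/^second/.test(multilineText))
// false

// With 'm', ^ also matches at the start of each line
console.log(/^second/m.test(multilineText))
// true

// Flags can be combined
console.log('Code, code, CODE'.match(/code/gi))
// [ 'Code', 'code', 'CODE' ]

// With 'g', test() continues from where the previous match ended
const statefulPattern = /xyz/g

console.log(statefulPattern.test('xyz'))
// true

console.log(statefulPattern.lastIndex)
// 3

// The second call starts searching at index 3 and finds nothing
console.log(statefulPattern.test('xyz'))
// false
```

If you reuse one pattern with the g flag for several test() calls, this behavior can cause surprising results. For a plain yes or no check, a pattern without the g flag is usually the safer choice.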

Conclusion: Introduction to Regular Expressions in JavaScript

Regular expressions can be difficult to understand and learn. However, they can be very useful tools for solving difficult and complex problems with little code. This makes any struggles worth it. I hope this tutorial helped you understand how regular expressions work and how you can use them.
