JavaScript

Regex – Introduction to Regular Expression Pt2

In the first part you explored the basics of Regex – what it is and how it works – along with a few examples, so it focused more on theory. Today, you are going to switch to the practical side and practice on various examples. In these snippets you will also work with some methods included in the String object. If you are not familiar with strings, you might first check one of the previous tutorials covering this topic.

In Regex you can use many sequences, characters and other entities to form an expression for later use. We covered these entities in the previous part. However, I will also include the list here so you don’t have to switch between pages. The list is below and, with that said, you can continue exploring the world of Regex.

Special sequences:

- . – dot: any single character except line terminators (such as \n)
- \d – any digit: [0-9]
- \D – any character that is not a digit: [^0-9]
- \w – any digit, letter (lowercase or capital) or underscore: [0-9a-zA-Z_]
- \W – any character that is not a digit, a letter or an underscore: [^0-9a-zA-Z_]
- \s – any whitespace: [ \t\r\n\v\f]
- \S – any non-whitespace: [^ \t\r\n\v\f]
- note: “^” at the start of a set in square brackets negates the whole set, as in the examples in this list
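
To make the list a bit more concrete, here is a minimal sketch that checks a few of these sequences with test(), which you know from the first part. The sample strings are made up for illustration only.

```javascript
// Hypothetical sample string, used only for illustration
var sample = "Room 42_b";

// \d\d - two digits in a row ("42")
var hasTwoDigits = /\d\d/.test(sample);

// \s\d - whitespace directly followed by a digit (" 4")
var hasSpaceThenDigit = /\s\d/.test(sample);

// \W - anything that is not a letter, digit or underscore
var hasNonWordChar = /\W/.test("abc_123");

// [^0-9] - the negated set behaves like \D
var hasNonDigit = /[^0-9]/.test("2024");

// . - any character except a line terminator
var dotMatchesNewline = /./.test("\n");

console.log(hasTwoDigits, hasSpaceThenDigit);                // true true
console.log(hasNonWordChar, hasNonDigit, dotMatchesNewline); // false false false
```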

Special characters:

- \n – new line (0x0A)
- \f – page break (0x0C)
- \r – “carriage return” (0x0D)
- \t – horizontal tab character (0x09)
- \v – vertical tab character (0x0B)

Repetitions:

- {n} – exactly n occurrences of the preceding expression
- {n,} – n or more occurrences of the preceding expression
- {n,m} – from n to m occurrences of the preceding expression
- ? – preceding item is optional (may occur 0 or 1 time)
- + – preceding element can occur one or more times
- * – preceding element can occur zero or more times
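
The repetition rules above can be sketched with a handful of test() calls. The patterns and words are assumptions picked for illustration. Each pattern is anchored with “^” and “$” so the whole string has to match, not just a part of it.

```javascript
// {n} - exactly three digits
var exactlyThree = /^\d{3}$/.test("123");      // true
var fourIsTooMany = /^\d{3}$/.test("1234");    // false

// {n,} - two or more digits
var twoOrMore = /^\d{2,}$/.test("789");        // true
var oneIsTooFew = /^\d{2,}$/.test("7");        // false

// {n,m} - from two to four digits
var twoToFour = /^\d{2,4}$/.test("12345");     // false

// ? - the "u" is optional
var american = /^colou?r$/.test("color");      // true
var british = /^colou?r$/.test("colour");      // true

// + needs at least one "o", * accepts none at all
var plusNeedsOne = /^go+gle$/.test("ggle");    // false
var starAllowsZero = /^go*gle$/.test("ggle");  // true

console.log(exactlyThree, fourIsTooMany, twoOrMore, oneIsTooFew, twoToFour);
console.log(american, british, plusNeedsOne, starAllowsZero);
```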

Flags:

- g – search globally
- i – ignore case (case-insensitive matching)
- m – multi-line input: “^” and “$” match at the start and end of every line, not only at the start and end of the whole string
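
Here is a small sketch of how the three flags change the result. The three-line string is a made-up example, and the snippet uses match(), which is covered later in this article.

```javascript
// Hypothetical three-line input, used only for illustration
var text = "Cat\ncat\nCAT";

// g alone - every match, but still case sensitive
var globalOnly = text.match(/cat/g);

// g + i - every match, ignoring case
var globalIgnoreCase = text.match(/cat/gi);

// without m, ^ and $ only match at the start and end of the whole string
var wholeString = /^cat$/i.test(text);

// with m, ^ and $ also match at the start and end of each line
var perLine = text.match(/^cat$/gim);

console.log(globalOnly);       // ["cat"]
console.log(globalIgnoreCase); // ["Cat", "cat", "CAT"]
console.log(wholeString);      // false
console.log(perLine);          // ["Cat", "cat", "CAT"]
```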

– note: RegExr is a great site to practice working with Regular expressions. You can also try JSBin or Codepen.

Available methods for Regex

You are already familiar with the methods included in the Regex object (exec(), test(), toString()). However, this is not the end of the road. As you know, Regex works with strings. This gives you the ability to use methods from the String object along with Regex to achieve what you want. These methods are match(), search(), replace() and split(). Let’s have a look at each of them separately, understand how they work and then practice on a couple of examples.
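
As a quick refresher before moving to the String methods, this sketch runs the three Regex object methods on one pattern. The pattern and the sample sentences are assumptions chosen for illustration.

```javascript
// Hypothetical pattern: one or more digits
var pattern = /\d+/;

// exec() returns an array with the match and its position, or null
var found = pattern.exec("Order 66 shipped");
var missing = pattern.exec("No digits here");

// test() only answers yes or no
var hasDigits = pattern.test("Order 66 shipped");

// toString() gives the pattern back as text
var source = pattern.toString();

console.log(found[0]);    // 66
console.log(found.index); // 6
console.log(missing);     // null
console.log(hasDigits);   // true
console.log(source);      // /\d+/
```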

match()

The first method is match(). With this method you can use an expression to find the parts of a string you need. If you use the expression without the g flag (search globally), it will return only the first occurrence, or null if there is no match at all. With the g flag it will return an array containing all matches from the whole string. Let’s take some text and try to match a simple word.

JavaScript:

// dummy text
var string = "Tousled messenger bag 3 wolf moon aesthetic cold-pressed umami, pour-over distillery Kickstarter Marfa shabby chic salvia Portland fixie roof party. Cupidatat Shoreditch pork belly Kickstarter. Tumblr skateboard mlkshk, sapiente umami direct trade fashion axe PBR roof party. Bushwick veniam aute, sartorial reprehenderit laboris ut qui synth kale chips. Helvetica Intelligentsia shabby chic placeat. Art party farm-to-table veniam next level magna Pitchfork. Cardigan disrupt Thundercats, before they sold out Blue Bottle exercitation gastropub pariatur bicycle rights McSweeney's Neutra fashion axe gluten-free locavore excepteur.";

// match the word roof in global search
var result = string.match(/roof/g);

console.log(result);
// result - ["roof", "roof"]
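
For comparison, here is a short sketch of what match() gives you without the g flag, and what happens when nothing matches. The phrase is a shortened, made-up sample, and the “|| []” fallback is just one possible way to guard against null.

```javascript
// Hypothetical shortened sample text
var phrase = "PBR roof party, fixie roof party";

// without g - only the first match, plus its position in the string
var first = phrase.match(/roof/);

// with g - all matches, but no position information
var all = phrase.match(/roof/g);

// no match at all - null, not an empty array
var none = phrase.match(/attic/g);

// falling back to an empty array makes counting safe
var safeCount = (phrase.match(/attic/g) || []).length;

console.log(first[0], first.index); // roof 4
console.log(all.length);            // 2
console.log(none);                  // null
console.log(safeCount);             // 0
```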

OK. That was too easy. Let’s try to match six-letter words. This can be done with the “\b” entity, which marks a word boundary: the position between a word character and anything else, such as the start or end of the string, whitespace or punctuation. Because we want the word to have exactly six letters, we have to use “\b” at the beginning of the expression to mark the start of the word and also at the end so no longer words will be returned. Next you will need “\w” to match any letter, digit or underscore, followed by “{6}”. Together, “\w{6}” means exactly six word characters in a row.

JavaScript:

// Match method for six letter words
var result = string.match(/\b\w{6}\b/g);

// result - ["shabby", "salvia", "Tumblr", "mlkshk", "direct", "veniam", "shabby", "veniam", "before", "Bottle", "rights", "Neutra", "gluten"]
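
To see why both “\b” markers matter, this sketch runs the same pattern with and without them on a short, made-up sentence.

```javascript
// Hypothetical short sentence, used only for illustration
var sentence = "Tumblr skateboard direct";

// with boundaries - only whole six-character words
var withBoundaries = sentence.match(/\b\w{6}\b/g);

// without boundaries - also the first six characters of longer words
var withoutBoundaries = sentence.match(/\w{6}/g);

console.log(withBoundaries);    // ["Tumblr", "direct"]
console.log(withoutBoundaries); // ["Tumblr", "skateb", "direct"]
```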

You can use a similar expression to match a group of numbers, such as a mobile phone number. So, let’s say you want to match only numbers composed of three groups of digits with three digits in each group. Again, you will use “\b” to mark the start and end of the number. Instead of “\w” you will use “\d” for a digit followed by “{3}” (three digits). This token (\d{3}) will be repeated three times (three three-digit groups). Between the first two and the last two tokens will be square brackets containing a space, a dot and a hyphen, followed by “?”. This says the groups can be separated by a space, a hyphen or a dot, or by nothing at all.

However, there might be a case of a phone number written in one chunk. Don’t worry about that, you are covered. Just use “|” (it works like an OR operator) followed by a token similar to the one you used for the three-digit groups, only now with “{9}”. Don’t forget the “g” flag if you want more than the first occurrence to be returned.

JavaScript:

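// strings to search through; only some of them are nine-digit numbers
var example = "123-956-225, 122563, 246 324 889, 656 2336, 664-484-2332, 123456789";

// Match method; the g flag returns every match, not just the first one.
// The hyphen is placed last in [ .-] so it is read literally, not as a range.
var number = example.match(/\b\d{3}\b[ .-]?\d{3}[ .-]?\d{3}\b|\b\d{9}\b/g);

// result - ["123-956-225", "246 324 889", "123456789"]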

– note: Don’t use this snippet for real phone number validation; it is too simple for that.
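
One detail of square brackets is worth a quick sketch: a hyphen placed between two characters is read as a range. So “[ -.]” stands for every character from the space (0x20) to the dot (0x2E), which also includes the comma and the plus sign. Putting the hyphen last, or escaping it, keeps it literal. The patterns and sample strings below are made up for illustration.

```javascript
// [ -.] is read as a range: every character from space to dot
var rangeClass = /^\d{3}[ -.]\d{3}$/;

// [ .-] lists three literal characters: space, dot and hyphen
var literalClass = /^\d{3}[ .-]\d{3}$/;

var rangeAcceptsComma = rangeClass.test("123,456");     // true
var rangeAcceptsPlus = rangeClass.test("123+456");      // true
var literalAcceptsComma = literalClass.test("123,456"); // false
var literalAcceptsHyphen = literalClass.test("123-456"); // true

console.log(rangeAcceptsComma, rangeAcceptsPlus);        // true true
console.log(literalAcceptsComma, literalAcceptsHyphen);  // false true
```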

search()

The next method is search(). This one matches the string against the Regex and returns the index where the first match begins. If there is no match, it returns -1. It only ever reports the first occurrence, so you don’t have to use the “g” flag (it is ignored anyway). Let’s use the previous example and look only for the nine-digit number.

JavaScript:

// String example
var example = "123-956-225, 122563, 246 324 889, 656 2336, 664-484-2332, 123456789";

// Search for nine-digit string
console.log(example.search(/\b\d{9}\b/));

// result – 58

Let’s use the first example with the dummy text and search for “roof” using the search() method. Don’t forget that the result will be only the index of the first occurrence, no matter how many matches (2) are in the string.

JavaScript:

// string is the dummy text from the match() example
var result = string.search(/roof/);
console.log(result);

// result – 137
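
The -1 case and the ignored “g” flag can be sketched on a short, made-up phrase like this:

```javascript
// Hypothetical short phrase, used only for illustration
var phrase = "Art party farm-to-table";

// index where the first match starts
var foundAt = phrase.search(/party/);

// no match - search() is case sensitive unless you add the i flag
var notFound = phrase.search(/PARTY/);
var ignoreCase = phrase.search(/PARTY/i);

// the g flag makes no difference for search()
var withGlobal = phrase.search(/party/g);

console.log(foundAt);    // 4
console.log(notFound);   // -1
console.log(ignoreCase); // 4
console.log(withGlobal); // 4
```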

replace()

Another method of the String object you can use with Regex is replace(). This method takes two parameters. The first is the pattern you are looking for and the second is its replacement. How about replacing every five-letter word with the number five?

JavaScript:

// string is the dummy text from the match() example
var result = string.replace(/\b\w{5}\b/g, "5");
console.log(result);

// result – try it yourself ...
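
If you want to check your result against something shorter first, here is a sketch on a small, made-up phrase. It also shows that without the “g” flag only the first match is replaced, and that replace() returns a new string instead of changing the original.

```javascript
// Hypothetical short phrase, used only for illustration
var phrase = "Tumblr fixie roof party fixie";

// with g - every five-character word is replaced
var allReplaced = phrase.replace(/\b\w{5}\b/g, "5");

// without g - only the first one
var firstReplaced = phrase.replace(/\b\w{5}\b/, "5");

console.log(allReplaced);   // Tumblr 5 roof 5 5
console.log(firstReplaced); // Tumblr 5 roof party fixie
console.log(phrase);        // Tumblr fixie roof party fixie
```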

split()

The last method is split(). This method takes the string, cuts it into individual chunks wherever the pattern matches and returns an array. The easiest example is splitting some text into individual words. Just like with the search() method, you don’t have to include the “g” flag. It will go through the whole string automatically.

JavaScript:

// example
var example = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam.";

// Pattern – with whitespace
var result1 = example.split(/ /);

// Pattern – with token for whitespace
var result2 = example.split(/\s/);
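
The two patterns give the same result on the text above, but they differ once tabs or repeated spaces show up. This sketch uses a made-up line with a tab and a double space to show the difference, and adds “\s+” as one possible way to treat a run of whitespace as a single separator.

```javascript
// Hypothetical line with a tab and two spaces in a row
var line = "Lorem ipsum\tdolor  sit";

// a literal space does not split on the tab
var bySpace = line.split(/ /);

// \s splits on the tab too, but the double space leaves an empty string
var byWhitespace = line.split(/\s/);

// \s+ treats any run of whitespace as one separator
var byWhitespaceRun = line.split(/\s+/);

console.log(bySpace);         // ["Lorem", "ipsum\tdolor", "", "sit"]
console.log(byWhitespace);    // ["Lorem", "ipsum", "dolor", "", "sit"]
console.log(byWhitespaceRun); // ["Lorem", "ipsum", "dolor", "sit"]
```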

You can use letters or digits or words to cut the string as well, but don’t forget that all the characters you use in the pattern will be cut out (not included in result).

JavaScript:

// Variable with example text
var example = "This might not be a good idea.";

// Splitting
var result = example.split(/o/);

console.log(result);
// result - ["This might n", "t be a g", "", "d idea."]

And that’s all for today. I hope this short and quick intro to regular expressions was useful for you and that you enjoyed it. If you liked it, please share this post so other people can learn and benefit from Regex as well.
