Alternation, groups and backreferences are commonly used in regular expressions. Groups and backreferences are especially useful for replacing terms. In this part, we are going to take a look at how to use these features.

Alternation

As the word itself suggests, alternation gives us a choice of alternative patterns to match. Let's say we want to find occurrences of some specific words inside a sentence. The example below shows how to use alternation:

Regex:

dog|cat

Matches:

The dog is a mammal.
The fish is a mammal.
The cat is a mammal.

In the example above, we wanted to find occurrences of either dog or cat, so we used the pipe character (|) to tell the regex engine to alternate between those two terms. The second sentence contains neither word, so nothing in it is matched.

We can think of alternation as the logical OR operator, applied to patterns: the expression matches if any one of the alternatives matches.
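As a rough illustration, the alternation example can be tried with Python's standard re module. The choice of Python and the variable names are assumptions made for this sketch; the pattern and sentences come from the example above.

```python
import re

# Sample sentences from the example above.
sentences = [
    "The dog is a mammal.",
    "The fish is a mammal.",
    "The cat is a mammal.",
]

# The pipe separates the two alternatives: "dog" OR "cat".
pattern = re.compile(r"dog|cat")

found = []
for sentence in sentences:
    # search() returns a match object for the first occurrence, or None.
    m = pattern.search(sentence)
    found.append(m.group(0) if m else None)

print(found)  # ['dog', None, 'cat']
```

The None in the middle corresponds to the sentence about the fish, which contains neither alternative.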

A good practice is to put the alternation inside a group, so that the pipe applies only to the alternatives in that group and not to the rest of the expression. See the example below:

Regex:

(t|T)h(e|eir)

Matches:

the
The
their
Their

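In this example, we created two alternation groups. The first one matches either a lowercase or an uppercase t. The second one matches either e or the sequence eir. Together they find a match in every word above. Note that most engines try the alternatives from left to right, so when searching inside their, the second group succeeds with e and the reported match is only the; put the longer alternative first, as in (eir|e), or anchor the pattern if the whole word must be matched. We are going to see how to use groups in the next section.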
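The effect of grouping an alternation can be checked with a short Python sketch. It assumes Python's re module, whose engine tries alternatives from left to right; the ungrouped pattern the|eir and the test string "their heir" are made up here purely for contrast.

```python
import re

words = ["the", "The", "their", "Their"]

grouped = re.compile(r"(t|T)h(e|eir)")

# fullmatch() requires the whole word to match, so the engine backtracks to
# the "eir" alternative when "e" alone does not reach the end of the word.
full = [bool(grouped.fullmatch(w)) for w in words]
print(full)  # [True, True, True, True]

# search() stops at the first alternative that succeeds, so "their" yields "the".
partial = grouped.search("their").group(0)
print(partial)  # the

# Without parentheses the pipe splits the whole pattern: "the" OR "eir".
ungrouped = re.compile(r"the|eir")
pieces = ungrouped.findall("their heir")
print(pieces)  # ['the', 'eir']
```

The last result shows why grouping matters: without the parentheses, eir is a complete alternative on its own and matches inside heir, even though no th precedes it.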

Groups

Groups are widely used in regular expressions because they are a practical way to gather terms and organize them. They also make it easier to read and understand what each part of a regular expression does.

To create a group, we open a pair of parentheses (()) and put our terms inside it. See the example below:

Regex:

([A-D]): (\d\d\d)

Matches:

Team A: 010 points.
Team B: 018 points.
Team C: 031 points.
Team D: 022 points.

In this example, we created two groups. The first contains a character class that matches one letter from A to D. The second matches three digits. Between them there are two characters outside any group: a colon and a space.

The regex library of a programming language can capture these groups so they can be used in code for many purposes. This is a very useful feature, since you can define exactly which parts of the pattern you want your regex to capture.

In the example above, we could label the first group "Team Name" and the second one "Score". Most regex libraries return the overall match together with each captured group, typically as an array or a match object, so we can easily work with them separately inside our application.
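A minimal sketch of how the captured groups might be read in code, assuming Python's re module. The dictionary name scores and the conversion of the score to an integer are illustrative choices, not part of the original example.

```python
import re

lines = [
    "Team A: 010 points.",
    "Team B: 018 points.",
    "Team C: 031 points.",
    "Team D: 022 points.",
]

pattern = re.compile(r"([A-D]): (\d\d\d)")

# A match object exposes each group by its index; group(0) is the whole match.
first = pattern.search(lines[0])
print(first.group(0))  # A: 010
print(first.groups())  # ('A', '010')

# With more than one group, findall() returns one tuple of groups per match.
scores = {}
for team, score in pattern.findall("\n".join(lines)):
    scores[team] = int(score)  # "010" becomes 10

print(scores)  # {'A': 10, 'B': 18, 'C': 31, 'D': 22}
```

Here group 1 plays the role of "Team Name" and group 2 the role of "Score".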

Backreference

There is another feature that lets us reuse a captured group inside the same regular expression: the backreference. A backreference is written as a backslash (\) followed by the index of the group, and it matches the same text that the group captured, so we don't have to write that part again. Note that it repeats the captured text, not the group's pattern: (\d)\1 matches 11 but not 12. The example below shows a backreference in use:

Regex:

"(be) or not to \1"

Matches:

"be or not to be"

In this example, we captured a group containing the word be, and later reused it through the backreference \1. Groups are numbered automatically starting at 1, so \1 refers to the first group. If we created more groups, they would be numbered in the order in which their opening parentheses appear in the expression.
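The backreference example can be sketched in Python as follows. The re module is assumed, and the doubled-word pattern and the sample sentences are hypothetical additions meant to show the same idea in matching and in replacement.

```python
import re

# The pattern from the example above; \1 must repeat the text captured by (be).
pattern = re.compile(r'"(be) or not to \1"')
matched = pattern.search('He said "be or not to be" aloud.').group(0)
print(matched)  # "be or not to be"

# A backreference repeats the captured text, not the group's pattern.
same_digit = re.compile(r"(\d)\1")
print(bool(same_digit.search("11")))  # True
print(bool(same_digit.search("12")))  # False

# Hypothetical use: find a word that is accidentally typed twice in a row.
doubled = re.compile(r"\b(\w+) \1\b")
hit = doubled.search("this is is a test").group(1)
print(hit)  # is

# In a replacement string, \1 inserts the captured text, removing the repeat.
fixed = doubled.sub(r"\1", "this is is a test")
print(fixed)  # this is a test
```

The last call ties back to replacing terms: the group captures the word once, and the replacement writes it back a single time.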