July 3rd 2020

Google Analytics RegEx: What is it?

Regular expressions, known as regex, are widely used to find specific patterns in a string of characters or symbols. In Google Analytics, regex can be used in many ways, for example to find all pages within a subfolder or all pages whose URL contains a specific string of characters.

RegEx is a powerful and flexible tool, and any marketer should have at least a basic knowledge of how and when to use it, because it will most likely make your life easier and save you time.

This guide explains how to use regular expressions in Google Analytics through real-life examples. So, let’s start with the most common regex characters.

Most Common Regular Expressions

a.) Pipe (|) – is one of the simplest expressions. It means “or” and is used when we are looking for two or more different patterns. For example, say we want to show all the pages containing “/insights/” or “/seo/” in our GA reports. We would type the following into the search field: “/insights/|/seo/”

Pipeline Regex Example

b.) Dot (.) – is a wildcard that matches any character, but only a single one. For example, to match all of these words: book, look, took, cook, our RegEx instruction would be “.ook”.
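The pipe and dot examples above can be tried outside Google Analytics. The sketch below uses Python's `re` module as a stand-in for the GA search field, assuming GA looks for the pattern anywhere in the value (emulated here with `re.search`). The sample page paths and the extra word “ook” are made up for illustration.

```python
import re

# Hypothetical page paths, as they might appear in a GA pages report.
pages = ["/insights/regex-guide/", "/seo/audit/", "/contact/"]

# Pipe: keep pages that contain "/insights/" OR "/seo/".
pipe_pattern = re.compile(r"/insights/|/seo/")
matched_pages = [p for p in pages if pipe_pattern.search(p)]

# Dot: exactly one character of any kind, followed by "ook".
words = ["book", "look", "took", "cook", "ook"]
dot_pattern = re.compile(r".ook")
matched_words = [w for w in words if dot_pattern.search(w)]

print(matched_pages)  # ['/insights/regex-guide/', '/seo/audit/']
print(matched_words)  # ['book', 'look', 'took', 'cook']
```

Note that “ook” on its own is not matched, because the dot needs one character to consume.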

c.) Asterisk (*) – matches the preceding character 0 or more times. A RegEx example in GA can be: “10*”; this matches: 1, 10, 100, 1000. The asterisk is handy for matching an IP address whose last section has a varying number of digits, for instance in one of our Exclusion Filters. For example, we want to exclude all the following IP addresses: 216.233.32.2, 216.233.32.34, 216.233.32.156. Using the following RegEx instruction, we can create a custom filter in our GA View: 216\.233\.32\.\d* [We use the backslash to escape the dot and we use \d to match any digit, more on this later]. Note that, because the asterisk also allows zero repetitions, \d* would accept an empty last section as well; use \d+ if at least one digit is required.
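A minimal sketch of the asterisk examples, again using Python's `re` module as a stand-in for GA. The extra test values (“11”, “20”, and the IP in a different range) are assumptions added to show what does not match.

```python
import re

# "10*" means a "1" followed by zero or more "0" characters.
# fullmatch requires the whole string to fit the pattern.
zeros = re.compile(r"10*")
numbers = ["1", "10", "100", "1000", "11", "20"]
matched_numbers = [n for n in numbers if zeros.fullmatch(n)]

# IP filter from the text: escaped dots, then zero or more digits.
ip_filter = re.compile(r"216\.233\.32\.\d*")
# Hypothetical visitor IPs; the last one is in a different range.
ips = ["216.233.32.2", "216.233.32.34", "216.233.40.7"]
excluded = [ip for ip in ips if ip_filter.search(ip)]

print(matched_numbers)  # ['1', '10', '100', '1000']
print(excluded)         # ['216.233.32.2', '216.233.32.34']
```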

d.) Dot-asterisk (.*) – is one of the most used combinations in Google Analytics and is really handy, especially when you want to create that advanced filter to show the full URL in your GA account. Since the dot matches any character and the asterisk repeats it zero or more times, (.*) matches any sequence of characters, so basically it matches everything.

Let’s say that we need to group some categories from our e-commerce store and analyze their performance. We have the following categories: “/product/men/shirts/”, “/product/women/shirts/”, “/product/kids/shirts/”. We can use the RegEx instruction: “/product/.*/shirts/” to match all 3 categories.
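The category grouping can be sketched in Python as follows. The last two paths are hypothetical additions: one shows a category the pattern skips, the other shows that “.*” is broad enough to swallow an extra folder level, which may or may not be what you want.

```python
import re

category_pattern = re.compile(r"/product/.*/shirts/")

paths = [
    "/product/men/shirts/",
    "/product/women/shirts/",
    "/product/kids/shirts/",
    "/product/men/shoes/",        # hypothetical: different category
    "/product/men/sale/shirts/",  # hypothetical: extra folder level
]
matched = [p for p in paths if category_pattern.search(p)]

for path in matched:
    print(path)
```

Here the three shirt categories match, “/product/men/shoes/” does not, and “/product/men/sale/shirts/” matches too because “.*” also covers “men/sale”.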

e.) Backslash (\) – is definitely one of the most used regex characters and is used to turn special characters into normal ones. What does this mean? In the previous IP address exclusion example, we put a backslash before each dot between the groups of digits. This way, we define the dot as a normal character instead of a RegEx wildcard.
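To see why the backslash matters, the sketch below compares an escaped and an unescaped dot in Python. The sample strings are invented for illustration; `re.escape` is a Python helper and not something GA offers, but it shows what the escaped form of a literal string looks like.

```python
import re

unescaped = re.compile(r"216.233")  # dot acts as a wildcard
escaped = re.compile(r"216\.233")   # dot is a literal "."

samples = ["216.233", "216x233", "2160233"]
unescaped_hits = [s for s in samples if unescaped.search(s)]
escaped_hits = [s for s in samples if escaped.search(s)]

# re.escape adds the backslashes for us.
auto_escaped = re.escape("216.233.32.")

print(unescaped_hits)  # ['216.233', '216x233', '2160233']
print(escaped_hits)    # ['216.233']
print(auto_escaped)    # 216\.233\.32\.
```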

f.) Caret (^) – means that something begins with the pattern. For example, we want to match the queries “lunch” and “lunches”. For this, we use the “^lunch” instruction in our GA table view. Queries such as “light lunch” or “light lunches” will not be matched by our RegEx, because they do not start with “lunch”.

g.) Dollar sign ($) – is the opposite of the caret; it means that something ends with the pattern. Using the previous lunch example, “lunch$” will match “lunch” and “light lunch”, but will not match “light lunches” or “light lunch price”.
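The caret and dollar examples can be checked with a short Python sketch, run over the same queries mentioned above.

```python
import re

queries = ["lunch", "lunches", "light lunch", "light lunches", "light lunch price"]

# Caret: the query must begin with "lunch".
starts_with = [q for q in queries if re.search(r"^lunch", q)]

# Dollar: the query must end with "lunch".
ends_with = [q for q in queries if re.search(r"lunch$", q)]

print(starts_with)  # ['lunch', 'lunches']
print(ends_with)    # ['lunch', 'light lunch']
```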

h.) Question mark (?) – means that the preceding character is optional and is often used for catching misspellings. For example: “regg?ex” will match “reggex” and “regex”.

i.) Parentheses () – group parts of an expression together and work in the same way as in mathematics. Using our previous example: “/product/men/shirts/”, “/product/women/shirts/”, “/product/kids/shirts/”, we can get an exact match for all 3 categories, and nothing else, by using the following RegEx: “^/product/(men|women|kids)/shirts/$”.
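A small Python sketch of the grouped pattern. The last two paths are hypothetical and are there to show that, with the caret and dollar sign in place, nothing outside the three listed categories gets through. Reading the group with `.group(1)` is a Python feature shown only to make the grouping visible.

```python
import re

exact = re.compile(r"^/product/(men|women|kids)/shirts/$")

paths = [
    "/product/men/shirts/",
    "/product/women/shirts/",
    "/product/kids/shirts/",
    "/product/men/sale/shirts/",   # hypothetical: extra folder level
    "/product/kids/shirts/blue/",  # hypothetical: deeper page
]
matched = [p for p in paths if exact.search(p)]

# The parentheses also capture which alternative was found.
audiences = [exact.search(p).group(1) for p in matched]

print(matched)
print(audiences)  # ['men', 'women', 'kids']
```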

j.) Square brackets ([]) – match any single character from the list inside them, which is really helpful when you want to allow several alternatives for one position. For example, “t[aeo]p” will match: tap, tep and top.

k.) Dashes (-) – are normally used inside square brackets to define a range, and you will most likely use them in more advanced filters. Some examples are: [a-z], [A-Z], [0-9] and [a-zA-Z0-9], which match any lower-case letter, any upper-case letter, any digit and any letter or digit, respectively.

l.) Plus sign (+) – matches the preceding character one or more times. For example: “item1+” will match “item1”, “item11”, “item111”.
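The question mark, square bracket, dash and plus sign examples can be tried together. In this sketch the helper `full_matches` is a hypothetical name, and it uses `fullmatch` so that each candidate must fit the pattern from start to end; the non-matching candidates are invented for contrast.

```python
import re

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [c for c in candidates if compiled.fullmatch(c)]

# Question mark: the second "g" is optional.
optional = full_matches(r"regg?ex", ["regex", "reggex", "regggex"])

# Square brackets: one character out of a, e, o.
char_list = full_matches(r"t[aeo]p", ["tap", "tep", "top", "tip"])

# Dashes inside brackets: ranges of letters and digits.
char_range = full_matches(r"[a-zA-Z0-9]+", ["Item42", "item-42", "ITEM"])

# Plus sign: at least one "1" after "item".
one_or_more = full_matches(r"item1+", ["item", "item1", "item11", "item111"])

print(optional)     # ['regex', 'reggex']
print(char_list)    # ['tap', 'tep', 'top']
print(char_range)   # ['Item42', 'ITEM']
print(one_or_more)  # ['item1', 'item11', 'item111']
```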

m.) Curly brackets ({}) – specify how many times the preceding character or group must repeat; the easiest way to explain this one is through examples. Let’s say that we have a series of IP addresses we want to exclude from our Google Analytics view. The IPs are from 77.120.121.0 to 77.120.121.99. Our RegEx would look like this: “^77\.120\.121\.[0-9]{1,2}$”. Here {1,2} means the last section must have one or two digits.

A second example is matching product SKUs. For example, item 1 in our e-commerce store has SKU AB100, item 2 has SKU AB200, and variations of these items have SKUs such as AB101, AB102 or AB201, AB202. To match all these SKUs, we can use the following RegEx: “AB[0-9]{3}”. Basically, we match everything of the form ABxxx, where each x is a digit.
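Both curly bracket examples can be sketched in Python. The values “77.120.121.100”, “AB10” and “XAB1000” are assumptions added to show the boundaries. The SKU pattern has no caret or dollar sign, so a partial search also accepts longer strings that merely contain a valid SKU.

```python
import re

# One or two digits in the last section: .0 up to .99.
ip_range = re.compile(r"^77\.120\.121\.[0-9]{1,2}$")
ips = ["77.120.121.0", "77.120.121.42", "77.120.121.99", "77.120.121.100"]
in_range = [ip for ip in ips if ip_range.search(ip)]

# "AB" followed by exactly three digits.
sku = re.compile(r"AB[0-9]{3}")
skus = ["AB100", "AB101", "AB202", "AB10", "XAB1000"]
partial = [s for s in skus if sku.search(s)]    # pattern found anywhere
strict = [s for s in skus if sku.fullmatch(s)]  # whole value must fit

print(in_range)  # ['77.120.121.0', '77.120.121.42', '77.120.121.99']
print(partial)   # ['AB100', 'AB101', 'AB202', 'XAB1000']
print(strict)    # ['AB100', 'AB101', 'AB202']
```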

Most Effective Ways to Use Google Analytics RegEx

As we mentioned before, even a basic understanding of regular expressions can make your life much easier, especially when manipulating your Google Analytics data. Now that we have covered the most common expressions, let’s see where we can make use of RegEx.

1. Applying Table Filters

This is probably the most common scenario for regular expressions in GA. It is also the fastest and most effective way to work with specific data in a standard or custom GA report.

We use the search field at the top right of the Google Analytics table view.

Applying table filter Regex example

2. Setting Up Filters

Regular expressions are the go-to when it comes to setting up advanced filters, because they are the only way to build and apply all the filters we need. As a best practice, we recommend using a test view first before you apply any filters to your main GA view. Read more about advanced filters.

3. Setting Up Google Analytics Goals

Another scenario for using RegEx in Google Analytics is whenever we are setting up goals. Currently GA offers four different types of goals: Destination, Duration, Pages/Screens per session and Event.

GA Goals Setup with Regex

For a Destination goal, regular expressions can be handy, especially if we are targeting our thank-you page, whose URL in most e-stores will contain an orderID query and other parameters.
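As an illustration, a Destination goal pattern for such a thank-you page might be sketched as below. Everything specific here is an assumption: the “/checkout/thank-you” path, the parameter layout and the sample URLs are hypothetical, and Python's `re` module only stands in for GA's own matching.

```python
import re

# Hypothetical pattern: the thank-you page, with an orderID parameter
# anywhere in the query string.
goal_pattern = re.compile(r"^/checkout/thank-you.*[?&]orderID=\d+")

# Hypothetical request URIs.
uris = [
    "/checkout/thank-you?orderID=10452",
    "/checkout/thank-you?utm_source=mail&orderID=77",
    "/checkout/thank-you",
    "/checkout/cart?orderID=5",
]
conversions = [u for u in uris if goal_pattern.search(u)]

for uri in conversions:
    print(uri)
```

Only the first two URIs count as conversions: the third has no orderID and the fourth is not the thank-you page.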

4. Defining Funnel Steps

As the above screenshot shows, you can define an optional funnel for your goal. By default, Google Analytics allows up to 20 steps for your funnel. However, in most cases an effective checkout process will need no more than 2 or 3 steps. Again, regular expressions are really handy when setting up funnel steps in GA goals.

5. Setting Up Segments

Segments are a very powerful feature in Google Analytics, and if we have ever had to analyze our data, we most likely used a segment to filter out the “noise” from our data sample. RegEx is a must when we want to set up an accurate segment.

6. Using RegEx in GTM

Google Tag Manager is the perfect platform for taking advantage of RegEx. We use regular expressions in all kinds of places within GTM, from triggers to variables or even the tags themselves. RegEx in Google Tag Manager works much as it does in any programming language; we can get basic things done like an OR operation, but we can also use really complex instructions. In most cases, within GTM, we would use regular expressions for validation and matching.

Conclusion

In this beginner guide we have discovered what regular expressions are, how to use them and how valuable it is to learn about RegEx. We can use them to build more efficient segments, goals or custom filters in Google Analytics.

With more practice and a little bit more technical knowledge, we can use regular expressions to set up triggers, re/write variables and even use them in tags within Google Tag Manager. In our next article, we are going to cover a more advanced approach for regular expressions, specifically targeting Google Tag Manager examples.

A useful way to test your instruction is to use a RegEx tester or debugger before you apply it to your goal, segment or filter. We recommend the tools at Regex101.com or Analyticsmarket.com.
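If you prefer to test a pattern in a script rather than an online tool, the same idea can be sketched in a few lines of Python. The function name `check_pattern` and the sample values are hypothetical; the approach is simply to list values that should and should not match and report any surprises. Keep in mind that Python's regex engine is not the one GA uses, so treat the result as a first check only.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return a list of problems; an empty list means the pattern behaved."""
    compiled = re.compile(pattern)
    problems = []
    for value in should_match:
        if not compiled.search(value):
            problems.append(f"expected a match, got none: {value!r}")
    for value in should_not_match:
        if compiled.search(value):
            problems.append(f"unexpected match: {value!r}")
    return problems

wanted = ["77.120.121.0", "77.120.121.99"]
unwanted = ["77.120.121.100"]

# Anchored and escaped pattern from the curly brackets example.
ok = check_pattern(r"^77\.120\.121\.[0-9]{1,2}$", wanted, unwanted)

# Same idea without anchors or escaping: too permissive.
bad = check_pattern(r"77.120.121.[0-9]{1,2}", wanted, unwanted)

print(ok)   # []
print(bad)  # ["unexpected match: '77.120.121.100'"]
```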