How to use regex matches

Examples of how to use regular expressions to find matching rows in your data, or to extract and capture specific parts or patterns.

Written by Peter Lundberg
Updated over a week ago

Regular expressions (regex) are a powerful tool when working with dimension rule results. There are plenty of tutorials and references available on the internet, but below are some common scenarios and how to solve them, assuming basic knowledge of regex.

More powerful pattern matching

Contains specific pattern

When exactly matches -[0-9]{3}$
=> Ends with three digits after a dash, e.g. "Brand-874"

When exactly matches \$..-....-..\$
=> Contains an ID between literal $ characters, with each . matching any single character, e.g. "Remarketing $AX-1345-BE$"
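The two "contains" patterns above can be tried out locally. This is a minimal sketch that uses Python's re module as a stand-in, on the assumption that "exactly matches" behaves like a search anywhere in the value; Funnel's own regex engine may differ in details. The helper name contains is hypothetical, and the ID pattern is written with two characters in its last segment so that it matches the example value.

```python
import re

# Patterns from the article, compiled with Python's re module (an assumption:
# Funnel's own regex engine may behave differently in edge cases).
ENDS_WITH_DASH_AND_3_DIGITS = re.compile(r"-[0-9]{3}$")
DOLLAR_WRAPPED_ID = re.compile(r"\$..-....-..\$")


def contains(pattern, value):
    """Return True if the pattern is found anywhere in the value."""
    return pattern.search(value) is not None


print(contains(ENDS_WITH_DASH_AND_3_DIGITS, "Brand-874"))          # True
print(contains(ENDS_WITH_DASH_AND_3_DIGITS, "Brand-8745"))         # False, four digits
print(contains(DOLLAR_WRAPPED_ID, "Remarketing $AX-1345-BE$"))     # True
print(contains(DOLLAR_WRAPPED_ID, "Remarketing AX-1345-BE"))       # False, no $ characters
```

Since . matches any single character, the ID pattern only checks the lengths of the three segments and the position of the dashes, not whether the segments contain letters or digits.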

Shorthand "or"

When exactly matches cpm|cpc|cnv
=> Contains "cpm" or "cpc" or "cnv".

This is more concise and faster than the equivalent regex .*(cpm|cpc|cnv).* or ^.*(cpm|cpc|cnv).*$. To require that the whole value is exactly one of the alternatives (instead of containing one), use ^(cpm|cpc|cnv)$. Another alternative, c(pm|pc|nv), has the same effect as cpm|cpc|cnv but is not as readable.
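To make the difference between "contains" and "exactly" concrete, here is a small sketch in Python. It assumes that Funnel's matching behaves like Python's re.search; the sample values and the helper name matching are hypothetical.

```python
import re

# Hypothetical sample values, not taken from any real account.
values = ["google_cpc_brand", "cpm", "display_cpv"]

contains_any = re.compile(r"cpm|cpc|cnv")      # value contains one of the alternatives
exactly_any = re.compile(r"^(cpm|cpc|cnv)$")   # whole value is one of the alternatives
factored = re.compile(r"c(pm|pc|nv)")          # same effect as contains_any, less readable


def matching(pattern, candidates):
    """Return the candidates in which the pattern is found."""
    return [value for value in candidates if pattern.search(value)]


print(matching(contains_any, values))  # ['google_cpc_brand', 'cpm']
print(matching(exactly_any, values))   # ['cpm']
print(matching(factored, values))      # ['google_cpc_brand', 'cpm']
```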

Extracting / Capturing parts of a value

Funnel can use capture groups, the expressions in parentheses "()", and use what they capture in further handling.

Extract query parameter

When exactly matches utm_campaign=([^&]+), Then is regex match capture group 1
=> Extracts a non-empty utm_campaign parameter from a URL

So for instance the value "https://my.site.com/shop?utm_campaign=MyCampaign&utm_source=mail" will capture "MyCampaign" in the first group.
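The same extraction can be checked outside Funnel. This is a minimal sketch using Python's re module as a stand-in for Funnel's regex engine (an assumption); the function name extract_campaign and the second URL with an empty parameter are hypothetical.

```python
import re

# [^&]+ captures one or more characters up to the next "&" (or the end of the value).
UTM_CAMPAIGN = re.compile(r"utm_campaign=([^&]+)")


def extract_campaign(url):
    """Return capture group 1, or None when the parameter is missing or empty."""
    match = UTM_CAMPAIGN.search(url)
    return match.group(1) if match else None


print(extract_campaign("https://my.site.com/shop?utm_campaign=MyCampaign&utm_source=mail"))
# MyCampaign
print(extract_campaign("https://my.site.com/shop?utm_campaign=&utm_source=mail"))
# None, because + requires at least one character
```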

Extract exact parts with separator

When exactly matches ^([^_]+)_([^_]+)_([^_]+)$
=> Extracts 3 parts separated by _ and requires the value to consist of exactly those 3 parts

Note! You can also use "Split by" for simpler cases of using a delimiter (see How to use Text Splitting).

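E.g. "{Ad Category}_{Campaign Type}_{Target}" with "Then regex match capture group 2" will result in "{Campaign Type}". [^_] means any character except the literal separator "_", and + means one or more times. Because of the trailing $, a value with additional parts such as "{Ad Category}_{Campaign Type}_{Target}_perhaps-other-details" will not match; leave out the $ to allow trailing parts. Note that the simpler ^(.*)_(.*)_(.*) will not be as fast and is not as specific, as .* can also match the separator and the last group will capture all remaining characters.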
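The difference between these variants can be seen in a short sketch. It assumes the pattern is meant as ^([^_]+)_([^_]+)_([^_]+)$ and uses Python's re module as a stand-in for Funnel's regex engine; the sample values and the helper name groups are hypothetical.

```python
import re

THREE_PARTS_EXACT = re.compile(r"^([^_]+)_([^_]+)_([^_]+)$")  # exactly 3 parts
THREE_PARTS_PREFIX = re.compile(r"^([^_]+)_([^_]+)_([^_]+)")  # at least 3 parts
GREEDY = re.compile(r"^(.*)_(.*)_(.*)")                       # .* may also match "_"


def groups(pattern, value):
    """Return all capture groups as a tuple, or None when there is no match."""
    match = pattern.search(value)
    return match.groups() if match else None


# Hypothetical values in the form {Ad Category}_{Campaign Type}_{Target}.
three = "Shoes_Search_Sweden"
four = "Shoes_Search_Sweden_perhaps-other-details"

print(groups(THREE_PARTS_EXACT, three))   # ('Shoes', 'Search', 'Sweden')
print(groups(THREE_PARTS_EXACT, four))    # None
print(groups(THREE_PARTS_PREFIX, four))   # ('Shoes', 'Search', 'Sweden')
print(groups(GREEDY, four))               # ('Shoes_Search', 'Sweden', 'perhaps-other-details')
```

With the greedy pattern, capture group 2 for the four-part value is "Sweden" rather than "Search", which is why the [^_]+ form is the safer choice when values can contain extra parts.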
