Sometimes, you may have a naming convention in one or more native dimensions that follows a clear pattern everywhere it is used. In such cases, it's common for specific pieces of information to be defined in the same "part" of the dimension values. It could, for example, look something like this:

Market | Brand | Campaign type | Campaign name | Product 

Now, all of these "positions" can have several different potential cases. You could be marketing two or three different Brands in five to seven different markets and target ten to twenty different products at any given time. Instead of creating as many rules in your Market, Brand or Product custom dimensions as you have cases for each placeholder, you could programmatically create a dimension value for every potential case.

Let's say that the above naming convention is something we apply to all our native Campaign dimensions across all our marketing platforms. I would then probably like to know how my various campaigns perform across all of them! What I can do is create a regular expression dimension rule that always picks up whatever can be found between the third and fourth | character, i.e. the different values of the Campaign name placeholder.

The regular expression for that would look something like this (you can follow this link to test it out and modify it):

^(?:[^|]*\|){3}\s([^|]*)\s 
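To see what the expression picks up, here is a minimal Python sketch that applies it to a few made-up campaign values. The sample values and the helper name extract_campaign_name are assumptions for illustration only; in practice the rule is evaluated by the dimension rule itself, not by your own code.

```python
import re

# The same expression as above, compiled once for reuse.
CAMPAIGN_NAME_RE = re.compile(r"^(?:[^|]*\|){3}\s([^|]*)\s")


def extract_campaign_name(value):
    """Return the text between the third and fourth | or None if there is no match."""
    match = CAMPAIGN_NAME_RE.search(value)
    return match.group(1) if match else None


# Hypothetical values following Market | Brand | Campaign type | Campaign name | Product
samples = [
    "SE | BrandA | Search | Summer Sale | Shoes",
    "UK | BrandB | Social | Black Friday | Jackets",
    "Untitled campaign",  # does not follow the convention
]

for value in samples:
    print(repr(value), "->", extract_campaign_name(value))
```

Values that do not contain at least three | characters produce no match at all. Also note that the expression relies on a whitespace character following the captured text, so a value that ends right after the Campaign name (with no trailing " | Product" part) would only be captured up to its last space.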

Notes

What follows is an attempt to break down some of the concepts used in this regex for those who are curious to understand how it's constructed in more detail:

  • ^ denotes the start of a string
  • (?:...) is a non-capturing group. It groups whatever pattern is put in place of ... so that it can be treated as a single unit (for example, repeated a number of times) without saving the matched text as a value.
  • [^|]* matches any number of characters that are not a |. The * is the operator that means "zero or more of".
  • \| matches the specific character | (which, by itself, is an OR operator in regex, hence the backslash that "escapes" the original function).
  • {3} instructs the pattern in the parentheses preceding it to be matched exactly three times.
  • \s denotes a single whitespace character, which we place outside of our capture group in order to not include it in the dimension values that we generate.
  • (...) is a capture group, where the ... can be replaced with any sequence that will match the characters to be captured.
  • Hence, ([^|]*) is a capture group that captures any number of characters that are not a |.
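
The breakdown above can also be written out as an annotated expression. The sketch below uses Python's re.VERBOSE flag, which ignores whitespace and allows comments inside the pattern; this is only a way to read the expression piece by piece, and the sample value is made up for illustration.

```python
import re

# Same expression as before, spread over several lines with one comment per piece.
ANNOTATED_RE = re.compile(
    r"""
    ^               # start of the string
    (?:             # non-capturing group: groups the pattern without saving the text
        [^|]*       # zero or more characters that are not a |
        \|          # a literal | character
    ){3}            # repeat the group exactly three times
    \s              # one whitespace character, kept outside the capture group
    ([^|]*)         # capture group 1: the Campaign name
    \s              # one whitespace character after the captured value
    """,
    re.VERBOSE,
)

# Hypothetical value following the naming convention.
match = ANNOTATED_RE.search("SE | BrandA | Search | Summer Sale | Shoes")
print(match.group(0))  # everything the expression matched
print(match.group(1))  # only the captured Campaign name
```

Comparing group(0) with group(1) shows the difference between what the whole expression matches and what the capture group keeps as the dimension value.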