
The RegEx Table Variable in Google Tag Manager

Ever since the Lookup Table variable was introduced in Google Tag Manager, users have been asking for more. The Lookup Table does exactly what it promises: lookups. These are exact-match operations, which are extremely cheap to perform because they can only have a binary result: either the match exists in the queried data store or it doesn't. This performance stays constant even as the size of the queried data store grows. However, exact match has one significant problem: it's exact match. So even though the Lookup Table variable is extremely useful, it lacks the flexibility of, oh, I don't know, let's say regular expressions. You'll be delighted to hear, then, that Google Tag Manager has released a new variable type: the RegEx Table!

[Image: RegEx Table variable]

First of all, if you're not familiar with regular expressions, here are some good resources:

  • Interactive tutorials by RegexOne

  • An extremely thorough guide by regular-expressions.info

  • Regular expressions ebook for Google Analytics by Bounteous

  • Google Analytics regular expressions cheat sheet by Jay Taylor

Needless to say, RegEx is a powerful pattern-matching syntax to learn, and it can help you immensely in keeping your Google Tag Manager container lean and efficient.

The RegEx Table variable

You'll find the RegEx Table variable in the list of variable types you can create as a User-Defined Variable. Once you've chosen this variable type, you'll see the following configuration:

[Screenshot: RegEx Table]

There are many similarities with the Lookup Table variable, for good reason, but there are also a bunch of settings that turn this new variable into a formidable force in its own right.

1. Input Variable

The Input Variable shares its functionality with the Lookup Table. The Input Variable is what you'll run your pattern checks against. For example, if you want to use the RegEx Table to look for patterns in the current page path, you would choose the {{Page Path}} variable as the input.

[Screenshot: Input Variable in the RegEx Table]

The Input Variable is evaluated row by row, from top to bottom, against each pattern. When a pattern matches, the corresponding output is returned and processing of the table stops.
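The row-by-row evaluation can be sketched in plain JavaScript. This is only an illustration of the first-match-wins behavior described above, not how Google Tag Manager implements it: the helper name `firstMatch`, the sample rows, and the page paths are all made up for the example, and the table's advanced settings are ignored here.

```javascript
// Hypothetical helper (not a GTM API): returns the output of the first row
// whose pattern matches the input, or defaultValue if no row matches.
function firstMatch(input, rows, defaultValue) {
  for (const [pattern, output] of rows) {
    if (new RegExp(pattern).test(input)) {
      return output; // processing stops at the first matching row
    }
  }
  return defaultValue; // undefined when no default is supplied
}

// Illustrative rows; the paths and outputs are invented for this sketch.
const pageTypeRows = [
  ["^/blog/", "Blog"],
  ["^/blog/analytics/", "Analytics"], // shadowed by the row above
  ["^/$", "Home"],
];

console.log(firstMatch("/blog/analytics/post/", pageTypeRows)); // "Blog"
console.log(firstMatch("/", pageTypeRows)); // "Home"
console.log(firstMatch("/about/", pageTypeRows)); // undefined
```

Because the first matching row wins, the more specific pattern would have to be placed above the more general one to ever be returned.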

2. RegEx Table

Next, you have the table itself. In the table, you add rows, where each row represents a pattern you want to match in the input, and an output returned by the variable if the pattern matches.

The pattern is always interpreted as a regular expression. All of the following patterns are valid examples:

  • simoahava.com – will match “simoahava”, then any single character, then “com”.

  • simoahava\.com – will match “simoahava.com”.

  • ^simoahava\.com$ – will match exactly “simoahava.com” (no leading or trailing characters allowed).

  • (simoahava)\.com – will match “simoahava.com” and create a group (see below) of “simoahava”.

A pattern like [simoahava\.com is not a valid regular expression, because “[” is a reserved character, and it is being incorrectly used in this pattern. Google Tag Manager will not warn you of errors in the regular expression, but you’ll know something is wrong if the Preview mode output for the variable is boolean false. Conversely, if no match is made or there is no output for a matched pattern, the variable will return undefined.
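To try these patterns outside Google Tag Manager, they can be tested with JavaScript's own RegExp. The sketch below assumes GTM's pattern syntax behaves like JavaScript regular expressions for these simple cases; `isValidPattern` is a hypothetical helper, not something GTM provides.

```javascript
// Hypothetical helper: true if the JavaScript regex engine accepts the pattern.
function isValidPattern(pattern) {
  try {
    new RegExp(pattern);
    return true;
  } catch (error) {
    return false; // e.g. a SyntaxError for an unterminated character class
  }
}

// Backslashes are doubled because these are JavaScript string literals.
const unescapedDot = new RegExp("simoahava.com");
const escapedDot = new RegExp("simoahava\\.com");
const anchored = new RegExp("^simoahava\\.com$");
const grouped = new RegExp("(simoahava)\\.com");

console.log(unescapedDot.test("simoahavaXcom")); // true: "." matches any character
console.log(escapedDot.test("simoahavaXcom")); // false: "\." only matches a dot
console.log(escapedDot.test("www.simoahava.com")); // true: partial match
console.log(anchored.test("www.simoahava.com")); // false: leading "www." not allowed
console.log(grouped.exec("simoahava.com")[1]); // "simoahava"
console.log(isValidPattern("[simoahava\\.com")); // false
```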

The output is what the variable returns when a row is matched against the input. The return type is a string, unless you add another variable into the output. This is a great way to chain RegEx Table variables, just as you could chain Lookup Table variables.

For example, here’s a simple chain of a RegEx Table and a Lookup Table:

[Screenshot: RegEx Table and Lookup Table variable chain]

And here’s how to unravel the process:

  1. If the page hostname matches the pattern beta\.simoahava\.com, then return “UA-12345-1”.

  2. If the page hostname doesn’t match either beta\.simoahava\.com or \.simoahava\.com, also return “UA-12345-1” (Default Value of the RegEx table).

  3. If the page hostname matches \.simoahava\.com and the user is in Debug Mode, return “UA-12345-2”.

  4. If the page hostname matches \.simoahava\.com and the user is not in Debug Mode, return “UA-12345-3”.

As you can see, the RegEx Table returns the first match that is made. Thus even though beta\.simoahava\.com and \.simoahava\.com overlap for any hostname that contains the string “beta.simoahava.com”, the RegEx table returns “UA-12345-1”, because that is the first match that the variable makes.
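The chain can be approximated in JavaScript as follows. This is a sketch under assumptions: the function names are hypothetical, the Lookup Table is assumed to be keyed on the Debug Mode value as "true"/"false" strings, and Full Matches Only is assumed to be unchecked for this table, since the patterns are described as matching any hostname that contains them.

```javascript
// Hypothetical exact-match lookup, standing in for a Lookup Table variable.
function lookupTable(key, table, defaultValue) {
  return Object.prototype.hasOwnProperty.call(table, key)
    ? table[key]
    : defaultValue;
}

// Hypothetical stand-in for the RegEx Table variable in the chain.
function trackingId(hostname, debugMode) {
  const rows = [
    // Row 1: a fixed string output.
    [/beta\.simoahava\.com/, () => "UA-12345-1"],
    // Row 2: the output is another variable (the Lookup Table).
    [
      /\.simoahava\.com/,
      () =>
        lookupTable(String(debugMode), {
          true: "UA-12345-2",
          false: "UA-12345-3",
        }),
    ],
  ];
  for (const [pattern, output] of rows) {
    if (pattern.test(hostname)) {
      return output(); // first matching row wins
    }
  }
  return "UA-12345-1"; // Default Value of the RegEx Table
}

console.log(trackingId("beta.simoahava.com", true)); // "UA-12345-1"
console.log(trackingId("www.simoahava.com", true)); // "UA-12345-2"
console.log(trackingId("www.simoahava.com", false)); // "UA-12345-3"
console.log(trackingId("example.com", false)); // "UA-12345-1"
```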

3. Set Default Value

As with Lookup Tables, you can set a Default Value that is always returned in case no match is made. Just like pattern outputs, this can be another Google Tag Manager variable.

4. Ignore Case

If you check Ignore Case, patterns are matched regardless of case. So a pattern with WwW\.SiMOAHava\.com will match against the domain of my site, as long as Ignore Case is checked.

Ignore Case is checked by default.
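In JavaScript terms, this setting behaves much like the `i` flag on a regular expression. The sketch below assumes that equivalence; it is not taken from GTM's implementation.

```javascript
// Backslashes are doubled because this is a JavaScript string literal.
const mixedCasePattern = "WwW\\.SiMOAHava\\.com";
const hostname = "www.simoahava.com";

// Ignore Case unchecked: the match is case-sensitive.
const caseSensitive = new RegExp(mixedCasePattern).test(hostname);

// Ignore Case checked: assumed to behave like the "i" flag.
const caseInsensitive = new RegExp(mixedCasePattern, "i").test(hostname);

console.log(caseSensitive); // false
console.log(caseInsensitive); // true
```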

5. Full Matches Only

If you check Full Matches Only, then all patterns must match the entire input. This is the equivalent of wrapping each individual pattern with ^...$.

For example, if you have Full Matches Only checked, and you have a pattern of www\.simoahava\.com, then the input variable must return exactly www.simoahava.com, without any other characters. If you had the setting unchecked, then www\.simoahava\.com would also match any of the following:

  • greatest.website.ever.is.www.simoahava.com

  • aawwwwww.simoahava.com

  • visit.www.simoahava.com.please

And so forth.

Full Matches Only is checked by default.
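A minimal sketch of the difference, assuming the setting is equivalent to anchoring the pattern as described above. The helper `matchesPattern` is hypothetical, and the non-capturing group around the pattern is a choice made for this sketch, not something the GTM interface documents.

```javascript
// Hypothetical helper: tests a pattern with or without full-match anchoring.
function matchesPattern(pattern, input, fullMatchesOnly) {
  // Wrap in a non-capturing group so that alternations such as "a|b"
  // are anchored as a whole rather than as "^a" or "b$".
  const source = fullMatchesOnly ? `^(?:${pattern})$` : pattern;
  return new RegExp(source).test(input);
}

const hostPattern = "www\\.simoahava\\.com";
const hostnames = [
  "www.simoahava.com",
  "greatest.website.ever.is.www.simoahava.com",
  "aawwwwww.simoahava.com",
  "visit.www.simoahava.com.please",
];

for (const hostname of hostnames) {
  console.log(
    hostname,
    matchesPattern(hostPattern, hostname, true), // Full Matches Only checked
    matchesPattern(hostPattern, hostname, false) // Full Matches Only unchecked
  );
}
```

With the setting checked, only the first hostname matches; with it unchecked, all four do.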

6. Enable Capture Groups and Replace Functionality

This is interesting! In addition to matching the input against a pattern and returning a corresponding output, you can actually use parts of the matched pattern within the returned output. This is achieved with capturing groups and the dollar symbol syntax.

A group in RegEx is a pattern that you define with parentheses, and it can be either capturing or non-capturing. Capturing groups can then be referenced using the dollar symbol syntax when using the String.replace() method or, consequently, the Enable Capture Groups and Replace Functionality feature of GTM’s RegEx table. Here are the options for the dollar symbol syntax:

  • $$ inserts a ‘$’.

  • $& inserts the matched pattern.

  • $` inserts whatever precedes the matched pattern in the string.

  • $' inserts whatever follows the matched pattern in the string.

  • $n inserts the nth capturing group.
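Each of these replacement tokens can be tried directly with JavaScript's String.replace(). The input string and pattern below are invented for illustration only.

```javascript
const input = "user: john.smith (admin)";
const namePattern = /(\w+)\.(\w+)/; // first match is "john.smith"

const dollar = input.replace(namePattern, "$$"); // literal dollar sign
const whole = input.replace(namePattern, "[$&]"); // the matched text
const before = input.replace(namePattern, "[$`]"); // text preceding the match
const after = input.replace(namePattern, "[$']"); // text following the match
const swapped = input.replace(namePattern, "$2.$1"); // capturing groups 2 and 1

console.log(dollar); // "user: $ (admin)"
console.log(whole); // "user: [john.smith] (admin)"
console.log(before); // "user: [user: ] (admin)"
console.log(after); // "user: [ (admin)] (admin)"
console.log(swapped); // "user: smith.john (admin)"
```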

These all have their uses, but the last one, $n should prove to be the most useful. You can use it to normalize patterns across a range of values. For example, let’s say you have a variable which stores the user’s phone number in the following formats:

  1. 358101001000

  2. 0101001000

  3. 010-1001000

  4. 010 100 1000

  5. +358101001000

You want to normalize all of these to the last format (+358101001000) whenever the phone number variable is used. This is how you’d configure the RegEx table:

[Screenshot: Enable Capture Groups and Replace Functionality]

The first pattern looks for strings that start with ‘358’ followed by any numbers. The match is simply replaced with the plus symbol followed by the matched string itself.

The second pattern looks for a string of numbers preceded by a ‘0’. The output is ‘+358’ and the string of numbers, omitting the leading ‘0’.

The third pattern looks for a string of numbers preceded by a ‘0’, then a hyphen, and then a string of numbers again. The output is ‘+358’ and then the two strings of numbers, omitting the leading ‘0’ and the hyphen.

The fourth pattern looks for a string of numbers preceded by a ‘0’, then a space, then another string of numbers, a space, and finally one more group of numbers. The output is ‘+358’ and the three groups of numbers, omitting the leading ‘0’ and the spaces.

The final pattern checks if the phone number is already well-formed, returning the matched string itself if this is the case.
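Since the configuration is only shown as a screenshot, the patterns and outputs below are reconstructed from the five descriptions above and should be read as assumptions rather than the exact table rows. The sketch emulates Full Matches Only by anchoring each pattern; `normalizePhone` is a hypothetical helper.

```javascript
// Assumed rows, reconstructed from the prose: [pattern, output].
const phoneRows = [
  ["358\\d+", "+$&"], // 358101001000
  ["0(\\d+)", "+358$1"], // 0101001000
  ["0(\\d+)-(\\d+)", "+358$1$2"], // 010-1001000
  ["0(\\d+) (\\d+) (\\d+)", "+358$1$2$3"], // 010 100 1000
  ["\\+358\\d+", "$&"], // +358101001000 (already well-formed)
];

// Hypothetical helper: applies the first fully matching row to the input.
function normalizePhone(input) {
  for (const [pattern, output] of phoneRows) {
    const re = new RegExp(`^(?:${pattern})$`); // emulate Full Matches Only
    if (re.test(input)) {
      return input.replace(re, output);
    }
  }
  return undefined; // no row matched and no default value is set
}

const samples = [
  "358101001000",
  "0101001000",
  "010-1001000",
  "010 100 1000",
  "+358101001000",
];

for (const sample of samples) {
  console.log(sample, "->", normalizePhone(sample)); // each gives "+358101001000"
}
```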

Using the RegEx Table like this, we can create simple string transformations which help normalize and clean up data across a variety of formats. As you can see, Full Matches Only is checked in this example. That means we don’t have to worry about anything that happens outside the matched pattern, since only full matches to the pattern are transformed.

If you leave Full Matches Only unchecked, then Enable Capture Groups and Replace Functionality will replace all matches of the pattern found within the Input Variable with what you have in the Output. For example, if you have a RegEx Table variable that looks like this:

[Screenshot: Replace pattern match with output]

Then whenever the string “analytics” is found within a page path, it will be replaced with “google-analytics”.

Here is an example:

/analytics/track-users-who-are-offline-in-google-analytics/ becomes /google-analytics/track-users-who-are-offline-in-google-google-analytics/.

Note that the example above only works if Full Matches Only is unchecked. Otherwise the variable would only replace page paths which are exactly analytics, and page paths like that do not exist.
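The same replacement can be reproduced with String.replace() and the global flag. This assumes that GTM's "replace all matches" behavior corresponds to JavaScript's `g` flag, which the interface does not state explicitly.

```javascript
const pagePath = "/analytics/track-users-who-are-offline-in-google-analytics/";

// Full Matches Only unchecked: every occurrence of the pattern is replaced.
const replaced = pagePath.replace(new RegExp("analytics", "g"), "google-analytics");

// Full Matches Only checked: the anchored pattern does not match the path at all.
const fullMatch = new RegExp("^(?:analytics)$").test(pagePath);

console.log(replaced);
// "/google-analytics/track-users-who-are-offline-in-google-google-analytics/"
console.log(fullMatch); // false
```

Note the "google-google-analytics" in the result: the second occurrence was already preceded by "google-", which is exactly the kind of unintended replacement that unanchored patterns can cause.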

The little help bubble actually recommends against combining this pattern replacement with an unchecked Full Matches Only. This is because there’s no validation of the input variable, and you might end up replacing things that you didn’t mean to!

Enable Capture Groups and Replace Functionality is checked by default.

Summary

That’s the RegEx Table in all its simple glory! I know it will make some operations so much simpler. You no longer need to use clumsy Custom JavaScript variables to perform your pattern matches, since the RegEx Table has that built into its modus operandi.

The option to replace any matches with custom strings (in which you can incorporate parts of the match using groups) is pretty powerful, too.

All in all, this is a very welcome addition to Google Tag Manager’s variable offering. It remains to be seen if the Lookup Table still has a place at the table after this, because with the RegEx table you can do exact match lookups, too. The difference is perhaps in the syntax (with Lookup Tables you don’t need to use regular expressions) and performance (lookups will always perform much faster than regular expression matches), though the latter might be insignificant in the context of a web page.

Source: www.simoahava.com
