Google Analytics – Avoiding personally identifiable information (PII)

Those of us who handle data understand the challenges related to data sensitivity. As much as we would like to collect data to personalize our users’ experiences, there are some restrictions, and rightly so. In Google Analytics, collecting any PII could result in account termination and data deletion.

Let us start with what exactly counts as PII. The obvious items are e-mail IDs, phone numbers and credit card details. Beyond that, what counts as PII can vary with your country’s laws. Under GDPR, names, addresses, financial information, login IDs, biometric identifiers, geographic location data and customer loyalty histories all fall under PII. Read more about GDPR here.

You may not be collecting PII intentionally, but sensitive information entered by website visitors can still end up in your analytics. Usually the culprit is the default page tag: it collects the URL and page title, and either of these can contain sensitive information.
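To see how this happens, picture a confirmation page whose URL carries the visitor's address in the query string. Below is a minimal sketch of an audit check that lists the query parameters whose values look like e-mail addresses. The helper name findEmailLikeParams, the sample URL and the deliberately loose pattern are all assumptions made for illustration; none of them is part of Google Analytics or Google Tag Manager.

```javascript
// Hypothetical audit helper (not part of Google Analytics or GTM).
// Returns the names of query parameters whose decoded values look like e-mails.
function findEmailLikeParams(pageUrl) {
  // Loose pattern, assumed for illustration: something@something.something
  var emailLike = /[^\s@]+@[^\s@]+\.[^\s@]+/;
  var query = new URL(pageUrl).searchParams;
  var flagged = [];
  query.forEach(function (value, name) {
    if (emailLike.test(value)) {
      flagged.push(name);
    }
  });
  return flagged;
}

// Made-up URL: the "email" parameter is what the page tag would send along.
console.log(findEmailLikeParams(
  'https://www.example.com/thanks?email=jane.doe%40example.com&plan=pro'
));
```

Running a check like this against a sample of your landing and confirmation URLs is one way to find out which pages need attention before you change any tags.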

Now that we have covered the what and the why, let us move on to how you can resolve PII issues in your Google Analytics setup. The most common case among Google Analytics users is an e-mail ID being captured in page URLs, so that is the case addressed here.

Method I – Simo’s way: A little complicated, but it works.

Step 1: Create a Custom JavaScript variable in Google Tag Manager and enter the following code:

function() {
  return function(model) {
    // Add the PII patterns into this array as objects
    var piiRegex = [{
      name: 'EMAIL',
      regex: /.{4}@.{4}/g // you can add multiple regex for name or mobile number or SSN if needed.
    }];
    var globalSendTaskName = '_' + model.get('trackingId') + '_sendHitTask';
    // Fetch reference to the original sendHitTask
    var originalSendTask = window[globalSendTaskName] = window[globalSendTaskName] || model.get('sendHitTask');
    var i, hitPayload, parts, val;
    // Overwrite sendHitTask with PII purger
    model.set('sendHitTask', function(sendModel) {
      hitPayload = sendModel.get('hitPayload').split('&');
      for (i = 0; i < hitPayload.length; i++) {
        parts = hitPayload[i].split('=');
        // Double-decode, to account for web server encode + analytics.js encode
        try {
          val = decodeURIComponent(decodeURIComponent(parts[1]));
        } catch(e) {
          val = decodeURIComponent(parts[1]);
        }
        piiRegex.forEach(function(pii) {
          val = val.replace(pii.regex, '[REDACTED ' + pii.name + ']');
        });
        parts[1] = encodeURIComponent(val);
        hitPayload[i] = parts.join('=');
      }
      sendModel.set('hitPayload', hitPayload.join('&'), true);
      originalSendTask(sendModel);
    });
  };
}
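To get a feel for what this variable does to a hit, the redaction step can be run on its own outside the browser. The sketch below is only a stand-in for that purpose: redactPayload is a hypothetical name and the sample payload is made up. Only the regex and the decode, replace and encode loop mirror the code above.

```javascript
// Hypothetical stand-alone version of the loop inside the sendHitTask override.
var piiRegex = [{ name: 'EMAIL', regex: /.{4}@.{4}/g }];

function redactPayload(payload) {
  return payload.split('&').map(function (pair) {
    var parts = pair.split('=');
    if (parts.length < 2) { return pair; } // no value, nothing to redact
    var val;
    // Double-decode, falling back to a single decode, as in the code above
    try {
      val = decodeURIComponent(decodeURIComponent(parts[1]));
    } catch (e) {
      val = decodeURIComponent(parts[1]);
    }
    piiRegex.forEach(function (pii) {
      val = val.replace(pii.regex, '[REDACTED ' + pii.name + ']');
    });
    parts[1] = encodeURIComponent(val);
    return parts.join('=');
  }).join('&');
}

// Made-up payload: dl (document location) carries an e-mail in the query string.
var sample = 'v=1&t=pageview&dl=' +
  encodeURIComponent('https://www.example.com/thanks?email=jane.doe@example.com');

console.log(decodeURIComponent(redactPayload(sample)));
// v=1&t=pageview&dl=https://www.example.com/thanks?email=jane[REDACTED EMAIL]ple.com
```

Note that the pattern replaces only the four characters on either side of the @ sign, so fragments such as "jane" and "ple.com" survive. If that is not strict enough for your case, a broader pattern can be added to the piiRegex array.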

Step 2:

  1. Navigate to your ‘All pageview’ tag
  2. Open More Settings
  3. Open Fields to Set
  4. Add a field with the name customTask (the name must be exactly this)

Set the field’s value to the JavaScript variable you created in Step 1.

Step 3: 

Repeat Step 2 for all tags that send page URL data to Google Analytics.

Method II – Quick Solution

Navigate to the tags that send URL data to Google Analytics and add a trigger condition where Page URL does not contain the ‘@’ symbol.

The advantage of this solution is that it is quicker to set up. The drawback is that it stops the tag from firing at all on the affected pages, whereas with the first method the tag still fires but the e-mail is hidden.
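The trade-off is easiest to see when the trigger is written out as a condition. Here is a rough sketch of the check, using the hypothetical function name shouldFireTag. The extra test for %40 is an assumption that goes beyond the step described here: an address in a query string is often percent-encoded, in which case a plain ‘@’ check would miss it.

```javascript
// Hypothetical equivalent of a "Page URL does not contain @" trigger condition.
function shouldFireTag(pageUrl) {
  var lower = pageUrl.toLowerCase();
  // '%40' is the percent-encoded form of '@' (assumed extra check).
  return lower.indexOf('@') === -1 && lower.indexOf('%40') === -1;
}

console.log(shouldFireTag('https://www.example.com/pricing'));                            // true
console.log(shouldFireTag('https://www.example.com/thanks?email=jane.doe@example.com'));   // false
console.log(shouldFireTag('https://www.example.com/thanks?email=jane.doe%40example.com')); // false
```

Whenever the check returns false, the pageview is not recorded at all, so those visits disappear from your reports.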

If there’s something that has not been clarified in this article, please e-mail us at nimeshchaturvedi1992@gmail.com
