GDPR restrictions on the collection and use of personal data mean that many analytics users are asking themselves: Can I do useful analytics without personal data?
The simple answer is yes. Although you won’t be able to draw the same conclusions as with personal data, it’s not an exaggeration to say that anonymized data can provide you with useful insights into user behavior. We’ll show you how, but first…
Let’s talk a bit about privacy issues. What you need to remember is:
- GDPR sets a pretty high bar when it comes to data anonymization
- The regulation requires consent to process the data if it’s reasonably likely it could identify an individual
- Most unknown visitors don’t give this consent
If you want to use data about your visitors without collecting consent, you need to make sure that the data is truly anonymized (which is not the case with Google Analytics). Otherwise, you expose yourself to severe fines.
Fortunately, there are some tools on the market that will be up to the task. Analytics platforms featuring anonymous data collection, such as Piwik PRO, offer a middle way, instead of an all-or-none choice based on consent.
For example, Piwik PRO uses the following anonymization process:
- Geolocation is disabled
- No “fingerprinting” data is saved about the visit
- Only a “Visitor ID” cookie is stored in the visitor’s browser; it has a lifetime of only 30 minutes, after which the browser deletes it automatically
Thanks to these settings your data will not fall under the category of personal data.
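To make the idea more concrete, here is a minimal sketch of what dropping geolocation and fingerprinting data from a tracking hit could look like. It is an illustration only: the field names and the `anonymize_hit` helper are assumptions made for this example, not Piwik PRO’s actual schema or implementation.

```python
# Hypothetical field names chosen for illustration; a real platform's
# schema will differ.
GEO_FIELDS = {"ip", "country", "city", "latitude", "longitude"}
FINGERPRINT_FIELDS = {"user_agent", "screen_resolution", "browser_plugins"}


def anonymize_hit(hit: dict) -> dict:
    """Return a copy of a tracking hit without geolocation or fingerprinting fields."""
    blocked = GEO_FIELDS | FINGERPRINT_FIELDS
    return {key: value for key, value in hit.items() if key not in blocked}


raw_hit = {
    "visitor_id": "a1b2c3",          # short-lived ID from the 30-minute cookie
    "url": "/product",
    "action": "learn_more_click",
    "ip": "203.0.113.7",             # documentation-range IP address
    "city": "Berlin",
    "user_agent": "ExampleBrowser/1.0",
    "screen_resolution": "1920x1080",
}

clean_hit = anonymize_hit(raw_hit)
print(clean_hit)
```

Only the temporary visitor ID and the behavioral fields survive; the original hit is left unchanged, so the stripping step can be applied before anything is stored.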
With all the technicalities explained, it’s time to show you what anonymous data collection looks like in practice.
Let’s say your company wants to understand unknown visitor behavior on your site.
You’re especially interested in the following series of actions:
- a click on Learn more about a product
- a click on a contact information page and then
- a click on a Schedule a meeting form on the contact page
Fortunately, you’ll have no problem with getting this kind of information from anonymized data, since:
- The anonymous data can show, along with the data from those who gave consent, how the site performed based on a large sample of visitors. You’ll be able to track almost any metric (number of visitors, page views, conversions and time spent on the site) and in most cases credit actions to a single visitor.
In other words, you’ll know that an anonymous visitor first performed action A, then B and then C during their session.
- What’s more, basic attribution is also possible. Your company will still see how visitors got to the site: through organic search, Google, an ad campaign, etc.
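As a rough sketch of the two points above, the snippet below groups anonymized events by their temporary visitor ID, checks whether a session contains the three clicks in order, and counts completed funnels per traffic source. The event structure, action names and source labels are assumptions made for this example rather than a real export format.

```python
from collections import Counter, defaultdict

# Hypothetical action names for the three steps described in the example.
FUNNEL = ["learn_more_click", "contact_page_click", "schedule_meeting_click"]


def actions_by_visitor(events):
    """Group actions by temporary visitor ID, ordered by timestamp."""
    grouped = defaultdict(list)
    for event in sorted(events, key=lambda e: e["timestamp"]):
        grouped[event["visitor_id"]].append(event["action"])
    return dict(grouped)


def source_by_visitor(events):
    """Take the traffic source from each visitor's earliest event."""
    sources = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        sources.setdefault(event["visitor_id"], event["source"])
    return sources


def completed_funnel(actions, funnel=FUNNEL):
    """True if the funnel steps occur in order; other actions may sit in between."""
    remaining = iter(actions)
    # "step in remaining" consumes the iterator, so order is enforced.
    return all(step in remaining for step in funnel)


events = [
    {"visitor_id": "v1", "timestamp": 3, "action": "schedule_meeting_click", "source": "organic_search"},
    {"visitor_id": "v1", "timestamp": 1, "action": "learn_more_click", "source": "organic_search"},
    {"visitor_id": "v2", "timestamp": 1, "action": "contact_page_click", "source": "ad_campaign"},
    {"visitor_id": "v1", "timestamp": 2, "action": "contact_page_click", "source": "organic_search"},
    {"visitor_id": "v2", "timestamp": 2, "action": "learn_more_click", "source": "ad_campaign"},
]

sessions = actions_by_visitor(events)
sources = source_by_visitor(events)
conversions_by_source = Counter(
    sources[visitor] for visitor, actions in sessions.items() if completed_funnel(actions)
)
print(dict(conversions_by_source))
```

Nothing here needs to know who the visitor is. The temporary ID only ties together actions within one session, which is enough for funnel and basic source reporting.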
However, anonymous data can only attribute actions to a single visitor within a single session. That means you won’t be able to determine whether any of those actions were performed by a returning visitor, because the data can’t be used to build a history of actions across multiple sessions.
The lack of persistent tracking means it can’t be used for:
- long-term personalization
- attribution of conversions to actions taken over several visits
However, it is possible to recover broader data if a visitor gives consent after initially declining.
Let’s say a visitor ignores the consent dialog at first, browses for 15 minutes and in the end responds to a prompt to provide data for personalization. The data from those 15 minutes of browsing can be added to a record, now under a cookie with a longer lifetime (12 months in the case of Piwik PRO).
Data that was initially completely anonymous can be used for personalization in later sessions because consent was given during the active session.
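One way to picture this hand-over is the sketch below, where events collected under a temporary ID are moved to a longer-lived record once consent arrives. The `VisitorStore` class, its method names and the 365-day approximation of 12 months are assumptions made for illustration; this is not how Piwik PRO is necessarily implemented.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Assumed approximation of the 12-month cookie lifetime mentioned above.
CONSENTED_LIFETIME = timedelta(days=365)


class VisitorStore:
    """Toy in-memory store separating anonymous sessions from consented profiles."""

    def __init__(self):
        self.anonymous = {}  # temporary ID -> list of events
        self.profiles = {}   # persistent ID -> {"events": [...], "expires": datetime}

    def track_anonymous(self, temp_id, event):
        self.anonymous.setdefault(temp_id, []).append(event)

    def grant_consent(self, temp_id, now):
        """Move the active session's events into a new persistent profile."""
        events = self.anonymous.pop(temp_id, [])
        persistent_id = uuid.uuid4().hex
        self.profiles[persistent_id] = {
            "events": events,
            "expires": now + CONSENTED_LIFETIME,
        }
        return persistent_id


store = VisitorStore()
store.track_anonymous("tmp-42", "learn_more_click")
store.track_anonymous("tmp-42", "contact_page_click")

now = datetime(2024, 1, 1, tzinfo=timezone.utc)
profile_id = store.grant_consent("tmp-42", now)
print(store.profiles[profile_id]["events"])
```

The key point is timing: the hand-over only works while the temporary ID still exists, which is why consent has to be given during the active session.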
Since this example of anonymous data collection relies on a cookie, it’s worth clarifying two issues.
First, not all cookies are personal data, even though many are. The cookie used by Piwik PRO for anonymous data collection is not personal data because it’s deleted automatically after 30 minutes. The visitor ID from this temporary cookie is used to classify visitor data (also not personal data because of the masking of fingerprinting and geolocation fields).
It’s highly unlikely that anyone could reconstruct the identity of an individual based solely on their browsing actions on a single website.
Second, the kind of first-party cookie mentioned here is rarely blocked. Browsers and ad blocking software block more and more persistent cookies generated by third parties. When you read about the end of the browser cookie, it is these persistent third-party cookies that are meant.
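For readers who want to see what a short-lived first-party cookie looks like on the wire, here is a small sketch using Python’s standard `http.cookies` module. The cookie name `visitor_id`, the `visitor_cookie` helper and the attribute choices are assumptions for this example; the actual cookie set by Piwik PRO may be named and configured differently.

```python
from http.cookies import SimpleCookie

THIRTY_MINUTES = 30 * 60  # lifetime in seconds


def visitor_cookie(visitor_id: str, max_age_seconds: int) -> str:
    """Build the value of a Set-Cookie header for a first-party visitor cookie."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id
    # Max-Age tells the browser to delete the cookie after this many seconds.
    cookie["visitor_id"]["max-age"] = max_age_seconds
    cookie["visitor_id"]["path"] = "/"
    # No Domain attribute is set, so the cookie stays with the site that issued it.
    cookie["visitor_id"]["samesite"] = "Lax"
    return cookie["visitor_id"].OutputString()


header_value = visitor_cookie("a1b2c3", THIRTY_MINUTES)
print(header_value)
```

Because the lifetime is expressed through `Max-Age`, the browser itself removes the cookie after 30 minutes without any action from the site.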
Although anonymous data collection is still rare in web marketing and analytics, it has a long and productive history in other fields. Whole books have been written about anonymizing health data, for example.
To respond to data privacy demands of internet users the world over, those employing web analytics will have to adapt some of these methods. Anonymous data collection of the kind found in Piwik PRO is a first step in this direction.