
4 burning questions about onboarding personal data and personally identifiable information (PII) to your analytics platform

Data privacy & security

Written by

Published February 1, 2019 · Updated May 15, 2023

If your organization operates within the digital ecosystem, whether it’s the banking, telecommunication, or healthcare industry, chances are you’ve got significant untapped potential lying dormant.

This potential is your customers’ data and sensitive information. Most probably it’s trapped in silos of different departments, which is hindering your marketing team from getting the most out of targeting campaigns.

Fortunately, all is not lost. You already have the data at your fingertips, because you asked your visitors, prospects, and customers for consent to share it with you. What you should consider is onboarding this data to your analytics platform, so you can juice up your cross-channel marketing initiatives.

This notion is slowly making its way into the digital advertising realm, so people are asking lots of questions about it. In today’s post we’ll address the most important ones. You’ll learn what challenges to expect and, more importantly, what benefits you could gain. Here we go!

1. What is data onboarding?

Before we move on to the core of this post, let’s get a few basic things straight. Firstly, let’s clarify what data onboarding is.

Data onboarding is a marketing process of bringing all offline data into the digital realm. Marketers connect all the offline information on their customers with data that comes from online sources, unlocking the data from silos to create a 360-degree view of customers and improve targeting.

This is a complex process that we can divide into three steps:

  1. Collecting user identifiers during the customer journey on the website or mobile app. This could be gathering form data, scraping website data with a tag manager, or relying on data passed through a data layer.
  2. Importing offline data from CRM, marketing automation software, transactional systems and other sources.
  3. Matching online behaviour data with imported data using the user identifiers and building a Single Customer View containing both online & offline data.
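
To make the three steps more concrete, here is a minimal sketch in Python. It assumes an email address serves as the shared user identifier and that both data sets are already loaded in memory. The sample records, field names, and the `build_single_customer_view` helper are hypothetical and are not part of any Piwik PRO API. Hashing the email is only an illustration of joining on a pseudonymous key; a hashed email can still count as personal data.

```python
import hashlib


def hash_identifier(email: str) -> str:
    """Normalize an email and hash it so raw addresses are not used as join keys."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Step 1: identifiers collected online (hypothetical sample events).
online_events = [
    {"email": "Anna@Example.com", "page": "/loans", "action": "form_submit"},
    {"email": "anna@example.com ", "page": "/pricing", "action": "page_view"},
    {"email": "ben@example.com", "page": "/blog", "action": "page_view"},
]

# Step 2: offline records imported from a CRM (hypothetical sample rows).
crm_records = [
    {"email": "anna@example.com", "segment": "premium", "branch": "Berlin"},
    {"email": "carl@example.com", "segment": "standard", "branch": "Paris"},
]


def build_single_customer_view(events, records):
    """Step 3: merge online and offline data under one key per customer."""
    profiles = {}
    for record in records:
        key = hash_identifier(record["email"])
        offline = {k: v for k, v in record.items() if k != "email"}
        profiles[key] = {"offline": offline, "online": []}
    for event in events:
        key = hash_identifier(event["email"])
        profile = profiles.setdefault(key, {"offline": {}, "online": []})
        profile["online"].append({"page": event["page"], "action": event["action"]})
    return profiles


profiles = build_single_customer_view(online_events, crm_records)
anna = profiles[hash_identifier("anna@example.com")]
```

In this sketch, the two differently formatted versions of Anna’s email resolve to the same profile, which then holds both her CRM attributes and her two online events. Real identity matching is usually messier, since it has to handle several identifier types and conflicting records.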

From the technical standpoint, this means that every piece of data you gather from your CRM, tag management system, and analytics platform will be synced and ready to use through a customer data platform. Although this isn’t the only way to do it (you could employ a different technical solution), we recommend it as the most efficient and optimal one.

Once the identifying information from offline is merged with online sources, advertisers can increase the relevance and effectiveness of their ads and other content across all marketing channels.

Secondly, we’re not talking about onboarding just any kind of information, but personal and sensitive information. The scope of this information depends, for instance, on legislation and regulations. Essentially, it’s critical data that requires special protection from unauthorized access so that the privacy and security of people and companies are properly protected.

On the other hand, sensitive information on a business includes any information that can pose a risk to the organization if accessed by competitors or made public. We’ll dwell a bit more on these definitions and the implications of sensitive data later on.

2. What can you achieve with onboarding sensitive information to your analytics tool?

The right data at hand means that you can take your marketing initiatives to the next level.

As reported by CMO, 53% of marketing specialists acknowledge the importance of customer-centric communications.

And that calls for data, precise information on your prospects and clients.

Marketers who apply onboarding personal information to their analytics platform can:

  • enhance targeting and customer experience
  • better optimize sites for their audiences
  • deliver perfect segmentation and personalization
  • precisely craft their audiences
  • increase marketing reach across digital properties
  • better respond to data subject requests under the GDPR

3. What are the benefits of onboarding personal information to your analytics platform?

Virtually every business with an online presence, be it banks, telecoms, healthcare organizations, or e-commerce, recognizes the importance of creating a compelling 360-degree customer journey.

This is a very complex and intricate process, and marketers and analysts face many bumps along this road. However, the right solution will not only deliver significant benefits, but most importantly it will make the road much smoother.

As revealed in research by IBM & E-consultancy, organizations stress that having the right technologies makes the greatest impact on their success in using data to understand customers.

If you want to know how Piwik PRO can help you with this task, then let’s dive in.

Time & cost savings

First of all, you can be sure of the cost and time efficiency that accompanies this solution. You don’t need to spend your whole budget on multiple tools and licences, like Google Analytics 360 or Adobe Marketing Cloud to do your tracking, or Google Tag Manager or Tealium for your tag management, and yet another customer data platform. If you operate within the banking and finance industry, you might even venture into building your own platform, making things even more complex.

Here you get an all-in-one solution to collect data from multiple sources, check server logs, analyze your findings, and finally put the data into action. What’s more, you’ll save both time and money on Business Intelligence teams.

All in all, this means you can forget about the mundane manual work of extracting data from APIs and feeding it into your analytics platform. With Piwik PRO, you can automate the relevant processes and easily navigate through one comprehensive analytics suite. And you won’t need to rely on costly BI specialists because your marketing team will be able to handle all analyses.

100% data ownership

One of the crucial benefits you’ll obtain from implementing Piwik PRO software is the 100% guarantee that you remain the sole owner of all the data you collect. Not only are you responsible for it, but you have complete control over it.

Ownership in this context means access. You can get the data you need at any time, and you also set rules about who is granted responsibility for data so it doesn’t get into unauthorized hands. The data being in your hands grants you access to performance metrics, advertising spending, and all the other information you need to make your campaigns thrive. Moreover, you get data transparency and can check the overall performance of your ad and content strategies independently without relying on third-party agencies.

However, when it comes to other vendors, things aren’t so easy. For instance, if you use Google Analytics (GA 360) you need to carefully negotiate the terms with Google – otherwise you’re just a licensee of the data you collect. As you can read directly in their policy:

Google and its wholly owned subsidiaries may retain and use, subject to the terms of its privacy policy (located at, information collected in Your use of the Service.

Google will not share Your Customer Data or any Third Party’s Customer Data with any third parties unless Google (i) has Your consent for any Customer Data or any Third Party’s consent for the Third Party’s Customer Data; (ii) concludes that it is required by law or has a good faith belief that access, preservation or disclosure of Customer Data is reasonably necessary to protect the rights, property or safety of Google, its users or the public;

(…)or provides Customer Data in certain limited circumstances to third parties to carry out tasks on Google’s behalf (e.g., billing or data storage)(…)

That’s why it’s vital to check the vendor and their contract terms before you sign up for any plan.

Free Comparison of 4 Enterprise-Ready Customer Data Platforms

Get to know 25 key differences between Tealium, Ensighten, BlueVenn and Piwik PRO to find out which platform fits your business’s needs

Download FREE Comparison

Complete customer journey tracking

Gone are the days when providing excellent customer experience was limited to single interactions. No matter whether you operate in banking, telecom, or healthcare, a rule of thumb is to focus on the customer’s entire journey.

McKinsey has conducted thorough research and concluded that companies that perform best on journeys have a more distinct competitive advantage than those that excel at touchpoints.

The trouble is that only some vendors can handle giving you comprehensive support from A to Z. However, Piwik PRO takes this burden off your shoulders. You don’t need to stitch analytics data with server log exports or any other solutions developed in-house for use in secure member areas such as transaction system sites. You can precisely map the customer journey across all channels and follow user flow without missing any significant data points.

What’s more, Piwik PRO lets you track users’ journeys after they have logged into their account on your website. You can use a tracking code in post-login areas since you have full control of the application. You know where the data goes – into your own servers within secure perimeters.

However important it is to get a thorough understanding of your customers, getting the complete picture of their journey should never compromise their privacy and security. That’s why we stress the significance of data control and choosing reliable solutions. Also, you should consider implementing some extra security measures such as Single-Sign-On, data encryption, or a Change Log to make sure the data is safe and sound.

Issues involving the GDPR, consent and other requirements regarding personal data are too complex to explain in a few words here. Fortunately for you, we’ve covered them in detail in the following posts:

Higher data granularity

To get precise user profiles, advanced segments, and to serve visitors tailored offers, you should look beyond their name, address, birthday or internet browser. Make sure your marketing arsenal has everything it takes to get more granular data than just from server logs.

With Piwik PRO CDP you can integrate data from systems both within your security perimeter and beyond it, such as:

  • CRM, marketing automation software
  • e-commerce
  • banking transactional systems
  • healthcare platforms and portals
  • lead capture forms
  • social media platforms
  • other systems via API or through CSV imports

The more details you have, whether it’s customers’ personal data or their actions across digital channels – likes, scrolls, clicks, subpage visits – the more precise profiles you can create.

Here’s a very simple view of one of the profiles that you can build, which a CDP helps you automatically update as the user clicks further through the site.

Moreover, Piwik PRO self-hosted infrastructure allows you to onboard all this data to your analytics instance and then act on it to empower your marketing initiatives.

Eliminate data silos and create a Single Source of Truth

For many organizations, especially those storing a wealth of data like banks, healthcare or telecoms, the real struggle is that their data sits in silos. This means that data is scattered across HR, sales, marketing, finance and so on. It results in some teams having only limited data at their disposal, so they might be missing some crucial bits.

But you don’t have to follow in their footsteps. You can opt for a vendor which lets you connect all the collections of data scattered across different departments and software. That’s what you can expect from Piwik PRO. Not only will you get rid of the silos, but you’ll be able to build a single source of truth which ensures that every single bit of data is stored only once.

So, why is keeping the same data in various spaces a no-no? It’s inefficient to have the same data twice, as it risks exposure to human error. If a system holds the exact same data in many places, updates will run in multiple places. This leads to you accumulating a surplus of data which requires more storage and more work, for no good purpose.

Applying the single source of truth practice helps organizations evade the possible risk of retrieving outdated, and consequently incorrect, information. No wonder that it’s gaining ground, especially in an enterprise environment where authentic, relevant, and correct data is the fuel.
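
The idea can be sketched in a few lines of Python. In this hypothetical example, the sales and support data sets store only a `customer_id` reference, while the customer’s details live in a single place. All names and records are made up for illustration.

```python
# The single source of truth: each customer's details are stored exactly once.
customers = {
    "c-001": {"name": "Anna Nowak", "email": "anna@example.com"},
}

# Other departments keep a reference to the customer, not a copy of the details.
sales_orders = [{"order_id": "o-1", "customer_id": "c-001", "total": 120.0}]
support_tickets = [{"ticket_id": "t-1", "customer_id": "c-001", "subject": "Card blocked"}]


def contact_email(record):
    """Resolve the current email for any record that references a customer."""
    return customers[record["customer_id"]]["email"]


def update_email(customer_id, new_email):
    """A single update in one place; no departmental copies to keep in sync."""
    customers[customer_id]["email"] = new_email


update_email("c-001", "anna.nowak@example.com")
sales_contact = contact_email(sales_orders[0])
support_contact = contact_email(support_tickets[0])
```

After the single update, both the sales and the support view resolve to the new address, so neither team can end up working with an outdated copy.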

4. What do you need to take into account when onboarding privacy-sensitive data?

Data sovereignty and residency

We’ve already mentioned a serious issue concerning data, namely its ownership, and now it’s time to consider data sovereignty and residency. These are two closely related principles you need to be aware of if your business relies on data.

What’s behind the idea is that data is subject to the laws and governance structures of the country where it’s located. Sounds simple. But when taking into account data transfer, cloud storage, and new approaches like object storage, things get confusing.

Cloud service providers store data all over the globe and move it from one data center to another, whether because of cost or redundancy. This causes multiple issues around data sovereignty and residency.

Numerous countries, including Germany, Australia, Canada, India, and Switzerland, have introduced laws that require data to be stored only within their physical borders. Such practices are called data protectionism. Generally, data localization policies target specific types of data, and different regulations might apply to different types.

Regulatory requirements can go in different directions. For instance, US legislation extends the ability of the federal government and law enforcement agencies to access communications and emails saved in the cloud in certain circumstances.

Storing data is one thing, but transferring it needs a different approach. For instance, the GDPR allows it only if an adequate level of data protection is guaranteed.

Update: As of July 16th 2020, Privacy Shield is no longer a valid legal framework for transferring data from the EU and Switzerland to the US. The situation is evolving fast, though. Here we’ve written about the decision and will provide updates when anything changes. And here we’ve written about how such limitations affect users of Google Analytics.

Financial and healthcare organizations need to stay particularly alert as they hold large volumes of sensitive information. Laws governing the handling of that data often overlap and, worse, are constantly changing.

As data residency and localization laws are cropping up, changing the legal landscape, companies are seeking vendors that help them keep up with these rapid changes and provide support in overcoming compliance challenges. For instance, if you want to stay in line with the GDPR, the best idea is to partner with a vendor that allows you to store data on your own servers within the EU’s borders.
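
A simple internal check can help you spot data sets stored in the wrong place. The sketch below assumes you keep an inventory of data sets with their category and storage country. The policy values, data set names, and the `residency_violations` helper are hypothetical and do not reflect any specific law.

```python
# Hypothetical policy: the countries in which each data category may be stored.
# None means no location restriction. These values are illustrative only.
RESIDENCY_POLICY = {
    "personal": {"DE", "FR", "SE"},
    "aggregated": None,
}

# Hypothetical inventory of where each data set currently lives.
datasets = [
    {"name": "crm_export", "category": "personal", "country": "DE"},
    {"name": "web_sessions", "category": "personal", "country": "US"},
    {"name": "monthly_totals", "category": "aggregated", "country": "US"},
]


def residency_violations(inventory, policy):
    """Return the names of data sets stored outside their allowed countries."""
    violations = []
    for dataset in inventory:
        # A category missing from the policy raises KeyError on purpose,
        # so unclassified data is never silently treated as unrestricted.
        allowed = policy[dataset["category"]]
        if allowed is not None and dataset["country"] not in allowed:
            violations.append(dataset["name"])
    return violations


violations = residency_violations(datasets, RESIDENCY_POLICY)
```

Here only `web_sessions` is flagged, because it holds personal data outside the allowed countries. A real policy would need to come from your legal team and cover transfers as well as storage.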

Personal data, sensitive data, and PII: important definitions and differences

When discussing onboarding personal information, you need to consider the distinctions between PII and personal data, but also sensitive data and information. This is a crucial legal issue since it dictates how to approach, store, and ultimately process it.

The confusion around these terms is mostly related to legislative issues, territorial scope, and also nomenclature. Here are some key considerations.

Firstly, PII stands for Personally Identifiable Information, which is an American legal term for data which may lead to identification of an individual. This means any information that identifies, links, or relates to a person:

  • full name
  • maiden name
  • Social Security number
  • driver’s license number
  • bank account number
  • credit card number
  • address

Even financial and medical data falls under the definition.

As to sensitive PII, according to the United States Department of Homeland Security, this means:

Personally Identifiable Information, which if lost, compromised, or disclosed without authorization, could result in substantial harm, embarrassment, inconvenience, or unfairness to an individual. Sensitive PII requires stricter handling guidelines because of the increased risk to an individual if the data are compromised.

Since this is a complex matter, we recommend familiarizing yourself with a thorough document provided by the Department of Homeland Security:
Handbook for Safeguarding Sensitive PII.

Note that sensitive information is a broad concept and comprises various types of data. While we encourage you to onboard such data, some categories are harder than others. Financial data is much more forgiving than data about health for example. We urge caution in all cases. Knowledge of local regulations about such data is a must.

There’s also non-PII. This is information you can’t use on its own to identify or trace a person. The most common examples would be cookies, IP addresses, and device IDs.

On the other end of the spectrum is the GDPR with its definition of personal data:

Any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.
Article 4(1).

Personal data extends the definition of PII to include even pieces of information like transaction history or posts on social media.

As to sensitive data, under the Regulation it constitutes a special category of personal data that includes information like:

  • ethnic origin
  • political views
  • trade-union memberships
  • genetic data, biometric data
  • data concerning health
  • a person’s sex life or sexual orientation

Details on the definition and processing this data are in Article 4(13), (14) and (15) and Article 9 and Recitals (51) to (56) of the GDPR.
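
One practical way to work with these definitions is to tag every field before you onboard it. The sketch below follows the GDPR categories described above. The field names and the mapping itself are assumptions for illustration, not legal advice; online identifiers such as IP addresses are listed as personal data here because the GDPR definition quoted above mentions online identifiers.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Ordered so that a higher value means stricter handling."""
    OTHER = 0
    PERSONAL = 1
    SPECIAL_CATEGORY = 2


# Hypothetical field names mapped to the categories described in the text.
SPECIAL_CATEGORY_FIELDS = {
    "ethnic_origin", "political_views", "trade_union_membership",
    "genetic_data", "biometric_data", "health_data", "sexual_orientation",
}
PERSONAL_FIELDS = {
    "full_name", "email", "address", "bank_account", "credit_card",
    "transaction_history", "ip_address", "device_id", "cookie_id",
}


def classify(field_name: str) -> Sensitivity:
    """Tag a single field with its sensitivity level."""
    if field_name in SPECIAL_CATEGORY_FIELDS:
        return Sensitivity.SPECIAL_CATEGORY
    if field_name in PERSONAL_FIELDS:
        return Sensitivity.PERSONAL
    return Sensitivity.OTHER


def strictest(field_names) -> Sensitivity:
    """A data set is as sensitive as its most sensitive field."""
    return max((classify(name) for name in field_names), default=Sensitivity.OTHER)


crm_level = strictest(["full_name", "email", "page_title"])
patient_level = strictest(["email", "health_data"])
```

Treating a data set as being as sensitive as its strictest field is a deliberately cautious design choice. It means one health-related column is enough to put the whole import under special-category handling.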

We realize that these are complex issues, so to dive deeper and get a bigger picture you should check out our other publications:


Knowing all the background and the tiny differences between the various definitions of data is of key importance if you want to onboard such data to your analytics platform.

First and foremost, Google doesn’t want you to do it. In their policy you’ll find a clear statement: “You will not and will not assist or permit any third party to, pass information to Google that Google could use or recognise as personally identifiable information.” You can find more information about this matter on their support site.
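
If you do send data to a tool that prohibits PII, one common precaution is to scrub page URLs before they leave your site. The sketch below uses only the Python standard library. The parameter names in `SENSITIVE_PARAMS`, the `REDACTED` placeholder, and the `scrub_url` helper are assumptions for illustration, and a simple pattern like this will not catch every kind of personal data.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# A simple email pattern; it is illustrative and not exhaustive.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical query parameters that should never be forwarded.
SENSITIVE_PARAMS = {"email", "name", "phone"}


def scrub_url(url: str) -> str:
    """Replace likely personal data in the path and query string of a URL."""
    parts = urlsplit(url)
    cleaned_query = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.lower() in SENSITIVE_PARAMS or EMAIL_PATTERN.search(value):
            value = "REDACTED"
        cleaned_query.append((key, value))
    cleaned_path = EMAIL_PATTERN.sub("REDACTED", parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, cleaned_path, urlencode(cleaned_query), parts.fragment)
    )


query_example = scrub_url("https://example.com/thanks?email=anna%40example.com&plan=pro")
path_example = scrub_url("https://example.com/u/anna@example.com/profile?ref=x")
```

In the first example the `email` parameter is redacted while `plan=pro` passes through untouched. In the second, the address embedded in the path is replaced.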

Then there’s Adobe, which recommends avoiding passing PII to your analytics instance, but actually you can’t be sure if it gets there or not.

As you can see, there are many things you need to consider, but you can find a vendor who can help you without putting you at risk. For example, if you’re the owner of the data, you have full control over it, and you know exactly where it’s stored, you can onboard personal information to your analytics tool and get the most out of your marketing initiatives. That’s a scenario possible thanks to Piwik PRO’s hosting offer.

The on-premises option allows you to store data in your company’s cloud subscription with one of our certified cloud providers. You remain the sole owner of the subscription. 

Another approach involves deployment in a private cloud, where you can choose one of two options. One is a dedicated database that shares server resources between clients but keeps analytics data separate. Compared to the private cloud (dedicated hardware), this option saves money and resources while assuring enhanced security for the most important data.

The other option is dedicated hardware that keeps all server resources and analytics data separate. This ensures full separation of servers used in the application to capture and store data and generate reports and metadata but increases costs and implementation time.

All three of these options give you a choice of where you host the data: in 60+ Azure regions, in the European-owned Orange Cloud in France, or with Elastx in Sweden.

Expansion of cloud solutions

The proliferation of cloud computing platforms and applications in enterprises is a fact. The value of these solutions grew from $58 billion in 2013 to $130 billion in 2018. And the forecasts are promising: the sector should reach a value of $160 billion by 2020.

Source: Statista

However, for banks and other financial institutions, switching to a cloud-based solution might not be a priority. That’s because they’re more susceptible to shifting political winds. Also, their internal policies and industry regulations are much more stringent. On top of this are rigid privacy laws, with GDPR taking the lead.

In this case, banking and finance organizations can stay compliant with multiple regulations by knowing exactly where their data is and who has access to it.

Activating data for marketing campaigns

Marketers who want to create campaigns on third-party platforms, like DSPs, must keep in mind one other thing. You can build precise segments based on personal data as long as you’re in control of that data and you do it in a secure environment.

It means that you acquire users’ consent for processing their data, and then you create custom segments on your customer data platform without sharing sensitive information with third-party vendors.
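
As a rough illustration, the sketch below builds a segment only from profiles with marketing consent and exports opaque audience keys instead of personal details. The profile fields, the salt, and both helper functions are hypothetical. They do not represent the Piwik PRO CDP or any DSP’s actual interface.

```python
import hashlib

# Hypothetical profiles held inside your own customer data platform.
profiles = [
    {"customer_id": "c-001", "email": "anna@example.com", "marketing_consent": True, "viewed_loans": True},
    {"customer_id": "c-002", "email": "ben@example.com", "marketing_consent": False, "viewed_loans": True},
    {"customer_id": "c-003", "email": "carl@example.com", "marketing_consent": True, "viewed_loans": False},
]


def build_segment(all_profiles, predicate):
    """Keep only consenting profiles that match the segment rule."""
    return [p for p in all_profiles if p["marketing_consent"] and predicate(p)]


def export_audience(segment, secret_salt):
    """Export salted hashes of internal IDs, never names or emails."""
    return [
        hashlib.sha256((secret_salt + p["customer_id"]).encode("utf-8")).hexdigest()
        for p in segment
    ]


loan_prospects = build_segment(profiles, lambda p: p["viewed_loans"])
audience = export_audience(loan_prospects, secret_salt="replace-with-a-secret")
```

Only the first profile ends up in the segment: the second lacks consent and the third doesn’t match the rule. The exported list contains a single 64-character key and no personal fields.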

Final thoughts

As you can see, onboarding personal and sensitive information to your analytics platform involves a great number of processes and considerations. The key is to have a reliable vendor that helps you with this task by providing you with full control of the data and a secure environment. Add to that the vital functionalities allowing you to act on data and create an effective strategy to grow your brand based on a solid understanding of the customer journey.

We’ve covered the most significant issues, but you surely still have some questions, so don’t wait and reach out to our team to get all the information you need.


Karolina Matuszewska

Senior Content Marketer

Writer and content marketer. Transforms technical jargon into engaging and informative articles.
