Collecting and analyzing user data is essential to healthcare businesses that want to build relationships with prospects, better meet their patients’ needs, and gain authority within the industry.
As a healthcare organization subject to HIPAA, you’re walking a fine line when trying to improve the patient experience and ensure your activities are HIPAA-compliant.
Vendors have been adjusting to the shifting privacy-oriented analytics landscape and their clients’ expectations. Many of them change their offers accordingly. At the same time, the dominant analytics vendors are not necessarily the most compliant options for healthcare providers.
In this article, we will show you the analytics vendors and implementations available on the market and explore their advantages and shortcomings concerning HIPAA compliance.
Due to HIPAA’s strict privacy and security regulations, you need to evaluate the compliance of every analytics tool in your marketing stack. But the deeper you dig into what you need to do, the harder it might be to make sense of it.
The stakes are high – breaches of HIPAA may result in heavy penalties and damage to your brand’s reputation. An example is a recent lawsuit against the University of Iowa Hospitals & Clinics, alleging that the healthcare organization unlawfully disclosed personal information to Facebook via the use of tracking pixels.
When it comes to web analytics platforms and HIPAA, your approach depends on whether you collect protected health information (PHI) through your site or app. Data that isn’t considered PHI is outside the scope of HIPAA.
Sharing PHI for marketing and analytics is not a permitted disclosure under the HIPAA Privacy Rule. To legally send PHI to your analytics platform, you must sign a business associate agreement (BAA) with the vendor. The BAA specifies each party’s responsibilities regarding PHI and ePHI and establishes a legally binding relationship.
Many vendors don’t want to sign BAAs. In this case, you must remove all identifiers from the data to use their services, so that it’s no longer considered PHI. But the process of de-identification is long and complicated.
For one thing, HIPAA views many types of URLs as PHI. It would be hard to de-identify all URLs, and doing so would make your analytics unusable. For example, de-identification would negatively impact remarketing and user-based or service-based reporting.
On the other hand, cherry-picking URLs containing PHI would also be difficult, mainly because of how much sites change over time.
The sunset of Universal Analytics further complicates the situation – switching to GA4 entails high costs and commitment of resources, and it requires your team to learn how to use a completely new tool. But GA4 won’t be the optimal solution for an organization seeking to align its operations with HIPAA.
Google’s forced migration from UA to GA4 is an excellent opportunity to evaluate your analytics setup. If you need to comply with HIPAA, you have to look around and explore the available options.
Google Analytics continues to be the most widely used enterprise analytics platform.
There are several ways to implement Google Analytics (GA4). Let’s look into them and analyze how HIPAA-compliant they are.
This setup is not HIPAA-compliant.
First, the standard implementation involves combining GA4 with client-side Google Tag Manager. However, this option is far from being HIPAA-compliant.
Organizations covered by HIPAA can’t disclose PHI to tracking technology vendors – this includes sharing and using PHI for marketing purposes. Google uses all data within its systems to develop new services, improve existing offerings, and create personalized advertising experiences. Using a covered entity’s PHI for Google’s scale of operations can be a severe violation of HIPAA’s Privacy Rule.
Google also stores all tracked data in databases located around the world and offers neither on-premise hosting nor bespoke data residency services. Thus, covered entities cannot control where their patient data is stored. HIPAA sees this as a breach of accountability.
Plus, Google clearly states in its documentation that it doesn’t want you to keep PHI in Google Analytics:
“Customers who are subject to HIPAA must not use Google Analytics in any way that implicates Google’s access to, or collection of, PHI, and may only use Google Analytics on pages that are not HIPAA-covered.”
You must make an extra effort to avoid passing any trace of PHI to your analytics or switch to an analytics platform that will help you process patient data with the proper safeguards.
When using client-side GTM, the user’s browser communicates directly with third parties, making it challenging to control the shared information. Depending on how your website or app processes user information, there might be a risk of PHI being shared in HTTP requests.
At this point, it’s worth clarifying that not all health information acquired by organizations is considered PHI. For example, in most cases, phone numbers, email addresses, or social security numbers alone are not PHI. But if this data is connected with details about a health condition, treatment plan, or other particular health information, it would transform from PII into PHI.
The recent HHS bulletin elaborates on when data may qualify as PHI. For example, it states that personally identifiable information (PII), such as an email or IP address, is considered PHI, even if it doesn’t include specific treatment or billing information. When a covered entity, such as a health clinic, collects such information, it indicates that the individual has received or will receive health care services or benefits from the covered entity.
HHS guidance states that authenticated pages will likely contain many forms of PHI, making them subject to HIPAA. It also lists some situations when unauthenticated pages include PHI. These include registration pages where an individual creates a login, or pages addressing specific symptoms or health conditions. The bulletin also mentions that mobile apps contain PHI provided by the app user and their devices, such as geolocation or device ID.
You can’t set GA4 tags on any pages that may fit the definitions provided in the HHS bulletin.
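One way to enforce this rule is a gate that runs before any GA4 tag is allowed to fire. The sketch below is illustrative only: the `Page` fields and the function name are hypothetical, and which of your pages actually match the bulletin's categories is a question for your legal team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical page description, filled in by your CMS or router."""
    path: str
    requires_login: bool = False           # authenticated area, e.g. a patient portal
    creates_login: bool = False            # registration page where a login is created
    covers_health_condition: bool = False  # addresses specific symptoms or conditions


def may_load_ga4_tag(page: Page) -> bool:
    """Return True only if the page matches none of the PHI-prone categories."""
    return not (
        page.requires_login
        or page.creates_login
        or page.covers_health_condition
    )
```

A page that passes this gate is not automatically free of PHI. The flags only encode the categories named in the bulletin, so they need to be set conservatively.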
This setup is not HIPAA-compliant.
Another implementation involves using server-side Google Tag Manager (ssGTM) with GA4.
Since you’re not allowed to send PHI to Google Analytics, you must strip all PII/PHI from the data before sending it to GA4.
Server-side GTM, when properly set up, helps you control what data you share with Google. User data is only sent to the server hosting the GTM container rather than being shared with multiple third-party servers. You can remove any PII within the server container before passing the data on to marketing partners.
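As a rough illustration of that removal step, the sketch below drops known identifier fields from an event before it is forwarded. It is plain Python rather than GTM template code, and the field names are assumptions about your event schema, not part of any Google API.

```python
from typing import Any

# Assumed field names; adjust to whatever your own event schema uses.
IDENTIFIER_FIELDS = {"ip_address", "device_id", "email", "phone", "user_id"}


def strip_identifiers(event: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the event without the known identifier fields."""
    return {k: v for k, v in event.items() if k not in IDENTIFIER_FIELDS}
```

This only handles top-level fields that you already know about. Identifiers embedded in URLs, page titles, or free-text values pass through untouched.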
However, you’ll face two types of issues with this implementation.
The ssGTM issue involves the de-identification process being complicated and prone to errors.
HIPAA’s Privacy Rule provides two methods by which health information can be designated as de-identified:
- The Expert Determination method consists of a person with appropriate knowledge and experience applying statistical or scientific principles to determine that the risk is very small for the information to be used or combined to identify an individual.
- The Safe Harbor method includes removing all 18 types of HIPAA identifiers. Additionally, the covered entity must have no actual knowledge that the remaining information could be used alone or in combination to identify an individual.
The Privacy Rule no longer protects de-identified health information created following these methods because it does not fall within the definition of PHI.
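Under Safe Harbor, some fields are not deleted outright but reduced to a coarser value. As we read the rule, dates may be kept as a year only, and a ZIP code may keep its first three digits only where that three-digit area is populous enough. The sketch below assumes the caller supplies the set of qualifying prefixes, and the function names are hypothetical. Verify the details against the HHS de-identification guidance before relying on it.

```python
from datetime import date


def generalize_date(d: date) -> int:
    """Keep only the year, dropping the more specific date elements."""
    return d.year


def generalize_zip(zip_code: str, populous_prefixes: set[str]) -> str:
    """Keep the first three ZIP digits only for sufficiently populous areas.

    `populous_prefixes` is a caller-supplied set of three-digit prefixes
    that meet the population threshold; every other prefix becomes "000".
    """
    prefix = zip_code[:3]
    return prefix if prefix in populous_prefixes else "000"
```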
That said, it’s unlikely that you’ll be able to strip all PHI.
IP addresses and device IDs can be easily removed with ssGTM. However, URLs are harder to de-identify because you collect a page URL and title on every visit. The title can contain sensitive information, like the doctor’s name and specialization or a patient’s name, and the URL can carry search parameters or link decorations.
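To show why URL de-identification is fiddly, here is a minimal sketch using Python's standard `urllib.parse`. The allowlisted query parameters and the sensitive path prefixes are assumptions for illustration, and a real site would need a list that is kept up to date as pages change.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed allowlist: only these query parameters are treated as harmless.
SAFE_QUERY_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}
# Assumed path prefixes whose remainder may name a doctor or a condition.
SENSITIVE_PREFIXES = ("/doctors/", "/conditions/")


def sanitize_url(url: str) -> str:
    """Redact sensitive path remainders and drop non-allowlisted parameters."""
    parts = urlsplit(url)
    path = parts.path
    for prefix in SENSITIVE_PREFIXES:
        if path.startswith(prefix):
            path = prefix + "redacted"
            break
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query) if k in SAFE_QUERY_PARAMS]
    )
    # Drop the fragment as well, since it can carry link decorations.
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))
```

For example, `sanitize_url("https://clinic.example/doctors/jane-doe-oncology?q=chemo&utm_source=news#top")` returns `https://clinic.example/doctors/redacted?utm_source=news`. The page title would need equivalent treatment, which this sketch does not cover.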
There are also issues with de-identifying custom dimensions, variables, and event attributes that you assign PHI to. For example, you may track a healthcare app and collect a custom event when someone clicks on a doctor’s image. The event collects the doctor’s name and specialization, which may lead to uncovering the individual’s health issue, thus making this data PHI.
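For custom events, an allowlist is usually safer than a blocklist, because new attributes are dropped by default until someone reviews them. The sketch below assumes hypothetical event and attribute names. It is not tied to any particular tag manager.

```python
from typing import Any, Optional

# Hypothetical allowlist: event name -> attributes that may be forwarded.
ALLOWED_ATTRIBUTES: dict[str, set[str]] = {
    "image_click": {"image_position"},
    "page_view": {"page_type"},
}


def filter_event(name: str, attributes: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Drop unknown events; keep only allowlisted attributes of known ones."""
    allowed = ALLOWED_ATTRIBUTES.get(name)
    if allowed is None:
        return None
    return {k: v for k, v in attributes.items() if k in allowed}
```

With this allowlist, a click on a doctor's image still registers as an `image_click`, but the doctor's name and specialization never leave the server container.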
The above-described problem is caused by PHI not being properly filtered out before getting passed to GA4.
Another aspect is the GA4 issue. Google’s terms of service for Google Analytics state:
“You will not and will not assist or permit any third party to pass information, hashed or otherwise, to Google that Google could use or recognize as personally identifiable information, except where permitted by, and subject to, the policies or terms of Google Analytics features made available to You, and only if, any information passed to Google for such Google Analytics feature is hashed using industry standards.”
As a result, you can’t send PII to GA4 – and PHI is a subset of PII.
Some people say that you can still safely analyze such data in GA4, and these terms don’t apply because:
- You can host ssGTM on the HIPAA-compliant infrastructure of your choice.
- If you de-identify data, it’s no longer considered PHI.
But there is a lot at stake here. As a HIPAA-covered entity, consult your legal team before implementing this option.
Find out more about Google Analytics and HIPAA: Is Google Analytics HIPAA-compliant?
This setup may be HIPAA compliant if you take certain steps.
Another option involves combining ssGTM with BigQuery and a data visualization tool.
This type of setup will only be affected by the ssGTM issue with the difficult de-identification process. But this problem can be mitigated when you work with a HIPAA-compliant data collection tool.
For example, you can set up ssGTM with different tech, including a data collection system, and transfer events directly into BigQuery. With this setup, the data would never be sent to Google Analytics servers and only be recorded in BigQuery, which is HIPAA-compliant. You can store the raw data and access it with a BI tool such as Looker Studio or Tableau.
Streaming events from server-side GTM to BigQuery is relatively straightforward. However, as the setup is just a simple event stream, it lacks the processing that analytics tools have. Because of this, things like user-level fields are missing. However, it’s always possible to process the data later once it hits BigQuery.
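As an example of such later processing, the sketch below derives a user-level field (sessions per visitor) from raw event rows. In practice you would more likely express this in SQL inside the warehouse. The Python version only illustrates the logic, and the 30-minute timeout and the `(visitor_id, timestamp)` row shape are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold


def count_sessions(events: list[tuple[str, datetime]]) -> dict[str, int]:
    """Count sessions per visitor ID from (visitor_id, timestamp) rows."""
    by_visitor: dict[str, list[datetime]] = defaultdict(list)
    for visitor_id, ts in events:
        by_visitor[visitor_id].append(ts)

    sessions: dict[str, int] = {}
    for visitor_id, stamps in by_visitor.items():
        stamps.sort()
        count = 1
        # A gap longer than the timeout starts a new session.
        for prev, cur in zip(stamps, stamps[1:]):
            if cur - prev > SESSION_TIMEOUT:
                count += 1
        sessions[visitor_id] = count
    return sessions
```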
Cons:
- Loads of maintenance needed, which leads to inflated data team costs.
- De-identification will most likely be necessary with ssGTM, depending on downstream technologies’ compliance with HIPAA. It’s a complex and time-consuming process that requires stricter organizational measures.
- ssGTM lacks transparency – there is no way for end-users to monitor or make decisions about data processing.
Pros:
- A lot of talent on the market is proficient at using Google’s products and can support your implementation.
- The setup with ssGTM and BigQuery is quite popular.
- You have the flexibility of your own data warehouse.
Adobe is the second-biggest enterprise analytics player on the market.
Adobe offers a few products that can help you improve healthcare experiences while protecting patient privacy:
- Adobe Analytics (AA) is an analytics and reporting solution that monitors user traffic and interactions across various marketing channels. AA is highly customizable, offers flexible segmentation, and lets you see your whole conversion funnel and create predictive insights. The downside to using Adobe Analytics is its complexity. It’s hard to implement and costly since it will typically require the services of a specialized agency. It has a steep learning curve, hence there aren’t many experienced users on the market.
- Adobe Customer Journey Analytics (CJA) lets you connect and normalize cross-channel data into actionable profiles, explore the customer journey in full context and apply AI-driven insights to deliver personalized experiences at scale. CJA is closer in type and capabilities to GA4. Concerning HIPAA, CJA can easily identify and secure PHI and PII, apply access rules, and create data use audits so you can feel confident that patient data is being handled in a compliant way.
- Adobe Launch is a tag management system and part of Adobe Experience Platform.
- Adobe Real-Time Customer Data Platform (CDP) connects customer data from all your channels into unified profiles that support discovering insights and delivering personalized experiences.
So, do Adobe’s products help you comply with HIPAA?
Providing PHI to Adobe is compliant only if it concerns a HIPAA-ready service, following the license agreement and BAA between Adobe and its client. To see which Adobe services qualify, check Adobe’s list of HIPAA-ready products.
Two analytics setups have been implemented on the market using Adobe’s products:
This setup is not HIPAA compliant.
Adobe Analytics is not listed as HIPAA-ready on Adobe’s site. It means that Adobe won’t sign a BAA with you to use AA. As a result, you are not permitted to create, receive, maintain, or transmit PHI through Adobe Analytics.
This setup is HIPAA compliant.
Adobe CJA is on the HIPAA-ready list, so you can safely use it as a HIPAA-covered entity and send PHI to it. This setup can be complemented with Adobe CDP for audience creation and activation.
However, since the only way to achieve HIPAA compliance with Adobe is by using CJA, note that this tool’s main advantage is integration with other components in the Adobe Experience Platform. By itself, CJA is far less advanced than AA.
Cons:
- You are faced with high implementation and subscription costs.
- Adobe’s analytics products are difficult to learn and use.
- You risk single-vendor lock-in due to the number of other tightly integrated products offered by Adobe.
Pros:
- You can sign a BAA.
- You get an all-in-one analytics solution.
Piwik PRO gives you accuracy, flexibility, and complete control when collecting and analyzing customer data.
Most importantly, we’ve created our suite of products with privacy and security in mind to reduce compliance issues. Because of that, we can easily support your analytics use cases in healthcare.
Here is an overview of our modules:
- Analytics allows you to analyze the customer journey across websites and apps. You can use advanced analytics features like funnels, user flows, customizable reports and dashboards. And you can always extend the platform’s capabilities through custom development and integrations. You can use raw data exports to send data to any destination. Increased security features allow you to use Analytics in sensitive industries, like healthcare.
- Tag Manager lets you quickly create, test, and deploy tags from customizable templates. You gain greater flexibility in collecting and utilizing your data through smooth integration with other Piwik PRO modules.
- Customer Data Platform (CDP) enhances your ability to act on the insights you draw from your data. You can better understand your customers, provide more personalized experiences, and improve your campaigns.
- Consent Manager is an optional addition for increased transparency, allowing you to collect, manage, and store user consents.
We included HIPAA-related features and controls when developing our product and security policies.
As a result, all of our modules help you comply with HIPAA. We will sign a BAA with you, allowing you to send all types of PHI to your analytics setup. This means you don’t need to take on the cumbersome task of de-identifying PHI.
Piwik PRO also helps you comply with the HHS bulletin on the use of tracking technologies.
Piwik PRO automatically collects some PHI, like sub-state location, page URLs and IP addresses. Other PHI may be collected if you set it as a user ID or custom event.
Additional key features of Piwik PRO that strengthen HIPAA compliance include:
- Hosting on HIPAA-compliant Microsoft Azure data centers, where you can choose the specific location of your data.
- ISO 27001 and SOC 2 type II certifications.
- Encryption of ePHI when the data is at rest and in transit.
- Advanced user-permission options that let you put PHI only in the hands of authorized personnel.
- Not sharing ePHI with third parties or reusing it for other purposes.
- Regular privacy and security audits undertaken by external, independent bodies to ensure the highest level of security measures.
Recommended ways for you to implement Piwik PRO modules include:
This setup is HIPAA-compliant.
With this option, you can safely collect and analyze PHI and ePHI while respecting the highest privacy and security safeguards. You can analyze the customer journey across all channels, control data collection and adjust it to your needs, and you get to activate the data to improve the patient experience.
This setup is HIPAA-compliant.
This is a point solution for marketers, combining the capabilities of analytics and activation. You can connect our suite of products with a data warehouse via scheduled raw data exports or API, allowing you to extend the platform’s data analysis functionalities.
Learn more about How to make your website compliant with HIPAA using Piwik PRO.
Cons:
- You are using tools from one vendor only.
- There is a client-side tag manager.
Pros:
- You can sign a BAA.
- You get an all-in-one analytics solution.
- The costs are low.
- CDP is available for server-side profile activations.
- You have the ability to use Piwik PRO as an analytics endpoint in server-side tracking, which improves data collection, accuracy and control.
- The modules are easy to learn and use thanks to the similarity to the Universal Analytics interface.
This setup may be HIPAA compliant if you take certain steps.
Combining tools from different vendors can get complex. You need to assess your needs very well, understand what each tool offers, and check how it can help you comply with HIPAA.
Generally, your analytics setup should include the following tools:
Data collection system + data warehouse + data visualization tool
We list some popular data collection systems below and link to the relevant information regarding their HIPAA compliance. Aside from that, you will need to verify their specific HIPAA compliance yourself.
A data collection system facilitates the process of data collection, subsequently enabling you to perform data analysis on the information.
There are different flavors of data collection tools. For example, we can distinguish between customer data platforms (CDPs) like Segment and behavioral data platforms (BDPs) like Snowplow.
But data collection systems can get even more complex. For example, Tealium now dominates as a CDP, though it started as an enterprise-grade tag management solution and then went into the data streaming category.
All in all, these vendors offer more than just pure tracking, meaning you need to make a separate assessment of your needs and how these tools can fulfill them.
- Snowplow – no need for a BAA in the self-hosted version (it’s not certain whether the vendor would sign a BAA for Cloud)
A data warehouse holds data that is extracted, loaded, and transformed from one or more operational source systems and modeled to enable data analysis and reporting in your business intelligence (BI) tools.
- Google Cloud Platform (such as Google BigQuery)
- Microsoft Azure (such as Azure Synapse Analytics)
- Amazon Web Services (such as Amazon Redshift)
A data visualization tool enables the visual representation of data, allowing for the effective extraction of actionable insights from the data.
- Piwik PRO (data collection, visualization, and CDP) + data warehouse (data copy for science team) + Looker Studio or Tableau (broad data visualization)
- Adobe CJA + CDP + AEP (data collection, activation, and visualization)
- Rudderstack (data collection, CDP) + data warehouse + data visualization tool
Most data collection vendors allow for GA4 as a destination, so the flow can also look like this:
A data collection system + GA4
However, this setup won’t do the trick for HIPAA-covered entities because it falls under the GA4 issue.
Cons:
- You need to review the HIPAA compliance of each vendor – analyze security and privacy, manage and negotiate cooperation with all three selected vendors, sign a BAA with each of them, etc.
- The connection between the systems may not be seamless – changes or API updates in each of those vendors may break your setup.
- You would require a data analyst or database expert to manage and maintain pipelines.
- The costs are very high – you need to pay for implementation, licensing of multiple vendors, and maintenance.
Pros:
- You benefit from diversification of vendors, meaning no vendor lock-in.
- You can combine the benefits and features of each system you implement.
As you can see, despite Google Analytics being the most popular analytics solution on the market, it is full of risks and complications for HIPAA-covered entities. The other options that we’ve discussed offer much more certainty.
Ultimately, we’ve combined the viable solutions discussed above and compared their most important features in the table below:
| Feature | Piwik PRO | Adobe | ssGTM + data warehouse + data visualization tool | Mix of vendors |
|---|---|---|---|---|
| Ease of implementation | | | | |
| Diversification of vendors | | | | |
| Covers more use cases than “analytics+activation for healthcare” | | | | |
| Raw data access | | | | |
If you want to learn more about how Piwik PRO can support you in being HIPAA-compliant, reach out to us.