What is PII, non-PII, and personal data? [UPDATED]

Written by Karolina Lubowicka, Karolina Matuszewska, Małgorzata Poddębniak

Published August 06, 2024

SUMMARY

  • PII is mainly applied in the US but lacks a single legal definition, while personal data has a legal meaning defined by the GDPR in the EU. Both types of data cover information that could directly or indirectly reveal an individual’s identity.
  • PII can be divided into a few categories, the most common of which are linked (e.g. name, SSN) and linkable (e.g. job position, non-specific age) information. Other types of PII include sensitive and non-sensitive PII.
  • Personal data includes any information relating to an identified or identifiable person, including online identifiers like cookies and IP addresses.
  • Companies must take measures to protect PII and personal data. Legal requirements are getting stricter, requiring organizations to closely examine the data they collect and stay compliant with changing regulations.

Personally identifiable information (PII) and personal data are two classifications of data that often confuse organizations that collect, store and analyze such data. Both terms cover common ground, classifying information that could reveal an individual’s identity directly or indirectly. 

PII is used in the US, but no specific legal document defines it. The legal system in the United States is a blend of numerous federal and state laws and sector-specific regulations, all of which describe and classify different pieces of information under the PII umbrella. 

On the other hand, personal data has one legal definition established by the General Data Protection Regulation (GDPR), which is accepted as law across the European Union (EU).

But why is all that so important? As a website admin, app creator or product owner, you need to be aware that visitors and users could share sensitive information with you. These traces might enable you to identify individuals, so you must handle such data carefully. From a legal standpoint, it could be a matter of breaches and violations with serious consequences. Grasping the bigger picture is crucial for your organization’s security and legal compliance.

What is personally identifiable information (PII)?

Personally identifiable information (PII) is often referenced by US government agencies and non-governmental organizations. However, as the US lacks one overriding law about PII, the legal definition of the term may vary from jurisdiction to jurisdiction and state to state.

The most common definition is provided by the National Institute of Standards and Technology (NIST), which states that:

PII is any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information.

However, the line between PII and other kinds of information is blurry. As stressed by the US General Services Administration, the “definition of PII is not anchored to any single category of information or technology. Rather, it requires a case-by-case assessment of the specific risk that an individual can be identified”.

What pieces of information are considered PII?

According to NIST, PII can be divided into linked and linkable information.

Linked information

Linked information can also be defined as direct identifiers. 

Direct identifiers are unique to a person and can be used to identify an individual. A single direct identifier is typically enough to determine someone’s identity.

Here are the examples of linked information:

[Image: examples of linked information]

Linkable information

Linkable information concerns indirect identifiers or quasi-identifiers. They may not be able to identify a person on their own, but identification becomes possible when combined with another piece of information. For example, research shows that 87% of US citizens could be identified based on just their gender, ZIP code and date of birth. De-anonymization and re-identification techniques typically work when multiple sets of quasi-identifiers are connected and can be used to distinguish one person from another.

Here are some examples of PII that can be considered linkable information:

[Image: examples of linkable information]
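To make the re-identification risk of quasi-identifiers more concrete, here is a minimal Python sketch that counts how many records in a data set are unique on the combination of gender, ZIP code and date of birth. The records, field names and helper functions are made up for illustration; this is not a method described by NIST or by the research cited above.

```python
from collections import Counter

# Hypothetical, made-up records: no field names a person directly.
records = [
    {"gender": "F", "zip": "02139", "dob": "1985-03-14"},
    {"gender": "F", "zip": "02139", "dob": "1985-03-14"},
    {"gender": "M", "zip": "02139", "dob": "1990-07-01"},
    {"gender": "F", "zip": "10001", "dob": "1985-03-14"},
]

# Assumed set of quasi-identifiers to check in combination.
QUASI_IDENTIFIERS = ("gender", "zip", "dob")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_rows(rows, keys=QUASI_IDENTIFIERS):
    """Return rows whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


at_risk = unique_rows(records)
print(f"{len(at_risk)} of {len(records)} records are unique on {QUASI_IDENTIFIERS}")
```

A record that is the only one with its combination of values can be singled out by anyone who knows those three facts about a person, even though no name or ID number is stored.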

Sensitive vs. non-sensitive PII

Though more of a customary than regulatory distinction, we can also differentiate between sensitive and non-sensitive examples of PII.

Sensitive PII

Sensitive PII is information that can directly identify an individual and could result in harm to them if a data breach occurs. 

Sensitive PII is typically not publicly available. Many data privacy laws require organizations to safeguard it by encrypting it, controlling who accesses it and taking other security measures.

Examples of sensitive PII include:

  • Unique identification numbers, such as driver’s license numbers, social security numbers (SSN), passport numbers and other government-issued ID numbers.
  • Biometric data, such as fingerprints and retinal scans.
  • Financial information, including bank account numbers and credit card numbers.
  • Medical records.
  • Electronic and digital account information, such as email addresses and internet account numbers.
  • Employee personnel records.
  • Password information.
  • School identification numbers.

Non-sensitive PII

Non-sensitive PII is information that may or may not be unique to an individual person. This type of data can be transmitted without being encrypted, and disclosure of it will not cause harm to the individuals that the data concerns. 

Non-sensitive PII tends to be publicly available – for example, phone numbers can be listed in a phone book. 

Some data privacy regulations don’t require the protection of non-sensitive PII, but companies should still employ safeguards to limit the risks to individuals. 

Examples of non-sensitive PII include:

  • A person’s full name.
  • Mother’s maiden name.
  • Social media nickname.
  • Telephone number.
  • IP address.
  • Place of birth.
  • Date of birth.
  • Geographical details (ZIP code, city, state, country, etc.).
  • Employment information.
  • Email address or mailing address.
  • Race or ethnicity.
  • Religion.

How PII differs from PHI

Protected health information (PHI) includes information used in a medical context that can identify patients. PHI is a subset of PII that refers explicitly to information processed by HIPAA-covered entities. When health information is combined with a personal identifier, the data becomes PHI.

Identifiers recognized by HIPAA include:

[Image: examples of PHI]

While protecting PII is mandated only in some instances, PHI is subject to strict confidentiality requirements that protect patient privacy and don’t apply to most other industries.

The HIPAA Privacy Rule ensures that PHI is shared and used only with patient permission or to coordinate patient care and services between covered entities. Organizations covered by HIPAA, such as healthcare providers, hospitals, insurers and their business associates, must follow strict rules specifying the types of PHI they can collect from individuals, disclose to others, or use for marketing purposes.

Learn more about PHI and how to protect it to comply with HIPAA: PHI and PII: How they impact HIPAA compliance and your marketing strategy.

What is non-PII?

Non-personally identifiable information (non-PII) is data that cannot be used on its own to trace or identify a person.

Examples of non-PII include, but are not limited to:

  • Aggregated statistics on the use of a product or service.
  • Partially or fully masked IP addresses.
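As an illustration of partial IP masking, the following Python sketch zeroes out the host part of an address with the standard `ipaddress` module. The `mask_ip` helper and the prefix lengths (24 bits for IPv4, 48 bits for IPv6) are assumptions chosen for the example; how much of the address you need to remove depends on your own risk assessment.

```python
import ipaddress


def mask_ip(address: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero out the host part of an IP address, keeping only the network prefix."""
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


print(mask_ip("203.0.113.77"))         # last octet zeroed
print(mask_ip("2001:db8:abcd:12::1"))  # only the first 48 bits kept
```

The masked value still supports coarse analysis, such as traffic per network, while no longer pointing at a single device.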

However, the classification of PII and non-PII is vague. Moreover, NIST doesn’t reference cookie IDs and device IDs, so many AdTech companies, advertisers, and publishers consider them non-PII. As we’ll see, this is in contrast to the definition of personal data, which treats such digital trackers as information that could identify an individual.

What is personal data?

Personal data is a legal term that the GDPR defines as the following:

Article 4(1):

‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

This definition applies to a person’s name and surname, as well as details that could identify that person. That’s the case when, for instance, you’re able to identify a visitor returning to your website with the help of a cookie or login information.

Under GDPR, cookies can also count as personal data, according to:

Recital 30:

Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.

The definition of personal data covers many kinds of information. Essentially, it’s any information relating to an identified or identifiable person, directly or indirectly.

What is non-personal data?

Following the GDPR provisions, non-personal data is data that won’t let you identify an individual. The best example is anonymous data. According to:

Recital 26:

The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.

Other examples of non-personal data include, but are not limited to:

  • Generalized data, such as age range.
  • Information gathered by government bodies or municipalities, such as census data or tax receipts collected for publicly funded works.
  • Aggregated statistics on the use of a product or service.
  • Partially or fully masked IP addresses.
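The idea of generalizing and aggregating data can be sketched in a few lines of Python. In this example, exact ages are replaced with ten-year ranges and only a count per range is kept. The `age_range` helper, the bucket width and the sample ages are assumptions made for illustration.

```python
from collections import Counter


def age_range(age: int, width: int = 10) -> str:
    """Map an exact age to a coarse bucket such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical exact ages collected from users.
ages = [23, 27, 31, 35, 38, 41]

# Aggregated statistic: number of users per age range, with exact ages dropped.
users_per_range = Counter(age_range(a) for a in ages)
print(dict(users_per_range))
```

Keep in mind that very small groups, such as a range containing a single user, can still point to an individual, so the bucket width needs to fit the size of the data set.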

Collecting anonymous data allows companies to gain useful analytics insights into user behavior without accessing personal data – this is possible with Piwik PRO Analytics Suite. 

Learn more about anonymous data collection with Piwik PRO: Anonymous tracking: How to do useful analytics without personal data.

How PII differs from personal data

Personal data encompasses a broader range of contexts than PII. In general, all PII is considered personal data, but not all personal data is PII. For example, attributes such as religion, ethnicity, sexual orientation or medical history can be categorized as personal data but not PII.

Identification of individuals:
  • PII: Often used to differentiate one individual from another.
  • Personal data: Includes any information related to a living individual, whether it distinguishes them from another individual or not.

Type of term:
  • PII: Not a legal term, but commonly used in business.
  • Personal data: Legal term defined by the GDPR.

Legal coverage:
  • PII: Featured in various laws on different governmental and organizational levels.
  • Personal data: Covered by a single set of laws created and administered by a governing body.

Regulated information:
  • PII: May regulate only specific kinds of information privacy and data access depending on the line of business, government department, etc.
  • Personal data: Regulates all facets of information privacy and use, from medical to commercial to personal.

Territorial application:
  • PII: Most commonly applied in the US.
  • Personal data: Most commonly applied in the EU.

Legal characteristic:
  • PII: Each organization or government provides specific laws and their enforcement.
  • Personal data: The term provides a unified approach to data security and privacy enforcement.

Approach to individuals’ rights:
  • PII: Individual rights vary depending on the regulation. May or may not cover all potential individual rights regarding data.
  • Personal data: Under GDPR, individuals have a number of rights regarding their personal data, such as the right to be informed, the right of access, the right to rectification, the right to erasure (to be forgotten), the right to restrict processing and the right to data portability.

While there is no single federal law governing the collection and use of PII, several legal documents and industry standards define its scope.

Furthermore, both governmental and non-governmental organizations regulate the proper use of PII, including:

  • The Federal Trade Commission (FTC) and its Bureau of Consumer Protection.
  • Local Departments of Consumer Affairs.
  • The Federal Communications Commission (FCC).
  • The National Institute of Standards and Technology (NIST).
  • The Network Advertising Initiative (NAI), a self-regulatory organization.

Personal information (PI) and the CCPA

Personal information (PI) is used in the context of the California Consumer Privacy Act (CCPA). The CCPA establishes a very broad definition of personal information, which continues to function in the California Privacy Rights Act (CPRA).

The California law defines PI as:

Information that identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.

However, this doesn’t include information that has been made publicly available by the local, state, or federal government.

Here are the examples of personal information under CCPA:

[Image: examples of personal information under CCPA]

As you can see, California classifies data like device IDs, cookies, IP addresses, and even aliases and account names as personal information.

Why should you protect PII and personal data?

Access to PII allows businesses to tailor product or content recommendations to their customers’ preferences. However, the growing volume of PII accumulated by organizations is attracting the attention of cybercriminals. Failure to protect users’ data leaves organizations exposed and at risk of attacks.

Stolen data containing PII can cause extensive harm to individuals. With just a few bits of an individual’s personal information, thieves can create false accounts in the person’s name, take out loans, create a falsified passport or sell a person’s identity to a criminal. 

As organizations collect, process and store PII, they must also accept responsibility for protecting this sensitive data. After all, data breaches can happen to organizations of all sizes and industries – from major credit reporting bureaus to small banks – and the impacts on the organization are often the same.

A data breach can severely damage users’ trust in an organization, ultimately ruining its reputation and hindering business results. Not to mention the risk of non-compliance with privacy regulations, which can additionally lead to heavy fines. 

According to IBM’s Cost of a Data Breach 2023 Report, the average cost of a data breach caused by a ransomware attack was USD 5.13 million. As ESG’s report states, the amount of sensitive data is believed to have doubled in the period between 2021 and 2024. Moreover, around half of organizations believe that this data is not sufficiently secure. 

Given the many risks associated with data breaches, protecting PII and personal data is essential. Companies must navigate a complex IT and legal landscape to prevent future attacks and maintain scalable data protection frameworks. 

Aside from preventing breaches, emphasizing data security and privacy helps boost customer loyalty and trust, and futureproofs tech investments against evolving requirements.

How to protect PII

The main source of guidance on ways to secure PII is NIST’s Guide to Protecting the Confidentiality of Personally Identifiable Information.

PII protection methods resemble those mentioned by the GDPR for personal data, such as:

  • Data minimization.
  • Privacy impact assessment.
  • Data encryption.
  • Data anonymization.

Not all PII needs to be protected equally. The necessary safeguards to apply will depend on the following factors:

  • How easily the PII can be tied to specific individuals.
  • The number of individuals whose PII is stored in the system.
  • The sensitivity of the data.
  • The context of how the data will be used, stored, collected or disclosed.
  • Legal obligations to protect the data.
  • The location of the data and level of authorized access to it.

NIST explains further that the protection of PII requires a combination of measures, such as “operational safeguards, security controls and safeguards related to privacy.”

Here is a breakdown of the recommended measures:

Operational safeguards

The protection of PII starts at an organization’s operational level. It involves creating and establishing detailed policies and procedures for managing PII. Some safeguards involve training employees about data breach risks and best practices for handling and protecting PII.

Privacy-specific safeguards

Privacy-specific safeguards help businesses follow the data minimization principle and let them use and maintain data without risking its confidentiality. Protecting PII confidentiality requires certain mechanisms, such as anonymizing data and de-identifying information, for example with encryption.

Security controls

NIST offers recommendations regarding security controls for protecting PII. They include:

  • Access enforcement – such as granting access to the data based on the user’s role.
  • Remote access – prohibiting or restricting remote access to PII and, where remote access is allowed, ensuring communication is encrypted.
  • Separation of duties – for example, users who handle de-identified PII shouldn’t also work in positions that grant them access to the information needed to re-identify the records.
  • Information system monitoring – monitoring PII for unusual or suspicious transfers or events.
  • Least privilege – making sure that users have access to only the data they need.
  • Audit review, analysis and reporting – regularly reviewing and analyzing information system audit records to spot any unusual activities affecting PII.
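Several of these controls can be illustrated together. The Python sketch below combines role-based access enforcement, least privilege and a simple audit trail. The roles, field names, `AuditLog` class and `read_pii` function are all hypothetical and are not taken from NIST’s guide. A real system would rely on its platform’s access control and logging features.

```python
from dataclasses import dataclass, field

# Hypothetical roles and the PII fields each one is allowed to read.
ROLE_PERMISSIONS = {
    "support_agent": {"name", "email"},
    "billing_clerk": {"name", "card_last4"},
    "analyst": set(),  # works on de-identified data only (separation of duties)
}


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, user, role, granted, denied):
        self.entries.append(
            {"user": user, "role": role, "granted": sorted(granted), "denied": sorted(denied)}
        )

    def denied_attempts(self):
        """Entries worth reviewing: requests that asked for more than the role allows."""
        return [e for e in self.entries if e["denied"]]


def read_pii(record, user, role, requested, log):
    """Return only the requested fields the role may see, and log the attempt."""
    allowed = ROLE_PERMISSIONS.get(role, set())  # unknown roles get nothing
    granted = set(requested) & allowed
    denied = set(requested) - allowed
    log.record(user, role, granted, denied)
    return {name: record[name] for name in granted if name in record}


# Made-up customer record with placeholder values.
customer = {"name": "Jane Doe", "email": "jane@example.com",
            "ssn": "000-00-0000", "card_last4": "4242"}
log = AuditLog()
view = read_pii(customer, "alice", "support_agent", ["name", "ssn"], log)
print(view)
print(log.denied_attempts())
```

In this sketch, the support agent receives the name but not the SSN, and the denied request stays in the log for later review.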

How to protect personal data

The GDPR sets out guidelines for protecting personal data. The most important ones include:

The principle of lawfulness

Most importantly, GDPR requires having a clear and valid reason for collecting and using personal data. The reason for processing data must be based on necessity. It could be, for instance, fulfilling a contract requirement or providing a service.

The principle of integrity and confidentiality

One of the key ways to protect data is to ensure its security, though GDPR doesn’t exactly say what this security should look like in practice. The choice of safeguards will differ between organizations. For instance, a hospital with sensitive information about its patients will take different steps than a blogger with a newsletter.

Data protection by design

Data protection by design means adopting technical and organizational measures in the initial design phases of processing operations.

Examples of these measures include:

  • Pseudonymization – replacing or removing information within a data set that enables identification of a specific person, using methods such as encryption, scrambling or masking.
  • Anonymization – removing personal information from a data set to make it impossible to identify a particular person.
  • Monitoring of data processing.
  • Adding new privacy features and improving existing ones.

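Under GDPR, pseudonymization alone does not make data anonymous. Pseudonymized data that could be attributed to a person with the use of additional information is still treated as personal data.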
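The difference between the two techniques can be shown with a short Python sketch. Here, pseudonymization is done with a keyed hash (HMAC-SHA256), while the anonymization step simply drops identifying fields. The key, the record and both helper functions are assumptions made for illustration. The GDPR does not prescribe any specific algorithm.

```python
import hashlib
import hmac

# Assumption: the key is stored separately from the data set, under stricter access control.
SECRET_KEY = b"replace-with-a-key-from-a-secrets-manager"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize(record: dict, identifying_fields=("email", "name")) -> dict:
    """Drop identifying fields entirely instead of replacing them."""
    return {k: v for k, v in record.items() if k not in identifying_fields}


# Made-up record for illustration.
record = {"email": "jane@example.com", "name": "Jane Doe", "plan": "pro"}

pseudonymized = {
    **record,
    "email": pseudonymize(record["email"]),
    "name": pseudonymize(record["name"]),
}
anonymized = anonymize(record)

print(pseudonymized["email"][:16], "...")
print(anonymized)
```

The same input and key always produce the same pseudonym, so records can still be linked, and anyone holding the key can recompute the pseudonym for a known email address. Dropping fields is also only a first step towards anonymization, because the remaining attributes may still work as indirect identifiers.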

Data protection by default

Data protection by default is based on the principles of data minimization and purpose limitation. 

Following the data minimization principle, data should be “adequate, relevant and limited to what is necessary”.

Purpose limitation means you specify your processing purpose, document it and inform individuals about this purpose before any processing starts. 

Organizations that act in line with those principles will collect only the minimum amount of data possible and keep it for only as long as necessary to fulfill the purpose for which it was collected.
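One way to picture these principles in code is a purpose registry that lists which fields each processing purpose needs and how long the data may be kept. The Python sketch below is a simplified illustration. The purposes, field lists and retention periods are invented for the example and are not values set by the GDPR.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical purposes, the fields each one needs, and how long data is kept.
PURPOSES = {
    "newsletter": {"fields": {"email"}, "retention": timedelta(days=365)},
    "order_delivery": {"fields": {"name", "address", "email"}, "retention": timedelta(days=90)},
}


def minimize(submitted: dict, purpose: str) -> dict:
    """Keep only the fields that the documented purpose requires."""
    allowed = PURPOSES[purpose]["fields"]
    return {k: v for k, v in submitted.items() if k in allowed}


def is_expired(collected_at: datetime, purpose: str, now: Optional[datetime] = None) -> bool:
    """Tell whether data has been kept longer than the purpose allows."""
    now = now or datetime.now(timezone.utc)
    return now - collected_at > PURPOSES[purpose]["retention"]


# A signup form that submits more than the newsletter needs.
form = {"email": "jane@example.com", "phone": "555-0100", "date_of_birth": "1985-03-14"}
print(minimize(form, "newsletter"))

collected = datetime(2023, 1, 1, tzinfo=timezone.utc)
print(is_expired(collected, "order_delivery", now=datetime(2024, 1, 1, tzinfo=timezone.utc)))
```

Filtering at the point of collection means that fields without a documented purpose never reach storage, and the retention check gives a scheduled cleanup job a clear rule to apply.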

Data protection impact assessment

GDPR requires performing a data protection impact assessment (DPIA) when processing is likely to pose a high risk to individuals. DPIAs help organizations lower risk by recognizing and mitigating possible threats.

Consider running a DPIA when you are:

  • Using a new technology.
  • Processing sensitive data on a large scale.
  • Doing systematic monitoring of public areas.

PII, non-PII and personal data: Staying up to date on data privacy regulations

The broad definitions of PII and personal data are evolving to cover more and more kinds of data. The differences between the two are also becoming less distinct. The legal requirements are getting stricter on both sides of the Atlantic.

Those changes will bring new challenges. For organizations of all kinds, this means taking a closer look at the data they collect and keeping up with the changing legal landscape to stay compliant.

If you want to learn more about how Piwik PRO helps you safely collect and analyze PII and personal data, reach out to us.
