Raw data vs. processed (aggregated) data

Raw data provides analytical flexibility but requires infrastructure, technical expertise, and resources. Most organizations choose processed data by default without fully understanding the tradeoffs.

Raw data refers to the most basic and unorganized form of data. It provides a comprehensive dataset for thorough analysis and offers flexibility to apply different filters and visualizations to derive new insights. However, raw data requires significant time, technical expertise, and computational resources to transform into actionable insights. Handling large datasets can be particularly resource-intensive in advanced analytics.

Processed data is raw data that has been transformed or analyzed to make it easier to interpret. It comes pre-aggregated and ready to use, but lacks the detail of raw data due to its condensed nature. Once data is aggregated, you can’t disaggregate it to ask different questions.
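To make the one-way nature of aggregation concrete, here is a minimal Python sketch. The event records, field names, and helper functions are all hypothetical and exist only for illustration; a real analytics export will have a different schema. It shows that a per-page summary can be built from raw events, but a question asked later (views per device, or unique users per page) can only be answered by going back to the raw records.

```python
from collections import Counter

# Hypothetical raw event log: one record per pageview (illustrative data only).
raw_events = [
    {"user": "u1", "page": "/pricing", "device": "mobile"},
    {"user": "u1", "page": "/pricing", "device": "mobile"},
    {"user": "u2", "page": "/pricing", "device": "desktop"},
    {"user": "u2", "page": "/blog", "device": "desktop"},
    {"user": "u3", "page": "/blog", "device": "mobile"},
]


def pageviews_by(events, field):
    """Aggregate raw events into a pageview count per value of `field`."""
    return dict(Counter(event[field] for event in events))


def unique_users_by(events, field):
    """Count distinct users per value of `field`; needs user-level rows."""
    users = {}
    for event in events:
        users.setdefault(event[field], set()).add(event["user"])
    return {key: len(members) for key, members in users.items()}


# The "processed" view: pageviews per page, fixed at aggregation time.
processed = pageviews_by(raw_events, "page")

# New questions asked later. Both require the raw events, because the
# per-page totals in `processed` no longer carry device or user fields.
by_device = pageviews_by(raw_events, "device")
users_per_page = unique_users_by(raw_events, "page")

print(processed)
print(by_device)
print(users_per_page)
```

If only `processed` had been stored, the last two results could not be reconstructed from it: the totals say how many views each page received, not who generated them or on which device.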

When raw data is necessary

You need raw data if you run advanced analytics or machine learning models, conduct detailed user-level analysis for product development, must comply with audit requirements that demand granular data, or integrate analytics data with other systems like CRMs or data warehouses.

Processed data works if you primarily track high-level KPIs, have a small analytics team without data engineering resources, or need quick reporting without complex analysis.

Common mistake: Organizations export raw data “just in case” without having the infrastructure to actually use it. Raw data sitting unused in a warehouse such as BigQuery still incurs storage fees.

Key differences

State and organization: Raw data is unorganized and in its original form. Processed data is cleaned, organized, and summarized for immediate use.

Effort and resources: Raw data requires significant time and resources to process and analyze. Processed data is readily interpretable and easier to work with right away.

Detail and completeness: Raw data is complete and comprehensive, allowing for thorough analysis and the ability to ask questions you didn’t anticipate. Processed data is condensed and may lack detailed information needed for certain analyses.

Flexibility: Raw data offers flexibility for various analyses and reporting needs you haven’t yet defined. Processed data is tailored for specific interpretations determined at aggregation time.

Learn more about the use cases for raw data analytics from this post: How to use raw data in web analytics.

