Data Privacy Laws in 5 Minutes
Data privacy regulations exist because organizations consistently fail to track where sensitive information lives. The technical challenge boils down to inventory management. The operational challenge is that most data teams don’t maintain inventories until regulators demand them.
Understanding the regulatory landscape is straightforward. Implementing compliant systems requires translating legal requirements into operational practice. Here’s what you need to know about the laws that govern personal data, and why your data architecture needs to account for them from day one.
Welcome to The Data Letter 👋🏿 👋🏿👋🏿
I’m Hodman Murad, and I help data teams build reliable, scalable systems.
Regulatory Landscape
GDPR set the gold standard in 2018. If you process data of EU residents, you’re subject to fines up to €20 million or 4% of global annual revenue, whichever is higher. It introduced mandatory breach notifications within 72 hours, the “right to be forgotten,” and the concept of “data minimization.”
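To make the two numbers above concrete, here is a minimal sketch of the fine ceiling and the 72-hour notification clock. The function names are hypothetical, the figures are the ones quoted above, and the result is only an upper bound: actual fines are set case by case.

```python
from datetime import datetime, timedelta, timezone

GDPR_FIXED_CAP_EUR = 20_000_000            # fixed cap quoted above
GDPR_REVENUE_PERCENT = 4                   # 4% of global annual revenue
NOTIFICATION_WINDOW = timedelta(hours=72)  # breach notification window


def max_gdpr_fine(global_annual_revenue_eur: float) -> float:
    """Upper bound on the fine: whichever is higher, the fixed cap or 4% of revenue."""
    revenue_based = global_annual_revenue_eur * GDPR_REVENUE_PERCENT / 100
    return max(GDPR_FIXED_CAP_EUR, revenue_based)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest notification time, counted from when the breach was discovered."""
    return became_aware_at + NOTIFICATION_WINDOW


if __name__ == "__main__":
    # A company with EUR 2 billion in revenue: 4% exceeds the fixed cap.
    print(max_gdpr_fine(2_000_000_000))
    # A breach discovered on a Wednesday morning (UTC).
    print(notification_deadline(datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)))
```

Note that the clock starts when you become aware of the breach, which is why detection speed matters as much as response speed.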
CCPA and CPRA gave California residents similar rights starting in 2020, with CPRA strengthening protections in 2023. The thresholds are low: the law applies to businesses that buy, sell, or share the personal information of 100,000 or more California consumers or households, or that derive 50% or more of their annual revenue from selling or sharing personal information. Most mid-size companies hit these numbers without realizing it.
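A quick way to see whether you are near those numbers is to encode them as a check. This is a rough sketch covering only the two criteria named above; the statute has further conditions that are not modeled here, and `BusinessProfile` and its field names are hypothetical.

```python
from dataclasses import dataclass

CONSUMER_THRESHOLD = 100_000      # California consumers or households
REVENUE_SHARE_THRESHOLD = 0.50    # share of annual revenue from selling or sharing PI


@dataclass
class BusinessProfile:
    ca_consumers_or_households: int
    revenue_share_from_selling_or_sharing: float  # fraction between 0.0 and 1.0


def meets_stated_ccpa_threshold(profile: BusinessProfile) -> bool:
    """True if either of the two criteria quoted above is met."""
    return (
        profile.ca_consumers_or_households >= CONSUMER_THRESHOLD
        or profile.revenue_share_from_selling_or_sharing >= REVENUE_SHARE_THRESHOLD
    )
```

The hard part in practice is the input, not the comparison: counting distinct California consumers across every system is itself an inventory problem.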
HIPAA has been around since 1996, but modern data architectures make compliance harder, not easier. Every system that touches protected health information needs encryption, audit logs, and access controls. Your data lake doesn’t get a pass.
SOX isn’t technically a privacy law, but Section 404 means financial data pipelines need documented controls and audit trails. If your transformation logic changes revenue calculations, you need to prove who changed what and when.
The regulatory landscape extends far beyond these flagship laws. Globally, 144 countries have enacted national data privacy laws, creating a complex web of requirements. Brazil’s LGPD mirrors GDPR principles with fines up to 2% of revenue, while China’s PIPL imposes strict consent and data localization rules with penalties reaching 5% of global turnover. Canada’s PIPEDA has governed commercial data use since 2001, Japan’s APPI has been regularly updated to reflect changing technology, and India’s DPDPA was enacted in 2023 with penalties up to ₹250 crore (approximately $30 million).
In the United States alone, 19 states had passed comprehensive privacy laws as of 2024: California, Virginia, Colorado, Connecticut, Utah, Montana, Oregon, Texas, Iowa, Indiana, Tennessee, Delaware, New Hampshire, New Jersey, Maryland, Minnesota, Kentucky, Nebraska, and Rhode Island. Add sector-specific regulations, such as COPPA for children’s data and GLBA for financial institutions, and you are navigating dozens of overlapping requirements, each with distinct definitions, thresholds, and enforcement mechanisms.
Why Data Teams Should Care
“That’s a legal problem” is the wrong answer. When regulators investigate, they audit your data systems, not your privacy policy. They want to see:
Data inventory: Can you list every system that contains PII?
Lineage tracking: Can you trace personal data from ingestion to deletion?
Access logs: Who queried customer data last month?
Retention policies: Why is data from 2015 still in your warehouse?
Breach detection: How quickly can you identify unauthorized access?
The companies that get fined aren’t necessarily the ones with bad intentions. They’re the ones with zombie DAGs processing data nobody remembers collecting, undocumented ingestion pipelines, and tables nobody owns.
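Several of those questions can be answered by a scheduled check over a table inventory. Here is a minimal sketch, assuming you already have (or can export) one record per table; `TableRecord`, its fields, and the finding labels are hypothetical and not tied to any particular catalog tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass
class TableRecord:
    name: str
    contains_pii: bool
    owner: Optional[str]           # None means nobody owns the table
    oldest_record: date            # date of the oldest row still stored
    retention_days: Optional[int]  # None means no retention policy is defined


def audit_inventory(tables: List[TableRecord], today: date) -> List[Tuple[str, str]]:
    """Return (table name, finding) pairs for PII tables that need attention."""
    findings = []
    for table in tables:
        if not table.contains_pii:
            continue
        if not table.owner:
            findings.append((table.name, "no owner"))
        if table.retention_days is None:
            findings.append((table.name, "no retention policy"))
        elif today - table.oldest_record > timedelta(days=table.retention_days):
            findings.append((table.name, "data older than retention policy"))
    return findings


if __name__ == "__main__":
    inventory = [
        TableRecord("events_raw", True, None, date(2015, 3, 1), 365),
        TableRecord("orders", True, "data-eng", date(2024, 10, 1), 365),
    ]
    for name, finding in audit_inventory(inventory, today=date(2025, 1, 1)):
        print(f"{name}: {finding}")
```

The check is trivial once the inventory exists. Keeping the inventory current is the real work, which is why it belongs in a scheduled job rather than a quarterly spreadsheet.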
Hidden Liability in Your Pipelines
Consider a typical data flow: application databases → event streams → data lake → analytics warehouse → ML features → dashboards. At each stage, PII can multiply:
Raw event logs capture full payloads, including sensitive fields.
ETL jobs join customer tables, creating new combinations of identifiable information.
Feature engineering creates derived PII (zip code, birth year, and gender form a quasi-identifier).
Materialized views persist data longer than retention policies allow.
Failed job runs leave PII in error logs and temporary tables.
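The quasi-identifier point is easy to test on your own feature tables. The sketch below counts how many rows share each combination of zip code, birth year, and gender; a group of size 1 means that combination singles out one person. This group-size check is commonly called k-anonymity, a technique brought in here for illustration, and the column names are assumptions.

```python
from collections import Counter
from typing import Dict, List, Sequence

QUASI_IDENTIFIERS = ("zip_code", "birth_year", "gender")  # assumed column names


def _group_counts(rows: List[Dict], keys: Sequence[str]) -> Counter:
    """Count rows per combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def smallest_group_size(rows: List[Dict], keys: Sequence[str] = QUASI_IDENTIFIERS) -> int:
    """Size of the smallest group; 0 for an empty dataset."""
    counts = _group_counts(rows, keys)
    return min(counts.values()) if counts else 0


def rows_below_k(rows: List[Dict], k: int, keys: Sequence[str] = QUASI_IDENTIFIERS) -> List[Dict]:
    """Rows whose quasi-identifier combination is shared by fewer than k rows."""
    counts = _group_counts(rows, keys)
    return [row for row in rows if counts[tuple(row[key] for key in keys)] < k]


if __name__ == "__main__":
    features = [
        {"zip_code": "94107", "birth_year": 1985, "gender": "F"},
        {"zip_code": "94107", "birth_year": 1985, "gender": "F"},
        {"zip_code": "10001", "birth_year": 1990, "gender": "M"},
    ]
    print(smallest_group_size(features))
    print(rows_below_k(features, k=2))
```

If the smallest group is tiny, the table is effectively identifiable even with names and emails stripped out.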
Your dbt models might be perfectly documented, but what about the ad hoc Python scripts your analysts run? The Jupyter notebooks that never made it to production? The CSV exports sitting in S3 from that one-time analysis in 2022?
When Compliance Breaks Down
Here’s what keeps regulators up at night: companies that can perform complex data transformations but can’t answer basic questions like, “Where is customer data stored?” or, “How do we delete a user’s information?”
The technical sophistication exists. Most data teams use tools that could provide comprehensive lineage, access control, and audit logging. The gap is operational: turning theoretical capability into consistent practice.
When a breach occurs or a regulator investigates, you don’t get time to figure it out. You need runbooks. You need detection mechanisms. You need processes that run whether or not someone remembers to trigger them.
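As one illustration of a runbook that does not depend on memory, here is a sketch of a deletion request that fans out to every registered system and then verifies the result. `DeletionRunbook` and its callbacks are hypothetical; in a real setup each callback would wrap the system's own client, and the in-memory dictionaries in the usage stand in for actual stores.

```python
from typing import Callable, Dict, Tuple

DeleteFn = Callable[[str], object]  # removes a user's data from one system
CheckFn = Callable[[str], bool]     # True if the user's data is still present


class DeletionRunbook:
    """Runs a user deletion across every registered system and reports per system."""

    def __init__(self) -> None:
        self._steps: Dict[str, Tuple[DeleteFn, CheckFn]] = {}

    def register(self, system: str, delete: DeleteFn, still_present: CheckFn) -> None:
        self._steps[system] = (delete, still_present)

    def run(self, user_id: str) -> Dict[str, str]:
        report = {}
        for system, (delete, still_present) in self._steps.items():
            try:
                delete(user_id)
                # Verify instead of trusting that the delete call worked.
                if still_present(user_id):
                    report[system] = "failed: data still present"
                else:
                    report[system] = "deleted"
            except Exception as exc:  # one failing system must not stop the rest
                report[system] = f"error: {exc}"
        return report


if __name__ == "__main__":
    warehouse = {"u1": {"email": "a@example.com"}}
    lake = {"u1": {"events": 42}}

    runbook = DeletionRunbook()
    runbook.register("warehouse", lambda uid: warehouse.pop(uid, None), lambda uid: uid in warehouse)
    runbook.register("lake", lambda uid: lake.pop(uid, None), lambda uid: uid in lake)
    print(runbook.run("u1"))
```

The registry doubles as documentation: if a system is not registered, that is a visible gap rather than a surprise during an investigation.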
Moving from Awareness to Action
Knowing the regulations is essential, but compliance requires more than awareness. The gap between understanding privacy laws and actually implementing them sits in your operational practices: the runbooks, detection mechanisms, and processes that run whether or not someone remembers to trigger them.
The question isn’t whether your team has the technical capability. Most data teams already use tools that could provide comprehensive lineage, access control, and audit logging. The question is whether you’ve translated that capability into consistent practice before a breach forces your hand.
Thanks for reading,
Hodman Murad