Data Privacy Provider: European Union

GDPR Compliance

GDPR (General Data Protection Regulation) is the EU's comprehensive data privacy law, effective May 2018, governing how organizations collect, process, and store personal data. For AI systems, GDPR mandates data minimization, purpose limitation, user consent, a right to explanation, and data portability. As of October 2025, GDPR compliance is mandatory for any organization processing EU residents' data, with fines of up to €20M or 4% of global annual revenue, whichever is higher. Key AI-specific concerns: training data provenance, model outputs containing personal information, automated decision-making transparency, and cross-border data transfers. Compliance requires both technical measures (encryption, anonymization) and organizational processes (DPIAs, data processing agreements).


Overview

GDPR establishes strict rules for personal data processing, requiring: (1) a lawful basis for processing (consent, contract, or legitimate interest), (2) data minimization (collect only necessary data), (3) purpose limitation (use data only for the stated purpose), (4) storage limitation (delete data when no longer needed), (5) appropriate security measures, and (6) user rights (access, correction, deletion, portability). For AI: training data must comply with GDPR, and models trained on personal data inherit those compliance obligations. Model outputs may contain personal information requiring protection, and automated decisions need human oversight or an explanation (Article 22). LLMs pose particular challenges: how do you delete specific records from training data, honor the right to be forgotten in a trained model, or provide explanations for a black-box model's decisions?
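The Article 22 obligation above (human oversight for decisions with legal or similarly significant effects) can be sketched as a simple routing gate. The decision categories and score threshold below are illustrative assumptions, not a definitive implementation:

```python
# Sketch of an Article 22-style gate: decisions with legal or similarly
# significant effects are never applied fully automatically.
# The category set and threshold are illustrative assumptions.
SIGNIFICANT_DECISIONS = {"credit_denial", "job_rejection", "benefit_cutoff"}

def apply_decision(decision_type: str, model_score: float) -> str:
    """Route significant automated decisions to a human reviewer."""
    if decision_type in SIGNIFICANT_DECISIONS:
        return "queued_for_human_review"
    return "auto_applied" if model_score >= 0.5 else "auto_rejected"

print(apply_decision("credit_denial", 0.92))      # queued_for_human_review
print(apply_decision("newsletter_segment", 0.92))  # auto_applied
```

The key design choice is that the gate is keyed on the decision's effect on the user, not on model confidence: a high-confidence credit denial still requires human review.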

GDPR Requirements for AI Systems

  • Legal basis: Consent, contract, or legitimate interest for data processing
  • Data Protection Impact Assessment (DPIA): Required for high-risk AI systems
  • Right to explanation: Users can request explanation of automated decisions
  • Data minimization: Collect minimum necessary data for AI training/inference
  • Anonymization: Remove personally identifiable information from datasets
  • Data portability: Users can export their data in machine-readable format
  • Right to be forgotten: Delete user data from systems (including trained models)
  • Cross-border transfers: Special rules for data leaving EU (adequacy decisions, SCCs)
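As a toy illustration of the data-minimization requirement above, a collection layer can whitelist only the fields each stated purpose actually needs. The purpose-to-field mapping and field names here are hypothetical:

```python
# Data-minimization sketch: keep only the fields a purpose requires.
# The mapping and field names are illustrative assumptions.
REQUIRED_FIELDS = {
    "analytics": {"user_id", "event", "timestamp"},
    "training": {"text", "label"},
}

def minimize(record: dict, purpose: str) -> dict:
    """Drop every field not strictly needed for the stated purpose."""
    allowed = REQUIRED_FIELDS.get(purpose, set())
    return {k: v for k, v in record.items() if k in allowed}

record = {"user_id": "u1", "event": "click", "timestamp": "2025-01-01",
          "email": "user@example.com", "ip": "203.0.113.7"}
print(minimize(record, "analytics"))  # email and ip are never stored
```

An unknown purpose maps to an empty whitelist, so nothing is collected by default, which is the safer failure mode under GDPR.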

Technical Implementation

  • Encryption: AES-256 for data at rest, TLS 1.3 for data in transit
  • Anonymization: k-anonymity, differential privacy, synthetic data generation
  • Access controls: Role-based access (RBAC), audit logs, MFA
  • Data retention: Automated deletion after retention period
  • Consent management: Track and enforce user consent preferences
  • Model governance: Version control, lineage tracking, audit trails
  • Pseudonymization: Replace identifiers with pseudonyms for processing
  • Secure enclaves: Process sensitive data in isolated environments (SGX, TEE)
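The anonymization techniques listed above offer different guarantees. As a minimal sketch of differential privacy, the Laplace mechanism adds calibrated noise to an aggregate query; the epsilon and sensitivity values here are illustrative:

```python
import math
import random

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Epsilon-DP count via the Laplace mechanism.

    One user changes a count by at most `sensitivity`, so noise drawn from
    Laplace(0, sensitivity / epsilon) hides any individual's contribution.
    """
    scale = sensitivity / epsilon
    u = random.random() - 0.5  # uniform on (-0.5, 0.5)
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

random.seed(0)  # deterministic seed for the example only
print(dp_count(1000, epsilon=0.5))  # noisy count near 1000
```

Smaller epsilon means stronger privacy but noisier results; production deployments typically use a vetted library rather than hand-rolled sampling.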

Code Example

# GDPR-compliant data handling example
import hashlib
from cryptography.fernet import Fernet
from datetime import datetime, timedelta

class GDPRCompliantDataHandler:
    def __init__(self, encryption_key):
        self.cipher = Fernet(encryption_key)
        self.consent_db = {}  # In production: use proper database
        self.retention_periods = {"analytics": 90, "training": 365}
    
    def pseudonymize(self, user_id: str) -> str:
        """Replace user ID with a stable pseudonym.

        Note: an unsalted hash of a guessable ID is reversible by
        dictionary attack; prefer a keyed hash (HMAC) in production.
        """
        return hashlib.sha256(user_id.encode()).hexdigest()
    
    def encrypt_pii(self, data: str) -> bytes:
        """Encrypt personally identifiable information"""
        return self.cipher.encrypt(data.encode())
    
    def decrypt_pii(self, encrypted_data: bytes) -> str:
        """Decrypt data (only when necessary)"""
        return self.cipher.decrypt(encrypted_data).decode()
    
    def check_consent(self, user_id: str, purpose: str) -> bool:
        """Verify user consent for specific purpose"""
        consent = self.consent_db.get(user_id, {})
        return consent.get(purpose, {}).get("granted", False)
    
    def record_consent(self, user_id: str, purpose: str, granted: bool):
        """Record user consent with timestamp"""
        if user_id not in self.consent_db:
            self.consent_db[user_id] = {}
        self.consent_db[user_id][purpose] = {
            "granted": granted,
            "timestamp": datetime.now().isoformat()
        }
    
    def should_delete(self, data_created: datetime, purpose: str) -> bool:
        """Check if data exceeds retention period"""
        retention_days = self.retention_periods.get(purpose, 365)
        return datetime.now() > data_created + timedelta(days=retention_days)
    
    def export_user_data(self, user_id: str) -> dict:
        """Right to data portability (Article 20)"""
        # Collect all user data from all systems
        return {
            "user_id": user_id,
            "consent_records": self.consent_db.get(user_id, {}),
            "exported_at": datetime.now().isoformat(),
            "format": "JSON"
        }
    
    def delete_user_data(self, user_id: str):
        """Right to be forgotten (Article 17)"""
        # Delete from all systems
        if user_id in self.consent_db:
            del self.consent_db[user_id]
        # In production: also delete from:
        # - Databases, S3 buckets, logs, backups
        # - Training datasets (if feasible)
        # - Notify third parties

# Usage example
key = Fernet.generate_key()
handler = GDPRCompliantDataHandler(key)

# Record user consent
user_id = "user123"
handler.record_consent(user_id, "analytics", True)
handler.record_consent(user_id, "marketing", False)

# Check consent before processing
if handler.check_consent(user_id, "analytics"):
    # Process analytics data
    pseudonym = handler.pseudonymize(user_id)
    encrypted_email = handler.encrypt_pii("user@example.com")
    print(f"Pseudonymized ID: {pseudonym}")

# Export user data (portability)
user_data = handler.export_user_data(user_id)
print(f"Exported data: {user_data}")

# Delete user data (right to be forgotten)
handler.delete_user_data(user_id)
print(f"User {user_id} data deleted")

Penalties & Enforcement

GDPR violations incur significant fines. Tier 1 violations (procedural failures) carry fines of up to €10M or 2% of global annual revenue; Tier 2 violations (serious breaches such as unauthorized processing or inadequate security) carry up to €20M or 4% of global annual revenue, whichever is higher. Notable AI-related fines (as of October 2025) include €746M against Amazon (2021), €90M against Google (2022), and various €10-50M fines for facial recognition, automated decision-making without consent, and inadequate data protection. Enforcement is handled by national Data Protection Authorities (DPAs). Companies should conduct DPIAs, maintain documentation, appoint a DPO for large-scale processing, and implement privacy by design.

GDPR vs AI Act

GDPR (2018) governs personal data processing and applies to all systems handling EU residents' data. The EU AI Act (2024) regulates high-risk AI systems regardless of data type. Overlap: both require transparency, documentation, and human oversight for automated decisions. Differences: GDPR focuses on data rights, while the AI Act focuses on system safety and accountability. AI systems must comply with both: GDPR for training data and user information, the AI Act for high-risk applications (employment, credit scoring, law enforcement). The combined compliance burden is significant but necessary for EU market access.

Professional Integration Services by 21medien

21medien offers GDPR compliance services for AI systems including Data Protection Impact Assessments (DPIA), technical implementation (encryption, anonymization, consent management), documentation, and audit preparation. Our team helps organizations achieve GDPR compliance while building effective AI systems. For detailed GDPR guidance, see our blog post: GDPR Compliance for AI Systems (post #03). Contact us for compliance consulting.

Resources

GDPR official text: https://gdpr.eu | EU AI Act: https://artificialintelligenceact.eu | DPA list: https://edpb.europa.eu | Blog post: /en/blog/gdpr-compliance-ai-systems
