Compliance Component¶

The Compliance component handles GDPR data deletion workflows across the Alan platform. It provides a centralized system for managing member-data deletion requests and for coordinating the member-data deletion chain, which is sequenced into phases.

For pipeline diagrams, ownership, and the Ops/Marmot review UI, see the GDPR Orchestrator Notion page. This README focuses on how to onboard a new component.

Model: phases + discriminants. Deletion is sequenced into 5 phases (when an asset is deleted) and gated by discriminants (what another crew reads to decide deletability). Plan your crew's work in phases. "Bucket" is the current code term (ComplianceDataBucketType, --bucket-type); it is being renamed to phase at the cutover. Until then, map your phase to the current bucket value:

| Phase | Name | What | Current bucket value |
|---|---|---|---|
| 1 | Standalone | Leaf data, no discriminant/anchor role | (new, no bucket yet) |
| 2 | Health | Claims + Clinic internals | health_claims, health_services |
| 3 | Affiliation | enrollment, exemption, employment, policy | (new) |
| 4 | Anchors | long legal / accounting / prévoyance retention | prevoyance_claims |
| 5 | Identity | user, global_profile, auth (last of all) | (new) |

Discriminant (reads_as_discriminator in code) = a model that another crew reads to decide deletability → its deletion must block until every reader has run.

Reference (reads_as_reference in code) = a model read for display/enrichment only → warn, don't block.
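The block-versus-warn distinction can be sketched as a small check. This is only an illustration: `check_anchor_deletion` and the producer names are hypothetical and not part of the compliance component, and the README states elsewhere that the real runtime warning is a later increment.

```python
def check_anchor_deletion(
    discriminator_readers: set[str],
    reference_readers: set[str],
    finished_producers: set[str],
) -> tuple[bool, list[str], list[str]]:
    """Hypothetical helper: decide whether an anchor may be deleted.

    Returns (allowed, pending_readers, warnings).
    """
    # Discriminant reads block: every reader must have run first.
    pending = sorted(discriminator_readers - finished_producers)
    # Reference reads only warn: they never change deletability.
    warnings = [
        f"deleting this anchor blanks a field rendered by {reader}"
        for reader in sorted(reference_readers)
    ]
    return (not pending, pending, warnings)


# Producer names below are illustrative only.
blocked = check_anchor_deletion({"health_claims"}, {"clinic"}, set())
allowed = check_anchor_deletion({"health_claims"}, {"clinic"}, {"health_claims"})
print(blocked)
print(allowed)
```

In this sketch a pending discriminant reader makes the result not allowed, while a reference reader only ever contributes a warning.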

🚀 Quick Start for Teams¶

If your team manages user data that needs to be deleted for GDPR compliance, follow these steps:

1. Add Your Component as a Dependency¶

First, update the backend/components/compliance/dependencies.yml file to include your component:

# Path: components/*
dependencies:
  - global_profile
  - your_component_name # Add your component here

This allows the compliance component to import and use code from your component.

Note: there is no dependency injection or autodiscovery. The yml entry is enforced by a flake8 lint (tools/flake8/modular_monolith/dependencies_declared_in_root.py); the actual imports live inline in gdpr_compliance_rules.py:53-58, 91-96. You edit the yml in this step and the inline imports in step 3. If crews report developer-experience friction when implementing GDPR rules, this manual wiring is a candidate for improvement.

2. Implement Required Methods in Your Component¶

Create two methods in your component's business logic:

Method 1: Find Members Ready for Deletion¶

# Example: components/your_component/internal/business_logic/queries/gdpr_deletion.py

from components.compliance.public.entities import BucketEligibilityResult


def get_profiles_ready_for_deletion() -> BucketEligibilityResult:
    """
    Report your component's GDPR eligibility for this bucket.
    We use global profile IDs because user IDs are not unique across apps (FR/ES/BE).

    Returns a BucketEligibilityResult so the orchestrator can distinguish three states:
      - ready_for_deletion: this producer says delete these
      - not_ready_yet:      this producer knows these but they're still in retention
      - OUT_OF_SCOPE:       implicit — any profile NOT in either list

    OUT_OF_SCOPE is what lets the orchestrator combine multiple producers safely:
    a profile a producer knows nothing about will not silently block deletion the way
    the previous list[UUID] contract did via set.intersection.
    """
    # Your business logic here
    # Example:
    # - Check for expired contracts
    # - Verify no pending claims
    # - Ensure retention period has passed

    return BucketEligibilityResult(
        ready_for_deletion=[profile_id_1, profile_id_2, ...],
        not_ready_yet=[profile_id_3, ...],
    )
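To illustrate why the implicit OUT_OF_SCOPE state matters, here is a minimal sketch comparing the two contracts. Everything in it is an assumption made for illustration: `EligibilityResult` is a simplified stand-in for BucketEligibilityResult (the real entity may carry other fields), and the combination rule in `combine` (ready for at least one producer, blocked by none) is a guess at the orchestrator's behaviour, not its actual code.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EligibilityResult:
    """Hypothetical stand-in for BucketEligibilityResult."""

    ready_for_deletion: list[uuid.UUID] = field(default_factory=list)
    not_ready_yet: list[uuid.UUID] = field(default_factory=list)


def combine(results: list[EligibilityResult]) -> set[uuid.UUID]:
    """Assumed rule: deletable = ready for some producer, blocked by none."""
    ready: set[uuid.UUID] = set()
    blocked: set[uuid.UUID] = set()
    for result in results:
        ready.update(result.ready_for_deletion)
        blocked.update(result.not_ready_yet)
    return ready - blocked


def combine_legacy(results: list[list[uuid.UUID]]) -> set[uuid.UUID]:
    """The previous list[UUID] contract: plain set intersection."""
    if not results:
        return set()
    return set.intersection(*(set(ids) for ids in results))


alice, bob, carol = uuid.uuid4(), uuid.uuid4(), uuid.uuid4()
# Producer A knows all three profiles and considers them ready.
producer_a = EligibilityResult(ready_for_deletion=[alice, bob, carol])
# Producer B has never heard of bob (out of scope) and still retains carol.
producer_b = EligibilityResult(ready_for_deletion=[alice], not_ready_yet=[carol])

new_result = combine([producer_a, producer_b])
old_result = combine_legacy([[alice, bob, carol], [alice]])
```

Under these assumptions `new_result` contains alice and bob, while `old_result` contains only alice: with intersection, producer B silently blocks bob just by not knowing about him.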

Method 2: Execute Member Deletion¶

# Example: components/your_component/internal/business_logic/actions/gdpr_deletion.py

import uuid


def delete_member_data(global_profile_id: uuid.UUID) -> None:
    """
    Delete all data related to a specific member.

    This method should permanently delete or anonymize all data
    related to the given global profile ID.
    See Method 1 above for why we use the global profile ID.

    Args:
        global_profile_id: The global profile ID to delete data for
    """
    # Your deletion logic here
    # Example:
    # - Delete member records
    # - Anonymize historical data
    # - Remove PII from logs
    # - Update related tables

    pass

3. Register Your Methods in Compliance Rules¶

Update the compliance rules to include your methods:

The registry dict keys (ComplianceDataBucketType.*) are renamed to phases at the cutover; the shape stays the same.

# File: components/compliance/internal/business_logic/rules/gdpr_compliance_rules.py

# Update the appropriate data bucket with your methods
def get_callable_rules_to_get_profiles_to_record_deletion(
    bucket_type: ComplianceDataBucketType,
) -> list[EligibilityCheck]:
    # Import your method here
    from components.your_component.internal.business_logic.queries.gdpr_deletion import (
        get_profiles_ready_for_deletion as your_component_get_profiles_ready_for_deletion,
    )

    compliance_data_buckets_methods: dict[
        ComplianceDataBucketType, list[EligibilityCheck]
    ] = {
        ComplianceDataBucketType.health_claims: [
            your_component_get_profiles_ready_for_deletion,  # Add your method here if related to health claims
        ],
        ComplianceDataBucketType.health_services: [
            # Add methods for health services if applicable
        ],
        ComplianceDataBucketType.prevoyance_claims: [],
        ComplianceDataBucketType.medical_data: [],
    }
    # ... rest of the method

def get_callable_rules_to_delete_profiles(
    bucket_type: ComplianceDataBucketType,
) -> list[Callable[[Any], list[uuid.UUID]]]:
    # Import your method here
    from components.your_component.internal.business_logic.actions.gdpr_deletion import (
        delete_member_data as your_component_delete_member_data,
    )

    compliance_data_buckets_methods: dict[
        ComplianceDataBucketType, list[Callable[[Any], list[uuid.UUID]]]
    ] = {
        ComplianceDataBucketType.health_claims: [
            your_component_delete_member_data,  # Add your deletion method here if related to health claims
        ],
        ComplianceDataBucketType.health_services: [
            # Add deletion methods for health services if applicable
        ],
        ComplianceDataBucketType.prevoyance_claims: [],
        ComplianceDataBucketType.medical_data: [],
    }
    # ... rest of the method

4. Choose Your Phase + Declare Discriminants/References¶

Pick the phase your data belongs to (see the phase table at the top of this README), then declare which of your models other crews gate on (discriminants, must block) versus merely read for display (references, warn).

Until the cutover renames the enum, register under the matching bucket value:

Phase 2 Health → health_claims (Claims), health_services (Clinic)
Phase 4 Anchors → prevoyance_claims
Phases 1 / 3 / 5 have no bucket value yet; coordinate with the orchestrator owners before registering.

5. Declare your deletion contract (ordering)¶

The orchestrator derives a safe deletion order from declared contracts (see internal/business_logic/rules/discriminator_ordering.py). Each producer declares a DeletionContract over DeletionAnchor models:

owns: the anchors your producer deletes.
reads_as_discriminator (blocks): anchors you read to decide deletability. An anchor must not be deleted until everyone who reads it has run, so a reader is ordered before the owner. An anchor you both own and read (self-read) imposes no cross-producer order.
reads_as_reference (warns): anchors you read only for display/enrichment. Deleting one blanks a field you render but never changes deletability, so references impose no ordering and are not coverage gaps. They are reported by get_reference_reads() so ops can see "deleting anchor X would blank fields for these producers". The runtime warning that fires on actual deletion is a later increment.

compute_deletion_phases() topologically sorts producers into numbered phases (producers within a phase run in parallel). Anchors you read as a discriminator that no producer owns yet are reported by get_unowned_discriminators() as coverage gaps, the signal that an owner crew still needs to declare ownership and wire deletion.

Example: Claims renders a member's address on care summaries but doesn't own it, so it would declare the read as a reference (warn, not block):

HEALTH_CLAIMS_DELETION_CONTRACT = DeletionContract(
    producer=ComplianceDataBucketType.health_claims,
    owns=frozenset(),
    reads_as_discriminator=frozenset({DeletionAnchor.insurance_profile, ...}),
    reads_as_reference=frozenset({DeletionAnchor.address}),  # illustrative
)

The reference anchors (address, iban, contact) are not declared yet — owners add them once verified against real read patterns, each with its DELETION_ANCHOR_TABLES entry (the grounding test enforces real table names).
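The ordering rule from this step (a reader of a discriminator runs before the anchor's owner) can be sketched as a layered topological sort. This is not the code in discriminator_ordering.py: `Contract` and `sketch_phases` are hypothetical simplifications, producers and anchors are plain strings, and the producer names `documents`, `affiliation` and `identity` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical, simplified stand-in for DeletionContract."""

    producer: str
    owns: frozenset[str] = frozenset()
    reads_as_discriminator: frozenset[str] = frozenset()


def sketch_phases(contracts: list[Contract]) -> list[list[str]]:
    """Group producers into phases so that readers run before owners."""
    owner_of = {anchor: c.producer for c in contracts for anchor in c.owns}
    # must_wait_for[p] = producers that have to finish before p may run.
    must_wait_for: dict[str, set[str]] = {c.producer: set() for c in contracts}
    for c in contracts:
        for anchor in c.reads_as_discriminator:
            owner = owner_of.get(anchor)  # None = unowned anchor (coverage gap)
            if owner is not None and owner != c.producer:  # self-reads: no order
                must_wait_for[owner].add(c.producer)

    phases: list[list[str]] = []
    done: set[str] = set()
    while len(done) < len(must_wait_for):
        ready = sorted(
            p for p, deps in must_wait_for.items() if p not in done and deps <= done
        )
        if not ready:
            raise ValueError("cycle between deletion contracts")
        phases.append(ready)  # producers within one phase can run in parallel
        done.update(ready)
    return phases


phases = sketch_phases(
    [
        Contract("documents"),
        Contract("health_claims", reads_as_discriminator=frozenset({"insurance_profile"})),
        Contract(
            "affiliation",
            owns=frozenset({"insurance_profile"}),
            reads_as_discriminator=frozenset({"user"}),
        ),
        Contract("identity", owns=frozenset({"user"})),
    ]
)
print(phases)
```

Reference reads are deliberately absent from the sketch: as described above, they impose no ordering.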

Onboarding a new producer means adding it to the dispatch dicts in gdpr_compliance_rules.py and declaring its DeletionContract in get_deletion_contracts(); a parity test guards the two from drifting. Adding a new producer bucket also means adding the enum member, its dispatch entry, and its contract together, never a bucket with no producer (that would create ops-selectable dead-end records).

Ordering is derived and numbered; the named phases (Standalone / Health / Affiliation / Anchors / Identity) are a communication taxonomy that lives in Notion, not in code.

Workflow Steps¶

1. Identification: Teams implement methods to identify members ready for deletion
2. Record Creation: Deletion records are created for eligible members
3. Batch Creation: Records are grouped into batches for review
4. Review Process: Batches are reviewed and approved/rejected
5. Execution: Approved batches trigger async deletion jobs
6. Completion: Data is permanently deleted across all systems

Key Components¶

GdprDeletionRecord: Individual deletion request for a member. Lifecycle tracked via accepted_for_deletion (None/True/False), deletion_applied_at (timestamp), deletion_error.
GdprDeletionBatch: Collection of deletion records for Ops review. Status: pending → accepted | rejected.
ComplianceProfile: Links global profiles to compliance processes
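As a reading aid for the three lifecycle fields of GdprDeletionRecord, the sketch below derives a single state label from them. It is an assumption-heavy illustration: `RecordSketch` is a hypothetical stand-in that holds only those three fields, and the state names and their precedence are invented here, not taken from the model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class RecordSketch:
    """Hypothetical stand-in with only the lifecycle fields named above."""

    accepted_for_deletion: Optional[bool] = None
    deletion_applied_at: Optional[datetime] = None
    deletion_error: Optional[str] = None


def lifecycle_state(record: RecordSketch) -> str:
    """Assumed precedence: applied > error > review decision."""
    if record.deletion_applied_at is not None:
        return "deleted"
    if record.deletion_error is not None:
        return "failed"
    if record.accepted_for_deletion is None:
        return "pending_review"
    return "accepted" if record.accepted_for_deletion else "rejected"


print(lifecycle_state(RecordSketch()))
print(lifecycle_state(RecordSketch(accepted_for_deletion=True)))
```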

🛠 Available Commands¶

--bucket-type is renamed to --phase at the cutover; values map per the phase table at the top.

Create Deletion Records¶

# Identify and create deletion records for a data bucket
alan compliance create-gdpr-deletion-records-for-data-bucket --bucket-type health_claims

# Dry run mode (recommended for testing)
alan compliance create-gdpr-deletion-records-for-data-bucket --bucket-type health_claims --dry-run

Create Review Batch¶

# Group deletion records into a batch for review
alan compliance create-gdpr-deletion-batch-for-data-bucket --bucket-type health_claims

# Dry run mode
alan compliance create-gdpr-deletion-batch-for-data-bucket --bucket-type health_claims --dry-run

Review and Approve Batches¶

# Programmatically review batches
from components.compliance.internal.business_logic.actions.gdpr_deletion_batch import (
    review_gdpr_deletion_batch
)
from components.compliance.public.enums import GdprDeletionBatchStatus

# Approve a batch
review_gdpr_deletion_batch(
    deletion_batch_id=batch_id,
    reviewed_status=GdprDeletionBatchStatus.accepted,
    reviewed_by="reviewer_name",
    reviewed_reason="All checks passed"
)

🔍 Business Logic Guidelines¶

Finding Members Ready for Deletion¶

Your get_profiles_ready_for_deletion() method should consider:

Retention periods: Legal requirements for data retention
Active relationships: No ongoing contracts or claims
Grace periods: Allow time for member to return
Dependencies: Check for data used by other systems
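A retention check of this kind might look like the sketch below. The function name, the input shape and the five-year figure are all illustrative assumptions; actual retention periods come from legal requirements, not from this example. Treating a profile with an ongoing contract as not ready (rather than out of scope) is also an assumption.

```python
import uuid
from datetime import date, timedelta
from typing import Optional


def classify_by_retention(
    contract_end_by_profile: dict[uuid.UUID, Optional[date]],
    retention: timedelta,
    today: date,
) -> tuple[list[uuid.UUID], list[uuid.UUID]]:
    """Hypothetical helper: split known profiles into (ready, not_ready_yet)."""
    ready: list[uuid.UUID] = []
    not_ready: list[uuid.UUID] = []
    for profile_id, ended_on in contract_end_by_profile.items():
        # None means the contract is still active, so the profile is not ready.
        if ended_on is not None and ended_on + retention <= today:
            ready.append(profile_id)
        else:
            not_ready.append(profile_id)
    return ready, not_ready


expired, recent, active = uuid.uuid4(), uuid.uuid4(), uuid.uuid4()
ready, not_ready = classify_by_retention(
    {expired: date(2019, 1, 1), recent: date(2024, 6, 1), active: None},
    retention=timedelta(days=5 * 365),  # illustrative value only
    today=date(2025, 1, 1),
)
```

The two lists map directly onto the ready_for_deletion and not_ready_yet arguments of BucketEligibilityResult; profiles your component does not know about are simply left out.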

Implementing Safe Deletion¶

Your delete_member_data() method should:

Be idempotent: Safe to call multiple times
Handle errors gracefully: Don't fail the entire batch
Log actions: Track what was deleted for audit
Preserve audit trails: Keep minimal records for compliance
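The first three guidelines can be sketched together as follows. This uses an in-memory dict in place of real storage, and `delete_member_data_sketch` and `delete_batch` are hypothetical names, not functions of the compliance component.

```python
import logging
import uuid
from typing import Callable, Iterable

logger = logging.getLogger(__name__)

# Stand-in for real storage: profile ID -> member data.
fake_store: dict[uuid.UUID, dict] = {}


def delete_member_data_sketch(global_profile_id: uuid.UUID) -> None:
    """Idempotent: deleting an already-deleted profile is a no-op."""
    removed = fake_store.pop(global_profile_id, None)
    if removed is not None:
        logger.info("Deleted member data for profile %s", global_profile_id)


def delete_batch(
    profile_ids: Iterable[uuid.UUID],
    delete_one: Callable[[uuid.UUID], None],
) -> dict[uuid.UUID, str]:
    """Run deletions one by one; a failure is recorded, not propagated."""
    errors: dict[uuid.UUID, str] = {}
    for profile_id in profile_ids:
        try:
            delete_one(profile_id)
        except Exception as exc:  # isolate the failure to this profile
            logger.exception("Deletion failed for profile %s", profile_id)
            errors[profile_id] = str(exc)
    return errors


first, second = uuid.uuid4(), uuid.uuid4()
fake_store[first] = {"name": "example"}
fake_store[second] = {"name": "example"}

delete_member_data_sketch(first)
delete_member_data_sketch(first)  # second call is safe
```

Returning the per-profile errors lets a caller record them (for example in a field such as deletion_error) without aborting the remaining deletions.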

🧪 Testing¶

Writing Tests for Your Integration¶

# Test your deletion identification logic
def test_get_profiles_ready_for_deletion_should_return_eligible_profiles():
    # Create test data
    expired_member = create_expired_member()
    active_member = create_active_member()

    # Test your method
    result = get_profiles_ready_for_deletion()

    # Assertions: the method returns a BucketEligibilityResult, not a bare list
    assert expired_member.global_profile_id in result.ready_for_deletion
    assert active_member.global_profile_id not in result.ready_for_deletion

# Test your deletion logic
def test_delete_member_data_should_remove_all_data():
    # Create test member with data
    member = create_member_with_data()

    # Execute deletion
    delete_member_data(member.global_profile_id)

    # Verify data is deleted
    assert not member_data_exists(member.global_profile_id)

📊 Monitoring and Observability¶

The system automatically logs:

Number of profiles identified for deletion
Batch creation and review status
Job execution status and failures
Deletion completion metrics

🚨 Important Considerations¶

Data Safety¶

Always test in staging first
Use dry-run mode for validation
Implement proper backups before deletion
Consider soft deletion for reversibility

Performance¶

Batch operations efficiently
Implement pagination for large datasets
Consider database locks and transactions
Monitor job execution times
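For the pagination point, one common approach is to process IDs in fixed-size chunks so that each chunk can run in its own short transaction. The `chunked` helper below is a generic sketch, not an existing utility of this codebase, and the chunk size is arbitrary.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield successive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


pages = list(chunked(range(5), 2))
print(pages)  # [[0, 1], [2, 3], [4]]
```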

Legal Compliance¶

Verify retention requirements
Document deletion policies
Maintain audit logs
Handle cross-border data requirements

📞 Support¶

For questions or issues with GDPR deletion integration:

Check the existing implementations in other components
Review the test cases for examples
Consult with the compliance team for legal requirements
Reach out to the platform team for technical guidance
