
User Lifecycle

The User Lifecycle system ensures every user (Alaner, External User, Service Account) has the correct access across 30+ providers (Google, Slack, AWS, GitHub, etc.) by continuously reconciling target state (our internal data model) with actual state (provider APIs).

It operates through three complementary mechanisms:

flowchart LR
    subgraph Daily
        T[Tasks] -->|per user| R[Reconcile state]
        C[Anomaly Checks] -->|per provider| D[Detect drift]
    end
    subgraph Semi-annual
        Rev[Reviews] -->|all providers| G[Generate spreadsheets]
    end

Tasks

A task is a unit of work that runs for a single user against a single provider. Tasks are executed daily for every user in the system.

Principle

Each task compares the current state of a user in a provider (e.g., does the Google account exist? what roles does it have?) with the target state derived from our internal data model (role grants, role grant rules, employment dates). If a discrepancy is found, the task corrects it.

flowchart TD
    A[Daily batch job] -->|enqueue one job per user| B[process_user_lifecycle_tasks_for]
    B --> C{For each active task}
    C --> D[should_run?]
    D -->|yes| E[run: provision / update roles / deprovision]
    D -->|no| F[skip]
    E -->|success| G[Report to Slack]
    E -->|needs human input| H[Pause + Slack buttons]
    E -->|error| I[Report failure]
    I -->|N consecutive fails| J[Auto-deactivate task]

Task types

| Type | When it runs | What it does |
| --- | --- | --- |
| UserLifecycleOnboardingTask | User has started (with configurable offset per account type) and hasn't ended | Calls provider.provision_user() |
| UserLifecycleOffboardingTask | User has ended | Calls provider.deprovision_user() |
| Custom tasks | Based on should_run() logic | Arbitrary provider operations |

Key behaviors

  • Ranking: tasks are sorted by rank and executed in order (e.g., Google account creation before Calendar provisioning).
  • Pause/Resume: a task can raise PauseTaskException to pause execution and post interactive Slack buttons. An authorized Alaner can then resume or cancel it.
  • Auto-deactivation: after max_fails_before_deactivation consecutive failures (default: 5) within the same batch, the task is automatically deactivated and an alert is posted to #alan_home_alerts.
  • Quiet mode: tasks marked as quiet skip Slack notifications.
  • Dry run: all commands support --dry-run for safe testing.
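The behaviors above can be illustrated with a minimal sketch of a per-user runner. It is not the actual implementation: the `Task` dataclass, `process_tasks_for`, and the returned event list (standing in for Slack messages) are assumptions made for illustration. Only `PauseTaskException`, `should_run`, `run`, `rank`, and `max_fails_before_deactivation` are names taken from this page.

```python
from dataclasses import dataclass
from typing import Callable, List


class PauseTaskException(Exception):
    """Raised by a task that needs human input before continuing."""


@dataclass
class Task:
    # Hypothetical stand-in for a lifecycle task, not the real class.
    id: str
    rank: int
    should_run: Callable[[str], bool]
    run: Callable[[str], None]
    quiet: bool = False
    active: bool = True
    consecutive_fails: int = 0
    max_fails_before_deactivation: int = 5


def process_tasks_for(user: str, tasks: List[Task], dry_run: bool = False) -> List[str]:
    """Run active tasks for one user in rank order and return an event log."""
    events: List[str] = []
    for task in sorted(tasks, key=lambda t: t.rank):
        if not task.active or not task.should_run(user):
            continue
        if dry_run:
            events.append(f"{task.id}: would run")
            continue
        try:
            task.run(user)
        except PauseTaskException:
            # The real system posts Slack buttons here; we only log the pause.
            events.append(f"{task.id}: paused")
            continue
        except Exception:
            task.consecutive_fails += 1
            events.append(f"{task.id}: failed")
            if task.consecutive_fails >= task.max_fails_before_deactivation:
                task.active = False
                events.append(f"{task.id}: deactivated")
            continue
        task.consecutive_fails = 0  # a success resets the failure streak
        if not task.quiet:
            events.append(f"{task.id}: done")
    return events
```

In this sketch a deactivated task is simply skipped on later calls, and quiet tasks still run but emit no success event.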

Adding a new task

Subclass UserLifecycleTask directly (for custom logic), or use UserLifecycleOnboardingTask / UserLifecycleOffboardingTask (for standard provisioning/deprovisioning). Set id, label, channel, rank, and optionally provider, then import it in scheduled_tasks.py. The task auto-registers via subclass discovery.
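As a rough illustration of what such a subclass and its discovery might look like, here is a self-contained sketch. The base class below is a simplified, hypothetical stand-in for the real UserLifecycleTask, and `discover_tasks`, `GrantWikiAccessTask`, the channel name, and the dict-based user are all invented for the example.

```python
from typing import List, Type


class UserLifecycleTask:
    """Hypothetical minimal base; the real class has more fields and behavior."""

    id: str
    label: str
    channel: str
    rank: int = 100
    provider = None

    def should_run(self, user: dict) -> bool:
        raise NotImplementedError

    def run(self, user: dict, dry_run: bool = False) -> None:
        raise NotImplementedError


def discover_tasks() -> List[Type[UserLifecycleTask]]:
    """Collect every imported subclass (recursively), sorted by rank."""
    found: List[Type[UserLifecycleTask]] = []
    stack = list(UserLifecycleTask.__subclasses__())
    while stack:
        cls = stack.pop()
        found.append(cls)
        stack.extend(cls.__subclasses__())
    return sorted(found, key=lambda c: c.rank)


class GrantWikiAccessTask(UserLifecycleTask):
    # Illustrative custom task; all values below are made up.
    id = "grant_wiki_access"
    label = "Grant wiki access"
    channel = "#example-channel"
    rank = 20

    def should_run(self, user: dict) -> bool:
        return user.get("is_active", False) and not user.get("has_wiki", False)

    def run(self, user: dict, dry_run: bool = False) -> None:
        if not dry_run:
            user["has_wiki"] = True
```

Discovery through `__subclasses__()` only sees classes whose module has been imported, which is consistent with the requirement to import the task in scheduled_tasks.py.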

Providers

A UserLifecycleProvider abstracts all interactions with an external service. Tasks and checks delegate to providers rather than calling external APIs directly: tasks decide when to act, providers handle how.

Interface

| Method | Purpose |
| --- | --- |
| get_all_users() | Returns all accounts from the provider, keyed by email. Cached for 20 minutes. |
| provision_user(user, logger, dry_run) | Creates an account for the user |
| deprovision_user(user_id, logger, dry_run) | Removes/suspends the account |
| update_roles(user, logger, dry_run) | Reconciles target vs actual roles. Returns (added, removed). |
| get_target_roles(user) | Computes what provider-specific roles the user should have (from role_mapping) |
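The 20-minute cache on get_all_users() could be implemented in many ways. The sketch below shows one simple time-to-live approach. `CachedAccounts`, its `fetch` callback, and the injectable `clock` are hypothetical names, and the real provider may cache differently.

```python
import time
from typing import Callable, Dict, Optional

CACHE_TTL_SECONDS = 20 * 60  # matches the "cached for 20 minutes" note above


class CachedAccounts:
    """Hypothetical helper: caches a provider's account listing for a fixed TTL."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, dict]],
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[Dict[str, dict]] = None
        self._fetched_at = 0.0

    def get_all_users(self) -> Dict[str, dict]:
        """Return accounts keyed by email, refetching once the TTL has elapsed."""
        now = self._clock()
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

Passing the clock as a parameter keeps the expiry logic testable without sleeping.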

Role mapping

Each provider declares a role_mapping that connects internal RoleDefinition classes to provider-specific identifiers:

class GoogleProvider(UserLifecycleProvider):
    role_mapping = {
        GoogleSuperAdminRole: {"10526070683992065"},      # Google role ID
        GoogleIamAdminRole: {"10526070683992067", "..."},
    }

class JetbrainsProvider(UserLifecycleProvider):
    role_mapping = {JetbrainsPycharmUser: {"PC"}}         # product code

The base get_target_roles(user) method iterates over this mapping, checks which roles the user holds via has_role() (which queries active role grants), and returns the flattened set of provider-specific identifiers.
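A minimal sketch of that iteration, assuming simplified stand-ins for the real classes: `ExampleUser`, `ExampleProvider`, and the identifier strings are invented here, and `has_role()` is reduced to a set lookup instead of a query on active role grants.

```python
from typing import Dict, Set, Type


class RoleDefinition:
    """Stand-in for the internal role definition base class."""


class GoogleSuperAdminRole(RoleDefinition):
    pass


class GoogleIamAdminRole(RoleDefinition):
    pass


class ExampleUser:
    """Hypothetical user whose held roles are stored in memory."""

    def __init__(self, held_roles: Set[Type[RoleDefinition]]) -> None:
        self._held_roles = held_roles

    def has_role(self, role: Type[RoleDefinition]) -> bool:
        # The real has_role() queries active role grants.
        return role in self._held_roles


class ExampleProvider:
    # Placeholder identifiers, not real provider role IDs.
    role_mapping: Dict[Type[RoleDefinition], Set[str]] = {
        GoogleSuperAdminRole: {"id-super"},
        GoogleIamAdminRole: {"id-iam-1", "id-iam-2"},
    }

    def get_target_roles(self, user: ExampleUser) -> Set[str]:
        """Flatten the provider identifiers of every role the user holds."""
        target: Set[str] = set()
        for role, provider_ids in self.role_mapping.items():
            if user.has_role(role):
                target |= provider_ids
        return target
```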

Role reconciliation

update_roles() follows the same pattern across providers:

flowchart LR
    A[get_target_roles] --> C[Set diff]
    B[get_all_users → current roles] --> C
    C --> D[target - current = roles to add]
    C --> E[current - target = roles to revoke]
    D --> F[Provider API calls]
    E --> F
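The set difference at the heart of this flow fits in a few lines. The sketch below is illustrative only: `diff_roles`, `reconcile_roles`, and the `add_role` / `remove_role` callbacks are hypothetical and stand in for provider-specific API calls.

```python
from typing import Callable, Set, Tuple


def diff_roles(target: Set[str], current: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (roles to add, roles to revoke)."""
    return target - current, current - target


def reconcile_roles(
    target: Set[str],
    current: Set[str],
    add_role: Callable[[str], None],
    remove_role: Callable[[str], None],
    dry_run: bool = False,
) -> Tuple[Set[str], Set[str]]:
    """Apply the diff through the given callbacks unless dry_run is set."""
    to_add, to_remove = diff_roles(target, current)
    if not dry_run:
        for role in sorted(to_add):
            add_role(role)
        for role in sorted(to_remove):
            remove_role(role)
    return to_add, to_remove
```

Returning the two sets mirrors the (added, removed) return value described for update_roles(), and in dry-run mode the same sets describe what would have changed.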

How tasks use providers

UserLifecycleOnboardingTask and UserLifecycleOffboardingTask directly delegate to the provider set on the task class:

  • Onboarding: run() calls self.provider.provision_user()
  • Offboarding: resume() calls self.provider.deprovision_user()

Custom UserLifecycleTask subclasses can call any provider method — typically get_all_users() in should_run() to check current state, and update_roles() or custom logic in run().

Anomaly Checks

An anomaly check is a cross-cutting verification that runs once per provider rather than once per user. It detects accounts that shouldn't exist or are in an invalid state.

Principle

Each check queries a provider for all its accounts, compares them against the set of active Alan users, and flags any unknown or unexpected accounts. Anomalies are posted to Slack with interactive buttons to Ignore, Fix (e.g., deprovision), or Justify (via Alan Home).

flowchart TD
    A[run_checks command] -->|enqueue one job per check| B[check.process_run]
    B --> C[find_anomalies: query provider for all accounts]
    C --> D[Compare against active Alan users]
    D --> E[Track anomalies in DB]
    E --> F[Filter out justified anomalies]
    F --> G{Anomalies remaining?}
    G -->|yes| H[Post to Slack thread with Ignore / Fix / Justify buttons]
    G -->|no| I[Done]

Key behaviors

  • Tracking: anomalies are persisted in the ProvisioningAnomaly table. If an anomalous account disappears on the next run, it's automatically marked as resolved.
  • Justification: anomalies can be justified in Alan Home, which excludes them from future reports.
  • Grace period: checks using AccountOffboardingProcess.manual allow a 5-day grace period after a user's last day before flagging their accounts.
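To make the detection, justification, and grace-period rules concrete, here is a small sketch under simplifying assumptions. Users are modeled as a mapping from email to last working day (None while still employed), the grace period is treated as inclusive of the fifth day, and `find_anomalies` and `newly_resolved` are standalone helpers invented for this example, not the real check methods.

```python
from datetime import date, timedelta
from typing import Dict, FrozenSet, List, Optional, Set

GRACE_PERIOD = timedelta(days=5)


def find_anomalies(
    provider_accounts: Set[str],
    users: Dict[str, Optional[date]],
    today: date,
    justified: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return accounts with no active (or recently departed) user behind them."""
    anomalies = []
    for email in sorted(provider_accounts):
        if email in justified:
            continue  # justified anomalies are excluded from reports
        if email in users:
            last_day = users[email]
            if last_day is None or today <= last_day + GRACE_PERIOD:
                continue  # still employed, or within the grace period
        anomalies.append(email)
    return anomalies


def newly_resolved(previously_open: Set[str], current: List[str]) -> Set[str]:
    """Anomalies seen on an earlier run that no longer appear are resolved."""
    return previously_open - set(current)
```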

Adding a new check

Subclass UserLifecycleProvisioningCheck, implement find_anomalies() and fix_anomaly(), then import it in scheduled_tasks.py.
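The shape of such a subclass might resemble the following sketch. The base class here is a hypothetical, stripped-down stand-in for UserLifecycleProvisioningCheck (the real method signatures are not shown on this page), and `UnknownWikiAccountsCheck` with its in-memory account store is invented for illustration.

```python
from typing import Dict, List, Set


class UserLifecycleProvisioningCheck:
    """Hypothetical minimal base; the real class also handles tracking and Slack."""

    id: str

    def find_anomalies(self) -> List[str]:
        raise NotImplementedError

    def fix_anomaly(self, account_id: str, dry_run: bool = False) -> None:
        raise NotImplementedError


class UnknownWikiAccountsCheck(UserLifecycleProvisioningCheck):
    id = "unknown_wiki_accounts"  # illustrative identifier

    def __init__(self, provider_accounts: Dict[str, bool], active_emails: Set[str]) -> None:
        # provider_accounts maps email -> whether the account is enabled.
        self.provider_accounts = provider_accounts
        self.active_emails = active_emails

    def find_anomalies(self) -> List[str]:
        """Enabled accounts that match no active user."""
        return sorted(
            email
            for email, enabled in self.provider_accounts.items()
            if enabled and email not in self.active_emails
        )

    def fix_anomaly(self, account_id: str, dry_run: bool = False) -> None:
        """Suspend the account (the 'Fix' action) unless this is a dry run."""
        if not dry_run:
            self.provider_accounts[account_id] = False
```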

Semi-Annual Reviews

The review system generates Google Sheets for the semi-annual Role & Permission Review, in which designated reviewers audit every provider's accounts.

Commands

| Command | Purpose |
| --- | --- |
| role_review hr_source_of_truth | Exports Alaners and External Users from Alan Home to a reference tab |
| role_review tracked_providers | Generates review tabs for providers with automated integration |
| role_review untracked_providers | Generates review tabs for providers with manual CSV exports |
| role_review get_pending_reviewers | Lists reviewers who haven't completed their review |
| role_review analytics | Returns review progress statistics |

Data pipeline

flowchart LR
    A[Provider API] --> B[prepare_provider_accounts_for_review]
    B --> C[UserAccountForReview]
    C --> D[categorize_observations_for_review]
    D --> E1[Anomalies]
    D --> E2[To review]
    D --> E3[Not to review]
    E1 --> F[Google Sheets tab per provider]
    E2 --> F
    E3 --> F

Each account is categorized based on its AlanMatchStatus:

  • Anomalies: not matched to an active Alaner, or flagged as unexpected
  • To review: matched but has unauthorized roles or unexplained grants
  • Not to review: matched with no issues
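The three buckets can be expressed as a small decision function. This is only a sketch: the values of AlanMatchStatus are not listed on this page, so the example replaces it with plain boolean and set fields, and `ReviewCategory`, `AccountForReview`, and `categorize` are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class ReviewCategory(Enum):
    ANOMALY = "Anomalies"
    TO_REVIEW = "To review"
    NOT_TO_REVIEW = "Not to review"


@dataclass
class AccountForReview:
    # Simplified stand-in for UserAccountForReview.
    email: str
    matched_active_user: bool
    flagged_unexpected: bool = False
    unauthorized_roles: Set[str] = field(default_factory=set)
    unexplained_grants: Set[str] = field(default_factory=set)


def categorize(account: AccountForReview) -> ReviewCategory:
    """Anomalies take precedence, then accounts needing review, then clean ones."""
    if not account.matched_active_user or account.flagged_unexpected:
        return ReviewCategory.ANOMALY
    if account.unauthorized_roles or account.unexplained_grants:
        return ReviewCategory.TO_REVIEW
    return ReviewCategory.NOT_TO_REVIEW
```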

Provider types

  • Tracked (icon: 🤖): providers with automated API integration. Account data is fetched live and matched against Alan users.
  • Untracked (icon: 🖐): providers without API integration. Account data comes from CSVs uploaded to a Notion database.

Scheduled commands reference

| Command | Schedule | Description |
| --- | --- | --- |
| run_tasks | Daily | Enqueues lifecycle tasks for all users |
| run_checks | Daily | Enqueues anomaly checks for all providers |
| update_role_grants | Daily | Reconciles role grants (revoke unknown, update rules, fulfill requests) |
| update_roles | Daily | Syncs role definitions from code to DB |
| update_role_grant_rules | Daily | Syncs role grant rule definitions from code to DB |
| notify_of_role_removals | Daily | Sends Slack DMs for expiring/expired grants |
| alert_disabled_tasks | Daily | Posts alert for auto-deactivated tasks |
| sync_tasks | Periodic | Syncs task/check metadata to Notion |