Orchestrator architecture

1. Purpose

deploy.ps1 is the single entry point for rebuilding the agent-intake solution from a clean tenant into a validated lab or first customer deployment. It coordinates the schema, solution shell, identity and records setup, maker surface checks, reviewer app provisioning, policy hydration, smoke testing, and optional seeded test data in one repeatable workflow.

The orchestrator is intentionally idempotent. Re-running it after a partial failure resumes from the current state, so the operator does not have to tear down and start over.

2. Stages

flowchart TD
    A[Stage 0: Preflight] --> B[Stage 1: Schema]
    B --> C[Stage 2: Solution shell]
    C --> D[Stage 3: Identity and records]
    D --> E[Stage 4: Maker surface]
    E --> F[Stage 5: Reviewer app]
    F --> G[Stage 6: Policy hydration]
    G --> H[Stage 7: Smoke test]
    H --> I{SeedTestData?}
    I -- No --> J[Complete]
    I -- Yes --> K[Stage 8: Seed test data]
    K --> L[Seeded smoke final gate]
    L --> J[Complete]

Stage 0 - Preflight

Validates PowerShell, Python, PAC CLI, Azure CLI, and the required PowerShell modules.
Verifies Python packages needed by the child scripts.
Establishes Azure CLI authentication based on -AuthMode.
Creates or selects the PAC auth profile for the target environment.
Confirms the Dataverse endpoint is reachable by calling WhoAmI.
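
The tool check in the first bullet can be illustrated with a minimal Python sketch. The executable names in `REQUIRED_TOOLS` and the `missing_tools` helper are assumptions made for illustration; the real preflight in deploy.ps1 may probe different names, versions, or modules.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed executable names for PowerShell, Python, PAC CLI, and Azure CLI.
REQUIRED_TOOLS = ("pwsh", "python", "pac", "az")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be located, preserving the input order.

    `which` is injectable so the check can be exercised without touching PATH.
    """
    return [tool for tool in tools if which(tool) is None]
```

A caller would treat a non-empty result as a preflight failure and stop before any stage touches the environment.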

Stage 1 - Schema

Runs create_fsi_intake_dataverse_schema.py against the target environment.
Refreshes docs/dataverse-schema.md from the same schema source.
Verifies all nine tables, key columns, and the global option sets before continuing.

Stage 2 - Solution shell

Runs provision_solution_shell.ps1.
Verifies the unmanaged solution shell, environment variable definitions, and standard connection references.
Accepts the Graph custom-connector API ID through AGENT_INTAKE_GRAPH_CUSTOM_CONNECTOR_API_ID when the customer already knows it.

Stage 3 - Identity and records

Runs setup_purview_retention_label.py, which uses the PowerShell wrapper on Windows.
Runs setup_agent_identity_blueprint.py and setup_entra_agent_id.py --check-consent.
Reports Purview delegated read limits and tenant feature-gating for Agent ID as warnings rather than failures, provided the child scripts fall back cleanly.

Stage 4 - Maker surface

Runs provision_power_pages.ps1.
Checks whether PAC CLI can see the agent-intake site.
Emits guidance prefixed with MANUAL STEP REQUIRED: when PAC CLI hits a classic Power Pages gap that still has to be completed in the maker portal.

Stage 5 - Reviewer app

Runs provision_reviewer_app.ps1.
Verifies the reviewer solution, app module, and reviewer security roles.

Stage 6 - Policy hydration

Reads templates/policy-lookup-tables.yaml.
Writes the current values for the fsi_intake_* environment variables.
Uses environment overrides when available and otherwise falls back to derived values or documented placeholders.
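
The precedence described above (override, then derived value, then placeholder) can be sketched in Python as follows. This is an illustration under stated assumptions, not the orchestrator's actual code: `PLACEHOLDER`, `resolve_value`, and `plan_writes` are hypothetical names, and the `fsi_intake_*` variable names in the usage note are made up.

```python
from typing import Iterable, Mapping

# Hypothetical placeholder text; the documented placeholders may differ.
PLACEHOLDER = "<set-me>"


def resolve_value(
    name: str,
    overrides: Mapping[str, str],
    derived: Mapping[str, str],
    placeholder: str = PLACEHOLDER,
) -> str:
    """Prefer an explicit override, then a derived value, then the placeholder."""
    if overrides.get(name):
        return overrides[name]
    if derived.get(name):
        return derived[name]
    return placeholder


def plan_writes(
    names: Iterable[str],
    current: Mapping[str, str],
    overrides: Mapping[str, str],
    derived: Mapping[str, str],
) -> dict:
    """Return only the variables whose resolved value differs from the current one."""
    writes = {}
    for name in names:
        target = resolve_value(name, overrides, derived)
        if current.get(name) != target:
            writes[name] = target
    return writes
```

For example, if `fsi_intake_a` already holds its override value, `plan_writes` leaves it out of the result, so a re-run writes nothing for that variable.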

Stage 7 - Smoke test

Runs smoke_test.ps1 unless -SkipSmoke is specified.
Fails the deploy if a required smoke check fails.

Stage 8 - Seed test data

Runs seed-test-data.ps1 -RunClassifierInline when -SeedTestData is specified.
Validates that all five deterministic request rows exist.
Re-runs smoke_test.ps1 -IncludeSeededDataChecks as the final gate unless -SkipSmoke was specified.
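
How -SeedTestData and -SkipSmoke shape the stage sequence can be summarized with a small Python sketch. The stage labels and the `plan_stages` function are illustrative assumptions; deploy.ps1 itself is a PowerShell script and may organize this differently.

```python
def plan_stages(seed_test_data: bool = False, skip_smoke: bool = False) -> list[str]:
    """Return the ordered stage labels for one deploy run."""
    stages = [
        "preflight",
        "schema",
        "solution-shell",
        "identity-and-records",
        "maker-surface",
        "reviewer-app",
        "policy-hydration",
    ]
    if not skip_smoke:
        stages.append("smoke")
    if seed_test_data:
        stages.append("seed-test-data")
        # The seeded smoke run is the final gate unless smoke tests are skipped.
        if not skip_smoke:
            stages.append("seeded-smoke")
    return stages
```

With both switches set, the plan ends at the seeding stage and no smoke run happens at all.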

3. Idempotency model

Each stage follows the same loop:

Read the current state.
Compute the delta between the target state and the current state.
Apply only the missing or changed state.
Re-read and validate before the next stage starts.

This pattern is implemented at two levels:

Child script level: the schema, shell, reviewer app, retention label, and MRM bridge scripts are already idempotent or best-effort idempotent.
Orchestrator level: deploy.ps1 re-checks the resulting objects after every child script and skips unchanged environment-variable values.
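
The four-step loop above can be sketched generically in Python. The `reconcile` function and its callback parameters are hypothetical; they stand in for whatever Dataverse or PAC CLI calls a given stage actually makes.

```python
from typing import Callable


def reconcile(
    target: dict,
    read_state: Callable[[], dict],
    apply_change: Callable[[str, object], None],
) -> dict:
    """Apply only missing or changed keys, then re-read and validate.

    Returns the delta that was applied; an empty dict means nothing changed.
    """
    current = read_state()                      # 1. read the current state
    delta = {k: v for k, v in target.items()    # 2. compute the delta
             if current.get(k) != v}
    for key, value in delta.items():            # 3. apply only the delta
        apply_change(key, value)
    after = read_state()                        # 4. re-read and validate
    drift = sorted(k for k, v in target.items() if after.get(k) != v)
    if drift:
        raise RuntimeError(f"validation failed for: {drift}")
    return delta
```

Calling `reconcile` twice with the same target applies changes on the first call and returns an empty delta on the second, which is the property that makes a re-run after partial failure safe.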

The same design is used for -Teardown. The script removes seeded data first, then deletes or clears each footprint area in reverse order, and records anything that still needs manual follow-up.

4. Teardown semantics

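deploy.ps1 -Teardown is intentionally defensive: it removes what the deployment created and leaves shared or platform-limited artifacts for manual follow-up.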

Automatically removed

Seeded lab data through seed-test-data.ps1 -Cleanup
All rows in the nine agent-intake Dataverse tables
Reviewer app module and reviewer security roles
Agent-intake environment variable definitions and current values
Standard connection references created by the solution shell
Best-effort deletion of the custom tables and global option sets
Best-effort deletion of the FSIAgentIntake and AgentIntakeReviewerApp solution containers

Left in place on purpose or by platform gap

Power Pages classic site and page binding: PAC CLI cannot remove every classic Power Pages artifact reliably, so the script records a manual fallback.
Purview retention labels: the script preserves FSI-AgentIntake-7yr and FSI-AgentIntake-7yr-WORM because those labels can be shared across records outside the solution.
Microsoft Entra Agent Identity blueprint: the script treats blueprint cleanup as a manual decision because the same blueprint can be shared across multiple agents.
Live Microsoft Entra Agent IDs minted during seeded tests: the default seeded lab path uses deterministic synthetic IDs. If the customer opts into live minting, the operator should remove those service principals manually after confirming that no downstream dependency exists.
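
A best-effort teardown that keeps going after a failure and records manual follow-up items might look like the Python sketch below. The `ManualFollowUp` exception, the `run_teardown` function, and the status strings are all hypothetical; the document does not describe how deploy.ps1 builds its teardown summary internally.

```python
from typing import Callable, Iterable, Tuple


class ManualFollowUp(Exception):
    """Raised by a step that cannot finish automatically (hypothetical)."""


def run_teardown(
    steps: Iterable[Tuple[str, Callable[[], None]]],
) -> list:
    """Run every step in order and return (name, status, detail) rows.

    A failing step does not stop later steps, which keeps the teardown
    best-effort and makes the summary complete.
    """
    summary = []
    for name, action in steps:
        try:
            action()
            summary.append((name, "removed", ""))
        except ManualFollowUp as exc:
            summary.append((name, "manual", str(exc)))
        except Exception as exc:  # record the failure and continue
            summary.append((name, "failed", str(exc)))
    return summary
```

The caller would pass the steps already ordered, seeded data first, and could print the returned rows as the summary table the operator reviews.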

5. Exit-code matrix

| Exit code | Meaning |
| --- | --- |
| `0` | Success |
| `10` | Preflight failure |
| `20` | Schema failure |
| `30` | Solution shell failure |
| `40` | Identity and records failure |
| `50` | Maker surface failure |
| `60` | Reviewer app failure |
| `70` | Policy hydration failure |
| `80` | Smoke failure |
| `90` | Seed failure |
| `100` | Teardown failure |
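
One way to read the matrix is as a lookup from the failing stage to the process exit code. The Python sketch below assumes the orchestrator stops at the first failing stage; the stage labels and the `run_pipeline` function are illustrative, and only the numeric codes come from the matrix.

```python
from typing import Callable, Iterable, Tuple

# Codes taken from the exit-code matrix; the keys are illustrative labels.
EXIT_CODES = {
    "preflight": 10,
    "schema": 20,
    "solution-shell": 30,
    "identity-and-records": 40,
    "maker-surface": 50,
    "reviewer-app": 60,
    "policy-hydration": 70,
    "smoke": 80,
    "seed-test-data": 90,
    "teardown": 100,
}


def run_pipeline(stages: Iterable[Tuple[str, Callable[[], None]]]) -> int:
    """Run stages in order and return the code of the first failing stage, or 0."""
    for name, action in stages:
        try:
            action()
        except Exception:
            return EXIT_CODES[name]
    return 0
```

A CI wrapper can then branch on the returned code, for example treating 80 as a configuration problem to fix before the next run.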

6. Auth modes

`AzCli` (default)

Uses delegated Azure CLI sign-in.
If no cached context exists, the orchestrator runs az login --use-device-code.
Uses pac auth create --deviceCode when PAC is not already connected.

`ManagedIdentity`

Uses Azure CLI managed-identity sign-in (az login --identity).
Uses pac auth create --managedIdentity when PAC is not already connected.
This is the preferred non-interactive deployment path for Azure-hosted automation.

`ServicePrincipal` (legacy)

Uses AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET.
Uses Azure CLI service-principal sign-in and PAC service-principal auth creation.
This path is retained for legacy labs only. Managed identity remains the recommended production approach.
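
The Azure CLI side of the three modes can be sketched as a function that builds the sign-in command for each -AuthMode value. The `az_login_args` helper is hypothetical, and it only covers the case where a sign-in is actually needed; how deploy.ps1 detects a cached context, and which PAC flags it uses for service principals, is not shown here.

```python
from typing import Mapping

SP_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def az_login_args(mode: str, env: Mapping[str, str]) -> list[str]:
    """Return the `az login` argument list for the given auth mode."""
    if mode == "AzCli":
        return ["az", "login", "--use-device-code"]
    if mode == "ManagedIdentity":
        return ["az", "login", "--identity"]
    if mode == "ServicePrincipal":
        missing = [name for name in SP_VARS if not env.get(name)]
        if missing:
            raise ValueError(f"missing environment variables: {missing}")
        return [
            "az", "login", "--service-principal",
            "--username", env["AZURE_CLIENT_ID"],
            "--password", env["AZURE_CLIENT_SECRET"],
            "--tenant", env["AZURE_TENANT_ID"],
        ]
    raise ValueError(f"unknown auth mode: {mode}")
```

Failing fast on missing service-principal variables keeps the error in preflight instead of surfacing later as an opaque sign-in failure.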

7. Customer override points

The orchestrator intentionally keeps the override surface small and explicit.

Environment variables

These values can be injected without editing the scripts:

AGENT_INTAKE_MAKER_PORTAL_URL
AGENT_INTAKE_REVIEWER_APP_URL
AGENT_INTAKE_MRM_TARGET_ENV
AGENT_INTAKE_DRIFT_DETECTOR_ENV
AGENT_INTAKE_RETENTION_LABEL_ID
AGENT_INTAKE_SPONSOR_BACKUP_GROUP
AGENT_INTAKE_GRAPH_CUSTOM_CONNECTOR_API_ID
AGENT_INTAKE_PURVIEW_ADMIN_UPN
AGENT_INTAKE_BLUEPRINT_SPONSOR_UPN
AGENT_INTAKE_AGENT_BLUEPRINT_ID
AGENT_INTAKE_LIVE_AGENT_ID
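
As an illustration of how these overrides could be gathered before a run, the Python sketch below collects whichever of the listed variables are set to a non-empty value. The `collect_overrides` helper is hypothetical; only the variable names come from the list above.

```python
import os
from typing import Mapping, Optional

OVERRIDE_NAMES = (
    "AGENT_INTAKE_MAKER_PORTAL_URL",
    "AGENT_INTAKE_REVIEWER_APP_URL",
    "AGENT_INTAKE_MRM_TARGET_ENV",
    "AGENT_INTAKE_DRIFT_DETECTOR_ENV",
    "AGENT_INTAKE_RETENTION_LABEL_ID",
    "AGENT_INTAKE_SPONSOR_BACKUP_GROUP",
    "AGENT_INTAKE_GRAPH_CUSTOM_CONNECTOR_API_ID",
    "AGENT_INTAKE_PURVIEW_ADMIN_UPN",
    "AGENT_INTAKE_BLUEPRINT_SPONSOR_UPN",
    "AGENT_INTAKE_AGENT_BLUEPRINT_ID",
    "AGENT_INTAKE_LIVE_AGENT_ID",
)


def collect_overrides(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the supported overrides that are set to a non-empty value."""
    source = os.environ if env is None else env
    return {name: source[name] for name in OVERRIDE_NAMES if source.get(name)}
```

Treating empty strings as unset means a blank pipeline variable falls through to the derived value or placeholder rather than overwriting it.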

Policy YAML

templates/policy-lookup-tables.yaml remains the source for retention labels, quorum policy, MRM routing defaults, reviewer routing defaults, and sponsor attestation text.

Fixture data

The deterministic lab fixtures live under scripts/seed-test-data/. Customers can fork those JSON files when they want different sample data. The five shipped defaults cover the Express happy path, a Standard approval with conditions, a Full reviewer board, cross-border default-deny, and sponsor self-approval default-deny.

8. Failure recovery

Stage 0 through Stage 6 failures: fix the underlying prerequisite or permissions issue and rerun deploy.ps1. Earlier successful stages are safe to repeat.
Stage 7 smoke failure: correct the failing configuration, then rerun deploy.ps1. The smoke test is read-only.
Stage 8 seed failure: rerun deploy.ps1 -SeedTestData after fixing the root cause. The seeder removes and recreates the deterministic fixture set.
Teardown partial failure: review the teardown summary table, complete any manual fallback items, and rerun deploy.ps1 -Teardown if needed.

The operational rule is simple: prefer an idempotent re-run over hand-editing partial state. Use -Teardown only to remove the whole agent-intake footprint and rebuild from a clean baseline.