Lab dry-run runbook
This runbook walks an engineer through a one-evening end-to-end validation of
message-center-monitor v2.5.0 against a non-prod Microsoft Entra tenant +
Power Platform environment. After the smoke test passes, the engineer has
verified — against live wires — that the v2.4.0 fix for the C1 admin-field
clobber regression actually holds.
Status: Internal lab tooling. NOT for production deployment. Production deployment uses
docs/setup-checklist.md.
Audience & prerequisites
You are an M365 administrator or DevOps engineer with:
- A non-prod Microsoft Entra tenant + Power Platform environment.
- Application Administrator + Power Platform Administrator roles in
that tenant. (The prereq script can confirm via -CheckRoles.)
- Local workstation with PowerShell 7+, Python 3.10+, GitHub CLI.
- Azure subscription (for the lab Key Vault) where you have Contributor.
- An empty resource group OR permission to create one.
Council finding traceability
This lab gates the following findings from the v2.4.0 verification council:
| Finding | Bug class | Fix file | Unit test | Smoke step (lab) |
|---|---|---|---|---|
| C1 | Update clobbers admin assessments | _Common.ps1 Invoke-McmDvUpsertMessage | Upsert.Tests.ps1 (8 cases) | Steps 5–6 (manual flip → all 7 admin cols) |
| H1 | WorkloadIdentity token exchange | _Common.ps1 Get-McmAccessToken | Common.Tests.ps1 auth-mode dispatch | n/a (lab uses ClientSecret per A3) |
| H2 | Retry on 5xx | _Common.ps1 Invoke-McmRest | Common.Tests.ps1 retry suite | implicit (live network during sync) |
| H3 | Retry-After parsing on PS7 | _Common.ps1 Invoke-McmRest | Common.Tests.ps1 Retry-After | implicit |
| Schema | Idempotent alt-key creation | create_mcm_dataverse_schema.py create_keys() | test_schema.py | Step 3 (deploy) + Steps 3–6 (uses alt-key URL) |
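Findings H2 and H3 concern retrying on 5xx responses and parsing the Retry-After header. The sketch below illustrates the general technique in Python; it is not the logic of Invoke-McmRest. The function names and the 5-second fallback are hypothetical. The only fact relied on is that HTTP allows Retry-After to be either a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def should_retry(status_code: int) -> bool:
    """Retry only on server-side (5xx) failures, as finding H2 describes."""
    return 500 <= status_code <= 599


def retry_after_seconds(value: Optional[str],
                        now: Optional[datetime] = None,
                        default: float = 5.0) -> float:
    """Convert a Retry-After header value into a wait time in seconds.

    `default` is a hypothetical fallback used when the header is
    missing or cannot be parsed.
    """
    if not value:
        return default
    text = value.strip()
    if text.isdigit():
        # Form 1: delta-seconds, e.g. "120".
        return float(text)
    try:
        # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    # A date in the past means "retry immediately", never a negative wait.
    return max(0.0, (when - current).total_seconds())
```

Handling both header forms in one helper keeps the retry loop itself free of parsing concerns.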
One-time setup
cd message-center-monitor/lab
cp lab-config.example.json lab-config.json
# Fill in lab-config.json with your tenant/subscription/environment details.
Then install prereqs (will prompt for Microsoft.Graph + Az + PowerApps modules if missing):
If -CheckRoles warns that you lack a required directory role, fix that
before continuing — every step after this assumes you have it.
Non-prod safety acknowledgement
Before any mutating script (01-06, 99) will run, lab-config.json must
contain the literal string in nonProd.acknowledgement:
The check is sensitive to both case and punctuation. A typo, a generic "yes", or an
empty value all fail the guard. To deliberately re-validate against a production
tenant (NOT recommended), pass -AllowProduction to each script; the override is
logged loudly rather than applied silently.
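The guard's behaviour can be sketched in Python as an exact string comparison. This is an illustration under assumptions, not the scripts' actual implementation: EXPECTED_ACK is a hypothetical placeholder (the real literal string is not reproduced here), and treating -AllowProduction as a logged bypass is an assumption based on the paragraph above.

```python
# Hypothetical placeholder; the real literal string is defined by the lab scripts.
EXPECTED_ACK = "<literal acknowledgement string>"


def check_non_prod_ack(config: dict,
                       expected: str = EXPECTED_ACK,
                       allow_production: bool = False) -> bool:
    """Return True if a mutating script may proceed."""
    if allow_production:
        # Assumed behaviour: the override is announced, never silent.
        print("WARNING: -AllowProduction set; non-prod guard bypassed.")
        return True
    actual = (config.get("nonProd") or {}).get("acknowledgement")
    # Exact comparison: no strip() and no lower(), so case and punctuation matter.
    return isinstance(actual, str) and actual == expected


print(check_non_prod_ack({"nonProd": {"acknowledgement": "yes"}}))  # False
```

Comparing without any normalisation is what makes a near-miss such as different casing fail the same way an empty value does.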
Execution order
Run each script in order. Each is idempotent — safe to re-run on the same state.
| # | Script | Time | What it does |
|---|---|---|---|
| 0 | 00_Install-Prereqs.ps1 | ~3 min | Installs PS modules + pip packages; verifies runtimes; (opt) checks roles |
| 1 | 01_New-AppRegistration.ps1 | ~30 s | Creates app reg + SP + permissions; admin-consents; rotates secret if stale |
| 2 | 02_New-KeyVault.ps1 | ~1 min | Creates Key Vault + RBAC role; uploads the secret from step 1 |
| 3 | 03_Deploy-Schema.ps1 | ~5 min | Runs the 3 Python setup scripts; polls alt-key for Active up to 15 min |
| 4 | 04_New-AppUser.ps1 | ~1 min | Creates Dataverse app user; assigns role; probes effective access |
| 5 | 05_Set-EnvVarValues.ps1 | ~30 s | Populates the 6 fsi_MCM_* Dataverse environment variable values |
| 6 | 06_Invoke-LabSmokeTest.ps1 | ~3 min | The dry-run. 10-step orchestrator including manual cloud-flow gate. |
| 99 | 99_Remove-LabDeployment.ps1 | ~3 min | State-driven teardown. Run when you're done with the lab. |
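Step 3 polls the alternate key until it becomes Active, for up to 15 minutes. A minimal Python sketch of that kind of bounded polling loop follows. The function name, the 15-second interval, and the injected get_status callable are hypothetical; the status values Active, Pending, and Failed are the ones this runbook mentions.

```python
import time
from typing import Callable


def wait_for_key_active(get_status: Callable[[], str],
                        max_seconds: float = 900,       # 15 min, as in step 3
                        interval_seconds: float = 15,   # hypothetical interval
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the key is Active or Failed, or until the deadline passes.

    Returns the last observed status; "Pending" therefore means a timeout.
    """
    deadline = clock() + max_seconds
    while True:
        status = get_status()
        if status in ("Active", "Failed"):
            return status  # terminal states: stop polling immediately
        if clock() >= deadline:
            return status  # still pending past the deadline
        sleep(interval_seconds)
```

Injecting sleep and clock keeps the loop testable without real waiting, and returning the status instead of raising lets the caller decide how to report a timeout.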
Step 6 explained: why all 7 admin columns?
Council finding C1 was that the inline upsert PATCH body included
fsi_assessmentstatus on every update, which clobbered any value an admin had
set during their assessment workflow. The v2.4.0 fix re-routes update PATCH
bodies through Invoke-McmDvUpsertMessage, which excludes ALL admin-owned
columns from the update payload (admin-owned = anything a human sets after the
machine writes the row).
The 7 admin-owned columns are:
- fsi_assessmentstatus
- fsi_assessment
- fsi_assessedby
- fsi_assesseddate
- fsi_actionstaken
- fsi_impactsagents
- fsi_notifiedon
Upsert.Tests.ps1 asserts via JSON-body set intersection that none of these
appear in update payloads. Step 6 of the lab smoke test does the same
end-to-end against a real Dataverse row: set all 7, run the sync, then
read back the row and assert all 7 are unchanged.
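The two checks described above can be sketched in Python: a set intersection on the update payload, and an unchanged-after-sync comparison. This illustrates the idea only, not the contents of Upsert.Tests.ps1 or the smoke test. The function names and the fsi_title column are hypothetical.

```python
import json

# The 7 admin-owned columns listed in this runbook.
ADMIN_OWNED_COLUMNS = frozenset({
    "fsi_assessmentstatus", "fsi_assessment", "fsi_assessedby",
    "fsi_assesseddate", "fsi_actionstaken", "fsi_impactsagents",
    "fsi_notifiedon",
})


def build_update_payload(row: dict) -> dict:
    """Drop every admin-owned column before an update is sent."""
    return {k: v for k, v in row.items() if k not in ADMIN_OWNED_COLUMNS}


def clobbered_columns(payload_json: str) -> set:
    """Set intersection between the payload's keys and the admin-owned columns."""
    return set(json.loads(payload_json)) & ADMIN_OWNED_COLUMNS


def unchanged_after_sync(before: dict, after: dict) -> bool:
    """True if all 7 admin-owned values survived the sync untouched."""
    return all(before.get(c) == after.get(c) for c in ADMIN_OWNED_COLUMNS)


# "fsi_title" is a hypothetical machine-owned column used for illustration.
synced = {"fsi_title": "Example post", "fsi_assessmentstatus": 1}
print(sorted(build_update_payload(synced)))      # ['fsi_title']
print(clobbered_columns(json.dumps(synced)))     # {'fsi_assessmentstatus'}
```

An empty result from clobbered_columns corresponds to a passing unit test; unchanged_after_sync mirrors the read-back assertion in the lab step.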
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| 00_Install-Prereqs.ps1 fails on Install-Module | TLS / proxy / no PSGallery trust | Set-PSRepository PSGallery -InstallationPolicy Trusted; [Net.ServicePointManager]::SecurityProtocol = 'Tls12'; retry |
| 01_New-AppRegistration.ps1 fails: "Insufficient privileges" | Missing Application Administrator | Assign role in Microsoft Entra admin center > Roles & administrators |
| 01 finishes but Step 6 of consent poll times out | Tenant-wide consent backlog | Wait 5 min and re-run 01 (it is idempotent and re-checks consent) |
| 03_Deploy-Schema.ps1 polls alt-key and reports status=Failed | Alt-key index activation failed | Power Apps maker portal > Tables > Message Center Log > Keys; delete + recreate, then re-run 03 |
| 03 polls alt-key and reports status=Pending past schemaKeyActivationMaxSeconds | Slow tenant; large existing data | Bump thresholds.schemaKeyActivationMaxSeconds in lab-config.json and re-run 03 |
| 04_New-AppUser.ps1 step "probe" fails after 6 attempts | Role association did not propagate | Power Apps maker portal > Users + permissions > Application users; verify role; re-run 04 |
| Step 3 of smoke test reports "No rows in fsi_messagecenterlogs" | Tenant has no recent MC posts | Bump smoke.daysBack to 90 in lab-config.json and re-run 06 |
| Step 6 of smoke test FAILS with "ADMIN FIELDS CLOBBERED" | C1 regression has resurfaced | STOP. Do NOT ship. Re-open PR #40 thread; investigate Invoke-McmDvUpsertMessage |
Re-running the lab
All scripts are idempotent. To re-run from scratch without tearing down:
To start over:
Out of scope
- Production deployment: use docs/setup-checklist.md.
- Real Teams notification validation: the lab runs headless (empty TeamId/Channel are accepted). Verify Teams manually after the lab if needed.
- Logic Apps Standard alternative: this lab uses the existing PowerShell-driven sync; a Logic App variant could be added in a future release.
Files in this folder
lab/
├── 00_Install-Prereqs.ps1 # Step 0 — modules + pip + role check
├── 01_New-AppRegistration.ps1 # Step 1 — app reg + consent + secret rotation
├── 02_New-KeyVault.ps1 # Step 2 — Key Vault + secret upload + RBAC
├── 03_Deploy-Schema.ps1 # Step 3 — Python schema + alt-key polling
├── 04_New-AppUser.ps1 # Step 4 — Dataverse app user + role + probe
├── 05_Set-EnvVarValues.ps1 # Step 5 — Dataverse env-var values
├── 06_Invoke-LabSmokeTest.ps1 # Step 6 — 10-step end-to-end smoke test
├── 99_Remove-LabDeployment.ps1 # Step 99 — state-driven idempotent teardown
├── lab-config.example.json # Template — copy to lab-config.json
├── lab-state.schema.json # Schema for the ownership manifest
├── .gitignore # Excludes lab-config.json, lab-state.json, logs/
└── lib/
└── Write-LabLog.ps1 # Shared logger with secret redaction