Risk heatmap
A two-axis ranking of every app: how critical is it (rows), how mature is its current setup (columns). The cells in the upper-left quadrant — high criticality, low maturity — are where to invest first.
Maturity entries below remain `[INFO NEEDED]` until each per-app doc is filled in. Criticality is an inference based on what each app does for the business; verify before any final decision. This file is the agenda for Phase 2.
How to read this
- Criticality answers: "what happens if this is down or lost?"
    - Critical — direct revenue or brand impact, or cascading failure across the estate.
    - High — significant revenue / operations impact.
    - Medium — productivity / convenience impact, no customer impact.
    - Low — small inconvenience.
- Maturity answers: "is this prepared to keep running through reasonable failures?"
    - Hobby — no backups, single host, manual everything, secrets in `.env` files.
    - Trial — some backups, basic monitoring, manual recovery, secrets mostly in Vault.
    - Professional — redundant, backed up and restore-tested, monitored + alerted, secrets in Vault, owner identified, runbook exists.
Where you want to be: Professional for everything critical, at minimum.
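The maturity definitions above can be turned into a small check that the verification pass fills in per app. This is a minimal sketch, not an agreed rubric: the `Evidence` field names are hypothetical, and the rule that "Trial" needs only backups plus monitoring is an assumption, since "some backups" and "secrets mostly in Vault" are not strictly yes/no facts.

```python
from dataclasses import dataclass, fields


@dataclass
class Evidence:
    """Verified facts about one app. All field names are hypothetical."""
    backups: bool = False
    restore_tested: bool = False
    monitored: bool = False
    alerting: bool = False
    redundant: bool = False
    secrets_in_vault: bool = False
    owner_identified: bool = False
    runbook_exists: bool = False


def maturity(evidence: Evidence) -> str:
    """Map verified evidence to a maturity column."""
    # Professional: every criterion from the definition is verified.
    if all(getattr(evidence, f.name) for f in fields(evidence)):
        return "Professional"
    # Trial (assumed threshold): backups and basic monitoring exist.
    if evidence.backups and evidence.monitored:
        return "Trial"
    # No evidence, or not enough of it: stay in the leftmost column.
    return "Hobby"
```

An app with no verified evidence stays at "Hobby", which matches the default placement in the heatmap: `maturity(Evidence())` returns `"Hobby"`.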
The heatmap
| Criticality ↓ / Maturity → | Hobby (worst) | Trial | Professional (best) |
|---|---|---|---|
| Critical | [Authentik?] [Vault?] [GitLab?] [WireGuard?] | | |
| High | [Parallax?] [Workforce apps?] [CRMs?] [APEX prod?] [MinIO?] [Dokploy?] [Domains?] | | |
| Medium | [WordPress sites?] [Beszel?] [Gotify?] [Portainer?] [Teams Bot?] [n8n?] | | |
| Low | [Open WebUI?] [Coder?] [Draw.io?] [IT Tools?] [PE Tube?] [SQLcl?] [Watchtower?] | | |
The middle and right columns are intentionally empty: so far we have no evidence that any app has reached "Trial" or "Professional" maturity. The interview / verification pass should move apps rightward as evidence comes in.
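If the per-app classifications are kept as data, the table can be regenerated instead of edited by hand whenever an app moves rightward. The sketch below assumes a plain mapping of app name to a `(criticality, maturity)` pair. The function names are hypothetical, and the sample placements are the provisional ones from the table above, not verified findings.

```python
from collections import defaultdict

CRITICALITY = ["Critical", "High", "Medium", "Low"]  # rows, top to bottom
MATURITY = ["Hobby", "Trial", "Professional"]        # columns, left to right


def build_heatmap(apps):
    """Group app names by (criticality, maturity) cell."""
    cells = defaultdict(list)
    for name, (crit, mat) in sorted(apps.items()):
        if crit not in CRITICALITY or mat not in MATURITY:
            raise ValueError(f"unknown classification for {name}: {crit}/{mat}")
        cells[(crit, mat)].append(name)
    return cells


def render_markdown(cells):
    """Render the grouped cells as a Markdown table."""
    lines = [
        "| Criticality / Maturity | " + " | ".join(MATURITY) + " |",
        "|---|---|---|---|",
    ]
    for crit in CRITICALITY:
        row = [", ".join(cells.get((crit, mat), [])) for mat in MATURITY]
        lines.append(f"| {crit} | " + " | ".join(row) + " |")
    return "\n".join(lines)


# Provisional placements copied from the table above (a subset).
apps = {
    "Vault": ("Critical", "Hobby"),
    "Authentik": ("Critical", "Hobby"),
    "MinIO": ("High", "Hobby"),
    "Coder": ("Low", "Hobby"),
}
table = render_markdown(build_heatmap(apps))
```

Moving an app rightward is then a one-line change to its entry in `apps`, followed by re-rendering.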
Suggested investment order (Phase 2 priority queue)
- Vault (Critical) — verify backups, restore-test, document unseal procedure.
- Authentik (Critical) — verify backups + secret-key custody, document recovery, enforce admin MFA.
- GitLab (Critical) — verify backups including `gitlab-secrets.json`, restore-test, patch cadence.
- WireGuard (Critical) — document break-glass path that doesn't require WireGuard, audit peer list.
- Domain registrars + DNS (Critical) — credentials in Vault, MFA on accounts, registry-lock, calendar reminders.
- MinIO (High) — confirm erasure-coded redundancy, off-MinIO backups for whatever it stores.
- Workforce apps multi-tenancy (High) — confirm SLA exposure, isolate tenants, document customer-impact runbook.
- CRMs (High) — backups + GDPR/DPDP posture + access audit.
- Oracle APEX prod instances (High) — Data Guard for any prod DB, CPU patch schedule, license posture.
- Parallax (High) — interview-block answers, deploy/backup cadence, SPOF analysis.
- Public WordPress sites (Medium-High) — admin MFA + auto-update plugins + WAF + tested backup/restore.
- Dokploy (High) — env-var encryption-key custody, control-plane backup.
- Internal tools (Low/Medium) — bring up the floor (basic backup + monitoring) but don't over-invest.
The investment principle
Spend on resilience proportional to criticality × probability-of-failure. Hobby-grade infrastructure on a low-criticality service is fine. Hobby-grade on a critical service is a near-certain incident waiting to happen.
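The principle can be made concrete as a score used to sort the queue. This is only a sketch: the numeric weights and failure likelihoods below are placeholder assumptions chosen to show the ordering, not measured values, and they should be replaced with whatever the verification pass supports.

```python
# Assumed ordinal weights: a higher number means a larger impact when the app is lost.
CRITICALITY_WEIGHT = {"Critical": 4, "High": 3, "Medium": 2, "Low": 1}

# Assumed rough likelihood of a damaging failure per maturity level.
FAILURE_LIKELIHOOD = {"Hobby": 0.9, "Trial": 0.5, "Professional": 0.1}


def risk_score(criticality, maturity):
    """Score = criticality weight x assumed probability of failure."""
    return CRITICALITY_WEIGHT[criticality] * FAILURE_LIKELIHOOD[maturity]


def priority_queue(apps):
    """Return app names sorted by descending risk, ties broken by name."""
    return sorted(apps, key=lambda name: (-risk_score(*apps[name]), name))


# Provisional placements taken from the heatmap (a subset).
apps = {
    "Coder": ("Low", "Hobby"),
    "Vault": ("Critical", "Hobby"),
    "Beszel": ("Medium", "Hobby"),
    "MinIO": ("High", "Hobby"),
}
queue = priority_queue(apps)
```

With every app still in the Hobby column, the ordering collapses to criticality alone. The likelihood term only starts to matter once apps are verified into different maturity levels.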
The goal of Phase 2 is to push every critical app into the rightmost column ("Professional") and everything else at least into "Trial".