Incident: Vault container down — CAP_SETFCAP capability error¶
id: INC-2026-05-01-vault
status: resolved
severity: high
opened: 2026-05-01 (approx — "last week, 5 days ago" per Vishnu on 2026-05-06)
resolved: 2026-05-06
detected_by: Vishnu Kant
resolved_by: Vishnu Kant
host: O1
app: vault.448.global ([apps/15-vault.md](../../apps/15-vault.md))
related_kis: [KI-001, KI-004, KI-017, KI-033]
related_runbooks: [RB-003]
fix_applied: Option B variant — SKIP_SETCAP=true env var set (rather than only adding SETFCAP capability)
duration: ~5 days
Summary¶
The Vault container on O1 was failing to start for ~5 days. Symptom was the container exiting immediately at startup with the error message:
```
unable to set CAP_SETFCAP effective capability: Operation not permitted
unable to set CAP_SETFCAP effective capability: Operation not permitted
```
(printed twice, once for each attempt the entrypoint makes.) Container did not become healthy; Vault never reached the listening state; nothing was sealed because the process never started.
Resolved 2026-05-06 by reconstructing the lost compose file, pinning the Vault image to 1.18.5, adding IPC_LOCK and SETFCAP to cap_add, and setting SKIP_SETCAP=true. The compose file now lives in source control at infra/vault/docker-compose.yml.
Impact¶
- CI/CD pipelines that read from Vault are broken. The n8n workflows that depend on Vault-stored secrets (per apps/25-n8n.md) cannot resolve credentials.
- No new credentials can be stored. Anything that was about to be put in Vault is blocked.
- No customer-facing app is currently impacted. Apps that read secrets at startup and were already running continue with the secrets they have in memory.
- No data loss. The Vault Raft data is intact on disk; only the running process is affected.
Root cause analysis¶
Two compounding factors:
Factor 1 — Image churn from Watchtower. Watchtower on O1 (apps/22-watchtower.md) auto-pulled a newer hashicorp/vault image. The new image's entrypoint script attempts setcap cap_ipc_lock=+ep against the vault binary so it can later use mlockall() to prevent secrets being paged to swap. This pattern — Watchtower silently pulling a breaking image — is the inverse of the risk captured in KI-004 (auto-updates that DID happen, breaking a Tier-0 service). The compose-recorded image hash (874fcc93e952…) and the running image hash (91269baab732…) differ, confirming the post-pull change.
Factor 2 — Non-root user can't use CAP_SETFCAP even when granted. The Vault image's Dockerfile sets USER vault (non-root, uid 100). Adding CAP_SETFCAP to the container's cap_add puts it in the bounding set, but Docker doesn't grant ambient capabilities to non-root users by default. The vault user therefore cannot actually call setcap on the binary, and the entrypoint logs unable to set CAP_SETFCAP effective capability: Operation not permitted and exits.
This was the trap during recovery: simply adding SETFCAP to cap_add was not enough. The fix required either:
- Setting SKIP_SETCAP=true env var so the entrypoint skips the setcap step entirely (chosen — Option B in the original options below), or
- Running the container as root (user: "0:0") and letting the entrypoint drop privileges via su-exec.
Note for future runbooks: "Add the missing capability" is necessary but not sufficient when the container runs as a non-root user. Always pair `cap_add` with either `SKIP_SETCAP=true` (Vault-specific) or `user: "0:0"` (general).
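The pairing described above is what the recorded fix applies. A minimal compose sketch, based on the settings listed in the summary (pinned image, both capabilities, skip-setcap env var); ports and volumes are omitted here for brevity:

```yaml
# Sketch of the applied fix (see infra/vault/docker-compose.yml for the
# authoritative version; this fragment shows only the capability pairing).
services:
  vault:
    image: hashicorp/vault:1.18.5   # pinned — no more Watchtower-driven churn
    cap_add:
      - IPC_LOCK                    # for mlockall(), if usable
      - SETFCAP                     # in the bounding set, but see Factor 2
    environment:
      SKIP_SETCAP: "true"           # entrypoint skips its setcap step (Option B)
    command: server
```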
Diagnostic checks¶
Before applying a fix, capture state for the post-incident review:
```shell
# On O1:
sudo docker ps -a --filter name=vault
sudo docker logs --tail=50 <vault-container-id>
sudo docker inspect <vault-container-id> --format '{{.HostConfig.CapAdd}} / {{.HostConfig.CapDrop}}'
sudo docker image inspect hashicorp/vault:<tag> --format '{{.Created}} / {{.Id}}'

# Confirm host runtime versions:
docker version
runc --version 2>/dev/null || true
uname -r

# Confirm Watchtower's recent activity:
sudo docker logs <watchtower-container-id> --tail=200 | grep -i vault
```
Fix options¶
Option A — Add the missing capability (cleanest)¶
Add SETFCAP (and IPC_LOCK if not already present) to the Vault container's capability set. Note: per the root-cause analysis above, this alone was not sufficient in this incident because the image runs as a non-root user.
If using docker run:
```shell
docker run -d --name vault \
  --cap-add=SETFCAP \
  --cap-add=IPC_LOCK \
  -p 8200:8200 \
  -v /opt/vault/config:/vault/config \
  -v /opt/vault/data:/vault/data \
  hashicorp/vault:<pinned-version> \
  server
```
If using docker-compose.yml:
```yaml
services:
  vault:
    image: hashicorp/vault:<pinned-version>
    cap_add:
      - IPC_LOCK
      - SETFCAP
    ports:
      - "8200:8200"
    volumes:
      - /opt/vault/config:/vault/config
      - /opt/vault/data:/vault/data
    command: server
```
After applying, restart the container and verify:
```shell
sudo docker logs <vault-container-id> --tail=20
# Expect: "Vault server starting" then "core: post-unseal setup complete"

curl -sk https://vault.448.global/v1/sys/seal-status | jq
# Expect: {"sealed": true, ...} -- Vault is up but sealed; unseal per RB-003 Path A.
```
Option B — Skip setcap entirely (quick workaround)¶
If Option A doesn't work for any reason, set SKIP_SETCAP=true in the container env. Vault will run without IPC_LOCK, meaning memory pages containing secrets can be swapped to disk.
Acceptable for a small instance with minimal memory pressure (Free Tier Ampere A1 is unlikely to swap), but it's a security trade-off.
Option C — Pin to a known-good older image¶
If you can identify the previous Vault image tag that was running before the failure, pin to it:
```shell
sudo docker logs <watchtower-container-id> --tail=500 | grep -A2 "hashicorp/vault"
# Find the tag that was in use before the change.
```
Then update the Vault container to that pinned tag and exclude it from Watchtower auto-updates (label-opt-out).
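The label opt-out can be expressed directly in compose. A sketch using Watchtower's documented exclusion label (the same label later applied to Vault, per the lessons-learned below):

```yaml
# Exclude this container from Watchtower auto-updates.
services:
  vault:
    image: hashicorp/vault:<known-good-tag>
    labels:
      - "com.centurylinklabs.watchtower.enable=false"
```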
Post-fix verification¶
Once Vault is up:
- Unseal it per RB-003 Path A — get 3 of 5 unseal-key holders to apply their shares.
- Verify a secret read works: `vault kv get secret/<known-path>`.
- Restart the n8n CI/CD workflows that were broken; confirm they resolve secrets.
- Snapshot Vault immediately after recovery (if RM-013 hasn't landed yet, do an ad-hoc snapshot now): `vault operator raft snapshot save /tmp/vault-post-incident.snap` and ship to the OCI bucket.
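The snapshot-and-ship step can be sketched as follows. This assumes an authenticated `vault` CLI, a configured OCI CLI, and the bucket/prefix convention used elsewhere in this estate; the object name is illustrative:

```shell
# Ad-hoc Raft snapshot, taken immediately after recovery.
vault operator raft snapshot save /tmp/vault-post-incident.snap

# Record the digest for the incident log before shipping off-host.
sha256sum /tmp/vault-post-incident.snap

# Push to the same private bucket that holds the data tarball.
oci os object put \
  --bucket-name PECommon \
  --name "infra/vault.448.global/raft-snapshots/vault-post-incident.snap" \
  --file /tmp/vault-post-incident.snap
```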
Lessons learned (post-incident)¶
- Watchtower auto-update on Tier-0 systems is high-risk; Vault is now opted out (`com.centurylinklabs.watchtower.enable=false` label). Apply the same pattern to Authentik and any other Tier-0 service on O1.
- `cap_add` alone doesn't help non-root containers — the container's `USER` directive determines whether ambient caps can be used. For HashiCorp images that follow the "non-root by default" pattern, pair `cap_add` with `SKIP_SETCAP=true` or run as root. Document this in any future runbook that touches container caps.
- Vault was down for 5 days without breaking customer apps — most apps read Vault at startup and cache the result, so Vault outages are silent until something restarts. This reinforces the case for RM-039 (alert delivery rebuild) and RM-038 (external uptime monitor) — without active alerting this could have stretched indefinitely.
- The lost compose file at `/data/compose/86/` revealed KI-015 in real terms. Reconstructing from `docker inspect` worked but was an avoidable hour of work. The compose file now lives at `infra/vault/docker-compose.yml` — Vault is the first service to close the config-on-host-only gap. Authentik, Caddy on E1, Caddy on O1, n8n, and SQLcl are the natural follow-ups.
- The trade-off documented: `SKIP_SETCAP=true` means Vault cannot `mlockall()`, so secret material can be paged to swap under memory pressure. Acceptable on a Free A1 with low memory pressure; revisit if O1 is upgraded to a paid shape.
Action items spawned by this incident¶
- Vault recovered with `SKIP_SETCAP=true` (Option B variant).
- Vault opted out of Watchtower auto-updates (label applied).
- Vault image pinned to `hashicorp/vault:1.18.5`.
- Compose file committed at `infra/vault/docker-compose.yml`.
- Off-host data backup taken (2026-05-06) — full tarball of `/home/ubuntu/docker/vault/` uploaded to OCI Object Storage at `PECommon/infra/vault.448.global/vault-data-backup-2026-05-06.tar.gz`. Bucket confirmed private (anonymous access returns 404). This is the first off-host backup of any Tier-0 system in the estate.
- Raft snapshot captured (148 KB) via API; sha256 `a453fcf47d2c74ca5e5f1ec6cae3d6bdff672d7832dc57ff4cdc684b224fcbb1` — currently on O1 only; recommended to also push to the OCI bucket alongside the tarball under `infra/vault.448.global/raft-snapshots/` for parity.
- ⏳ Apply the same Watchtower-exclusion pattern to other Tier-0 services on O1 (esp. Authentik, MinIO).
- ⏳ Promote this lesson to a Phase-2 action: review Watchtower's scope on O1 and define which services are auto-updateable vs explicitly pinned.
- Leaked Vault token revoked (2026-05-06). Token `hvs.vOdm…` had been pasted into chat / shell history during snapshot capture. Revoked + bash history scrubbed. Lesson now permanently captured in RB-003 Path D Step D5: never pass Vault tokens on the command line; use `vault login` to set them via env, then `vault token revoke -self` when done.
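The token-handling lesson above can be sketched as a pattern (assumes `VAULT_ADDR` is exported; the token itself never appears in argv, `ps` output, or shell history):

```shell
# Interactive login: vault prompts "Token (will be hidden):", so the
# token is read from stdin rather than passed on the command line.
vault login

# ... perform the snapshot / admin work ...

# Revoke the session token when finished, rather than letting it expire.
vault token revoke -self
```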
Related¶
- KI-001 — config not in Git (similar failure pattern)
- KI-004 — auto-update strategy gap (this incident is the inverse on O1, where updates DID happen and broke things)
- KI-017 — no Vault snapshot off-host (would have made recovery worse if data had been lost)
- KI-033 — this incident as a tracked KI
- RB-003 — Vault recovery runbook (Path D added for this scenario)
- apps/15-vault.md — Vault app doc