pvc-plumber - Cheat Sheet
One-page poster. Full intro: pvc-plumber-start-here.md.
Current state (2026-06-01)
v4.0.1 permissive · 24 PVCs / 18 namespaces managed · 24/24 DR_COMPLETE ·
4 restore drills passed · Kyverno removed · Longhorn 0 faulted/0 degraded/0 rebuilding.
The 3 labels that matter
namespace: pvc-plumber.io/managed-namespace: "true" # write-gate
PVC: pvc-plumber.io/enabled: "true" # opt-in
PVC: pvc-plumber.io/tier: "hourly" # cadence (+ manage-volsync: "true")
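The labels act as a chain of gates: namespace first, then PVC opt-in, then cadence. A minimal Python sketch of that chain, assuming each label set is a plain dict of strings. The function name `backup_tier` is hypothetical and this is not pvc-plumber's own code; the manage-volsync label is left out because its full key is not spelled out above.

```python
from typing import Mapping, Optional

# Label keys as listed above.
NS_GATE = "pvc-plumber.io/managed-namespace"
PVC_OPT_IN = "pvc-plumber.io/enabled"
PVC_TIER = "pvc-plumber.io/tier"


def backup_tier(ns_labels: Mapping[str, str],
                pvc_labels: Mapping[str, str]) -> Optional[str]:
    """Return the cadence tier if the PVC passes both gates, else None."""
    if ns_labels.get(NS_GATE) != "true":      # namespace write-gate
        return None
    if pvc_labels.get(PVC_OPT_IN) != "true":  # per-PVC opt-in
        return None
    return pvc_labels.get(PVC_TIER)           # cadence, e.g. "hourly"
```

Label values are compared as the string "true" because Kubernetes label values are always strings.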
The 4 systems involved
| System | Job |
|---|---|
| Argo | desired state (PVC + labels) from Git |
| pvc-plumber | owns RS/RD wiring + /audit |
| VolSync + Kopia | moves bytes → RustFS S3 |
| Longhorn | live volume (CSI), snapshots for clones |
5 questions when debugging
- Is the namespace gated? (managed-namespace=true)
- Is the PVC opted in? (enabled + manage-volsync + tier)
- Do RS and RD exist and are both managed-by=pvc-plumber?
- Does the PVC have dataSourceRef → <pvc>-dst (else it recreates EMPTY)?
- Is the last backup Successful? → check /audit (already-matches/stale=false).
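The dataSourceRef question is the easiest one to script. A minimal sketch, assuming the PVC manifest has already been loaded into a dict (for example from JSON output); the helper name `restores_from_backup` is hypothetical. It checks only the `<pvc>-dst` naming rule stated above, not the referenced object's kind.

```python
from typing import Any, Mapping


def restores_from_backup(pvc: Mapping[str, Any]) -> bool:
    """True if spec.dataSourceRef.name equals '<pvc name>-dst'."""
    name = pvc.get("metadata", {}).get("name", "")
    ref = pvc.get("spec", {}).get("dataSourceRef") or {}
    # A missing or mismatched ref means a recreate would come up empty.
    return bool(name) and ref.get("name") == f"{name}-dst"
```

A PVC that returns False here is either a candidate for adding the ref or one to mark EMPTY_BY_DESIGN.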
Restore drill quick path
sentinel (embed OLD uid + sha256) → manual RS backup → RD refresh (new latestImage)
→ scale app to 0 → delete PVC → recreate (with dsr!) → verify sentinel byte-identical
→ scale up → restore RS schedule + RD restore-once trigger
Confirm application.status.sync.revision == dsr commit before deleting (or a stale render
recreates it empty). pvc-plumber does not revert your manual trigger patches, so restore them yourself.
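The sentinel step can be sketched in a few lines of Python. This is one possible reading of "embed OLD uid + sha256": the sentinel file carries the pre-drill PVC UID, and its SHA-256 is recorded outside the volume for comparison after the restore. The file layout and both function names are hypothetical.

```python
import hashlib
import os
from typing import Tuple


def make_sentinel(old_pvc_uid: str) -> Tuple[bytes, str]:
    """Build sentinel bytes tagged with the pre-drill PVC UID, plus their SHA-256."""
    nonce = os.urandom(16).hex()  # makes each drill's sentinel unique
    body = f"old-uid={old_pvc_uid}\nnonce={nonce}\n".encode()
    return body, hashlib.sha256(body).hexdigest()


def verify_sentinel(restored: bytes, expected_sha256: str, old_pvc_uid: str) -> bool:
    """True only if the restored bytes hash identically and carry the OLD uid."""
    same_bytes = hashlib.sha256(restored).hexdigest() == expected_sha256
    return same_bytes and f"old-uid={old_pvc_uid}\n".encode() in restored
```

Write the bytes into the volume before the manual RS backup and keep the digest somewhere else. After the recreate, a passing check shows the data came from the backup and not from an empty volume, even though the new PVC has a different UID.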
Common failure modes
| Symptom | Cause / fix |
|---|---|
| PVC recreates empty | no dataSourceRef → add it (or mark EMPTY_BY_DESIGN) |
| ComparisonError ... PVC is invalid: Forbidden | added dsr to a Bound PVC (immutable); clears on delete |
| scale-up sync "Succeeded" but replicas stay 0 | Argo stale cluster cache → hard-refresh, wait OutOfSync, re-sync |
| double-recreate needed | deleted before render cache caught up → wait for reconciled rev |
| scheduled backups stopped after a drill | RS left on manual trigger → restore schedule |
| restored volume degraded briefly | Longhorn replica rebuild → wait, don't touch replicas |
Never-migrate list
- CNPG databases: Barman → S3 (native).
- PostHog PVCs: backup-exempt (disposable).
- redis-instance/redis-master-0: backup-exempt (disposable).
Next ops tasks
- Kopia maintenance: healthy; full not needed (docs/domains/storage/kopia-maintenance-plan.md).
- Rollback PV cleanup: 7 retained; reclaim reset-batch first, per-PV approval.
- Longhorn replica/storage policy review (docs/domains/storage/architecture-future.md).
- (future) pvc-plumber v5 strict-mode plan: not shipped.