talos-argocd-proxmox
A production-grade GitOps Kubernetes cluster running on Talos OS with
self-managing ArgoCD. ArgoCD manages its own configuration and discovers
applications by directory structure, so no manual Application manifests are
needed.
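Directory-based discovery of this kind is usually done with an ApplicationSet Git directory generator. The following is a minimal sketch of that technique, not a copy of this repo's manifests: the repository URL, the `apps/*/*` path layout, the `default` project, and the ApplicationSet name are all placeholders chosen for illustration.

```yaml
# Hypothetical ApplicationSet: one Application per matching directory.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: apps-by-directory            # placeholder name
  namespace: argocd
spec:
  goTemplate: true
  generators:
    - git:
        repoURL: https://example.com/your-org/your-repo.git   # placeholder URL
        revision: HEAD
        directories:
          - path: apps/*/*           # assumed layout: apps/<category>/<app>
  template:
    metadata:
      # The directory name becomes the Application name.
      name: '{{.path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://example.com/your-org/your-repo.git   # placeholder URL
        targetRevision: HEAD
        path: '{{.path.path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{.path.basename}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
```

With a generator like this, adding a new directory that matches the path pattern is enough for Argo CD to create the corresponding Application on its next reconciliation.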
Source repository: mitchross/talos-argocd-proxmox
This site is the rendered version of docs/ from that repo. Pages link back to source files (edit icon, top right) for one-click PRs.
[!IMPORTANT] Current pvc-plumber state (2026-06-01):
- v4.0.1 live (permissive controller, not an admission gate)
- 24 PVCs / 18 namespaces managed
- 24/24 DR_COMPLETE
- Kyverno not in the backup path
- CNPG native / Barman → S3
- PostHog backup-exempt · redis-instance backup-exempt
- migration campaign closed, no remaining candidates
Stack
- OS: Talos Linux on Proxmox VMs, provisioned via Omni / Sidero
- CNI: Cilium with Gateway API + LoadBalancer
- GitOps: ArgoCD (self-managing) + ApplicationSets for auto-discovery
- Storage: Longhorn (RWO block) + TrueNAS/RustFS (Kopia repository on S3)
- Backup: VolSync + Kopia, wired by pvc-plumber v4 (a permissive PVC-watching controller)
- Database: CloudNativePG (Postgres) with Barman backups to RustFS S3
- Secrets: 1Password Connect + External Secrets Operator
- Observability: kube-prometheus-stack, Loki, Tempo, OpenTelemetry, Grafana
- AI: llama-cpp (Qwen3.6-35B-A3B multimodal) on dedicated GPU
Documentation
Start here (pvc-plumber)
- pvc-plumber-start-here: visual intro covering what it is, the architecture, what it does NOT do, and v4-vs-v5.
- pvc-plumber-cheatsheet: one-page poster.
- pvc-plumber-dynamic-workflow: how the operator thinks (decision trees, /audit actions).
- talos-argocd-pvc-plumber-integration: how this repo uses it (add-a-PVC checklist, labels).
Operate the platform
- volsync-storage-recovery: PVC backup/restore single source of truth + restore-drill runbook.
- kopia-maintenance-plan: repository maintenance (healthy; manual full not needed).
- storage-architecture-future: Longhorn-vs-restore-DR tiering (future idea).
- storage-model-rwo-rwx-and-sizing: RWO/RWX decision rule, truenas-csi vs static drivers, per-cluster class map, BigTank sizing.
- pvc-plumber-v4-cutover: day-of cutover runbook (label model, ownership, rollback).
- pvc-plumber-v4-migration-readiness: per-PVC migration status (campaign closed).
- cluster-dr-nuke-restore-runbook: full cluster rebuild/restore runbook.
Bootstrap rules from the full nuke
- CRDs first, controllers/apps second, CRs third.
- Observability is optional. Core apps must bootstrap without Prometheus.
- Do not install Prometheus Operator CRDs early to satisfy bootstrap apps. kube-prometheus-stack remains the sole owner of monitoring.coreos.com CRDs.
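The "CRDs first, controllers/apps second, CRs third" ordering can be expressed inside a single Argo CD Application with sync waves. The sketch below is one possible way to do that and is not taken from this repo: the `Widget` CRD, the `example.com` group, the controller image, and the `widgets` namespace are all hypothetical, and the namespace is assumed to exist already.

```yaml
# Wave -1: the CRD is applied before anything that depends on it.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com          # hypothetical CRD
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true
---
# Wave 0: the controller that reconciles the custom resource.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: widget-controller            # hypothetical controller
  namespace: widgets                 # assumed to exist already
  annotations:
    argocd.argoproj.io/sync-wave: "0"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: widget-controller
  template:
    metadata:
      labels:
        app: widget-controller
    spec:
      containers:
        - name: controller
          image: registry.example.com/widget-controller:1.0.0   # placeholder image
---
# Wave 1: the custom resource, applied only after the earlier waves are healthy.
apiVersion: example.com/v1
kind: Widget
metadata:
  name: demo
  namespace: widgets
  annotations:
    argocd.argoproj.io/sync-wave: "1"
    # Lets the dry run pass even if the CRD is not yet registered.
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
spec: {}
```

Lower wave numbers sync first, and Argo CD waits for each wave to become healthy before starting the next, which is what gives the CRD, controller, CR ordering.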
Design / PRD
- pvc-plumber-v4-prd: locked design + §0 canonical status (shipped vs design).
- pvc-plumber-v4-roadmap: post-PRD backlog.
- pvc-plumber-v5-kopia-native-future: v5 fork (VolSync-strict vs Kopia-native); parked, not built.
- multicluster-prd: multicluster design.
Other domains
- Databases: cnpg-disaster-recovery · cnpg-explained
- GitOps / ArgoCD: argocd · argocd-entrypoints
- Networking: network-topology · network-policy
- Storage: rustfs-credential-runbook · kopia-maintenance-plan · storage-architecture-future
- Multicluster: prd · handoff notes
- Observability: radar-ng-observability
- AI / GPU: ai-model-catalog · 3090-llm-optimization
Archive (historical only)
Historical migration, incident, design, and presentation docs live under
archive/. They are preserved for context and are not current runbooks.
Older research and plans remain under research/ and plans/ (also historical).
How to read these docs
- Start with the pvc-plumber visual docs for the current operator model.
- Use the storage recovery page for application PVC operations.
- Use the CNPG DR page only for CNPG recovery.
- Use the nuke runbook only for full rebuild planning.
- Treat archive/, research/, and plans/ as historical context.
Adopting any of this
This is one operator's homelab, not a product. The patterns are portable, but the specific image tags, hostnames, and 1Password item names are not. Start with VolSync storage recovery and Talos ArgoCD pvc-plumber integration.