talos-argocd-proxmox

A production-grade GitOps Kubernetes cluster running on Talos OS with self-managing ArgoCD. ArgoCD manages its own configuration and discovers applications by directory structure, so no manual Application manifests are needed.
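
To make directory-based discovery concrete, here is a minimal Python sketch of the idea: every directory that matches a path pattern becomes one application, named after its last path segment. It is an illustration only, not ArgoCD's implementation. The `discover_apps` helper, the `my-apps/*/*` pattern, and the directory names are all hypothetical and are not taken from this repo.

```python
from fnmatch import fnmatchcase


def discover_apps(directories, pattern):
    """Return {app_name: path} for every directory that matches the pattern.

    Matching is done per path segment, so `*` never crosses a `/`.
    """
    want = pattern.split("/")
    apps = {}
    for path in directories:
        parts = path.strip("/").split("/")
        if len(parts) != len(want):
            continue  # wrong depth: not an application directory
        if all(fnmatchcase(p, w) for p, w in zip(parts, want)):
            apps[parts[-1]] = path  # app name = last path segment
    return apps


# Hypothetical repository layout, for illustration only.
dirs = [
    "my-apps/media/jellyfin",
    "my-apps/ai/llama-cpp",
    "my-apps/media",          # too shallow: a category, not an app
    "docs/archive/old-plan",  # outside the watched tree
]
apps = discover_apps(dirs, "my-apps/*/*")
```

Under this model, adding an application means adding a directory at the right depth; nothing else has to be registered by hand.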

Source repository: mitchross/talos-argocd-proxmox

This site is the rendered version of docs/ from that repo. Pages link back to source files (✏️ edit icon, top right) for one-click PRs.

[!IMPORTANT] Current pvc-plumber state (2026-06-01):

  • v4.0.1 live (permissive controller, not an admission gate)
  • 24 PVCs / 18 namespaces managed
  • 24/24 DR_COMPLETE
  • Kyverno not in the backup path
  • CNPG native / Barman → S3
  • PostHog backup-exempt · redis-instance backup-exempt
  • migration campaign closed, no remaining candidates

Stack

  • OS: Talos Linux on Proxmox VMs, provisioned via Omni / Sidero
  • CNI: Cilium with Gateway API + LoadBalancer
  • GitOps: ArgoCD (self-managing) + ApplicationSets for auto-discovery
  • Storage: Longhorn (RWO block) + TrueNAS/RustFS (Kopia repository on S3)
  • Backup: VolSync + Kopia, wired by pvc-plumber v4 (a permissive PVC-watching controller)
  • Database: CloudNativePG (Postgres) with Barman backups to RustFS S3
  • Secrets: 1Password Connect + External Secrets Operator
  • Observability: kube-prometheus-stack, Loki, Tempo, OpenTelemetry, Grafana
  • AI: llama-cpp (Qwen3.6-35B-A3B multimodal) on dedicated GPU

Documentation

🚀 Start here (pvc-plumber)

  1. pvc-plumber-start-here: visual intro covering what it is, the architecture, what it does NOT do, and v4 vs. v5.
  2. pvc-plumber-cheatsheet: one-page poster.
  3. pvc-plumber-dynamic-workflow: how the operator thinks (decision trees, /audit actions).
  4. talos-argocd-pvc-plumber-integration: how this repo uses it (add-a-PVC checklist, labels).

๐Ÿ› ๏ธ Operate the platform

Bootstrap rules from the full nuke

  • CRDs first, controllers/apps second, CRs third.
  • Observability is optional. Core apps must bootstrap without Prometheus.
  • Do not install Prometheus Operator CRDs early to satisfy bootstrap apps.
  • kube-prometheus-stack remains the sole owner of monitoring.coreos.com CRDs.
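
The first rule (CRDs, then controllers/apps, then CRs) can be sketched as a three-phase sort. This is a minimal illustration, not how this repo implements ordering. The `phase` and `bootstrap_order` helpers and the `BUILTIN_GROUPS` set are hypothetical, and the set is deliberately partial. In Argo CD, ordering like this is commonly expressed with the `argocd.argoproj.io/sync-wave` annotation; whether this repo does so is not stated on this page.

```python
# API groups served by a stock Kubernetes API server (partial, illustrative list).
BUILTIN_GROUPS = {"", "apps", "batch", "rbac.authorization.k8s.io", "networking.k8s.io"}


def phase(manifest):
    """0 = CRD, 1 = controller/app made of built-in kinds, 2 = custom resource."""
    if manifest["kind"] == "CustomResourceDefinition":
        return 0
    api_version = manifest["apiVersion"]
    # Core-group resources use a bare version such as "v1" (empty group).
    group = api_version.split("/")[0] if "/" in api_version else ""
    return 1 if group in BUILTIN_GROUPS else 2


def bootstrap_order(manifests):
    # sorted() is stable, so manifests keep their relative order within a phase.
    return sorted(manifests, key=phase)


manifests = [
    {"apiVersion": "volsync.backube/v1alpha1", "kind": "ReplicationSource"},
    {"apiVersion": "v1", "kind": "Namespace"},
    {"apiVersion": "apps/v1", "kind": "Deployment"},
    {"apiVersion": "apiextensions.k8s.io/v1", "kind": "CustomResourceDefinition"},
]
ordered = [m["kind"] for m in bootstrap_order(manifests)]
```

Here the `ReplicationSource` lands last because it can only be accepted once its CRD and controller exist. The same reasoning is why a core app that ships a monitoring.coreos.com resource would block bootstrap if it required that resource before kube-prometheus-stack had installed the CRDs.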

๐Ÿ“ Design / PRD

๐Ÿ—ƒ๏ธ Other domains

๐Ÿ—„๏ธ Archive (historical only)

Historical migration, incident, design, and presentation docs live under archive/. They are preserved for context and are not current runbooks. Older research and plans remain under research/ and plans/ (also historical).

How to read these docs

  • Start with the pvc-plumber visual docs for the current operator model.
  • Use the storage recovery page for application PVC operations.
  • Use the CNPG DR page only for CNPG recovery.
  • Use the nuke runbook only for full rebuild planning.
  • Treat archive/, research/, and plans/ as historical context.

Adopting any of this

This is one operator's homelab, not a product. The patterns are portable, but the specific image tags, hostnames, and 1Password item names are not. Start with VolSync storage recovery and Talos ArgoCD pvc-plumber integration.