# VPA Resource Optimization Guide

How to use VPA to right-size Kubernetes resource requests based on actual workload behavior.
## TL;DR — Just Tell Me What To Do

Everything is automatic. A Kyverno ClusterPolicy auto-creates VPA resources for every workload in the cluster. Infrastructure/monitoring namespaces get `updateMode: "Off"` (recommend only). User app namespaces get `updateMode: "Initial"` (sets optimal resources at pod creation).
## Step 1: Check recommendations

```shell
# Human-readable VPA report
./scripts/vpa-report.sh

# Filter to one namespace
./scripts/vpa-report.sh argocd

# Or raw kubectl one-liner
kubectl get vpa -A -o custom-columns=\
NS:.metadata.namespace,\
NAME:.metadata.name,\
CPU:.status.recommendation.containerRecommendations[0].target.cpu,\
MEM:.status.recommendation.containerRecommendations[0].target.memory
```

## Step 2: Open Grafana VPA dashboard

Go to https://grafana.vanillax.me and search for "VPA". The dashboard shows time-series graphs of VPA recommendations with historical trends.
## Step 3: Look for problems

Look for:
- Current request way below “Target” = pod is starved, increase it
- Current request way above “Target” = wasting resources, decrease it
- Current request below “Lower Bound” = pod is actively throttled, fix ASAP
## Step 4: Apply changes (infrastructure only)

Infrastructure namespaces use `updateMode: "Off"` — edit the app's values.yaml in Git, update the `resources:` block, push, and ArgoCD applies it. Add a comment explaining why:

```yaml
# VPA-optimized (2026-02-28) — target was 2000m, previous 500m
resources:
  requests:
    cpu: 2000m
    memory: 1Gi
```

User app namespaces use `updateMode: "Initial"` — VPA sets optimal resources when pods are created. No manual intervention needed.
## Step 5: Wait and re-check

VPA recommendations update continuously. Check back in a week to see if the new values are good. Don't change things daily.
## Architecture

```
kubelet /metrics/resource
  │
  ▼
metrics-server (provides metrics.k8s.io API)
  │
  ▼
VPA Recommender (reads metrics, writes recommendations to VPA .status)
  ▲
  │
Kyverno ClusterPolicy (vpa-auto-generate)
  │  • watches Deployments, StatefulSets, DaemonSets
  │  • auto-creates VPA per workload
  │  • infra/monitoring namespaces → updateMode: "Off"
  │  • user app namespaces → updateMode: "Initial"
  │  • GPU workloads → updateMode: "Off"
  ▼
VPA resources (one per workload)
  │
  ├─ Infra namespaces: recommend-only (manual review)
  │    Human reviews → updates values.yaml → Git push → ArgoCD applies
  └─ App namespaces: set at creation (Initial)
       VPA Admission Controller sets resources at pod creation → no human intervention needed
```

Kyverno is the sole VPA creator. The vpa-auto-generate ClusterPolicy watches all workloads and generates VPA resources automatically. No manual VPA manifests are needed.
## Components

| Component | Chart | Namespace | Location |
|---|---|---|---|
| metrics-server | metrics-server/metrics-server | kube-system | infrastructure/controllers/metrics-server/ |
| VPA | kubernetes.github.io/autoscaler | vertical-pod-autoscaler | infrastructure/controllers/vertical-pod-autoscaler/ |
| Kyverno VPA policy | — | kyverno | infrastructure/controllers/kyverno/policies/vpa-auto-generate.yaml |
metrics-server and VPA are deployed via the Infrastructure ApplicationSet (Wave 4). The Kyverno policy is deployed as part of Kyverno (Wave 3).
## VPA Sub-Components

| Component | Purpose |
|---|---|
| Recommender | Analyzes metrics, generates recommendations |
| Updater | Applies changes when mode is not Off (evicts or in-place resizes) |
| Admission Controller | Sets resources on new pods when mode is not Off |
## Update Modes by Namespace

| Namespace Type | Update Mode | Behavior |
|---|---|---|
| Infrastructure (argocd, cilium, etc.) | Off | Recommend only — manual GitOps workflow |
| Monitoring (prometheus-stack, loki-stack, etc.) | Off | Recommend only — manual GitOps workflow |
| GPU workloads (runtimeClassName: nvidia) | Off | Recommend only — VPA can’t manage GPU resources |
| User apps (everything else) | Initial | Set optimal resources at pod creation |
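To make the table concrete, here is roughly what a Kyverno-generated VPA for a user app looks like. This is a sketch: the field names follow the VPA v1 API, but the workload name (`immich-server`) is illustrative.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: immich-server               # hypothetical workload
  namespace: immich
  labels:
    app.kubernetes.io/managed-by: kyverno
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: immich-server
  updatePolicy:
    updateMode: "Initial"           # "Off" for the infra/monitoring/GPU rows above
```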
## CLI Tools & Scripts

### vpa-report.sh

The `scripts/vpa-report.sh` script provides a formatted table of all VPA recommendations with human-readable values.

```shell
# All namespaces
./scripts/vpa-report.sh

# Single namespace
./scripts/vpa-report.sh argocd
```

Example output:
```
==========================================
 VPA Resource Recommendations Report
==========================================

NAMESPACE  WORKLOAD                       CONTAINER    CPU TGT  CPU RANGE     MEM TGT  MEM RANGE
-------------------------------------------------------------------------------------------------
argocd     Deployment/argocd-server       server       23m      12m-100m      175Mi    88Mi-700Mi
argocd     Deployment/argocd-repo-server  repo-server  2975m    1488m-11900m  523Mi    262Mi-2.0Gi
...

Total: 42 containers with VPA recommendations

Action needed if your current request is:
  < lowerBound → INCREASE NOW (pod is being throttled)
  < target     → INCREASE (under-provisioned)
  ≈ target     → KEEP (well-tuned)
  > 2x target  → DECREASE (over-provisioned)
```

### kubectl One-Liners
```shell
# Quick overview: all VPA targets across the cluster
kubectl get vpa -A -o custom-columns=\
NAMESPACE:.metadata.namespace,\
NAME:.metadata.name,\
CPU:.status.recommendation.containerRecommendations[0].target.cpu,\
MEM:.status.recommendation.containerRecommendations[0].target.memory

# Detailed view for a specific namespace
kubectl get vpa -n argocd -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{range .status.recommendation.containerRecommendations[*]}{" "}{.containerName}{": cpu="}{.target.cpu}{" mem="}{.target.memory}{"\n"}{end}{end}'

# Full detail for a specific VPA
kubectl describe vpa <name> -n <namespace>

# Current resource usage vs requests (side-by-side comparison)
kubectl top pods -n <namespace>
kubectl get deploy <name> -n <ns> -o jsonpath='{.spec.template.spec.containers[0].resources}'
kubectl get vpa <name> -n <ns> -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
```

## Reading Recommendations

### The Four VPA Values

VPA recommendations include four values per container:
| Value | Meaning | Use For |
|---|---|---|
| lowerBound | Minimum to avoid throttling/OOM | Red flag if current request is below this |
| target | Optimal request based on observed usage | Set requests: to this value |
| upperBound | Peak observed consumption | Informs limits: setting |
| uncappedTarget | Ideal ignoring any VPA min/max constraints | Same as target when no constraints are set |
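For orientation, the four values appear in a VPA's `.status` like this (field names match the VPA v1 API; the numbers are illustrative):

```yaml
status:
  recommendation:
    containerRecommendations:
      - containerName: server
        lowerBound:
          cpu: 12m
          memory: 92274688      # 88Mi, as raw bytes
        target:
          cpu: 23m
          memory: 183500800     # 175Mi
        uncappedTarget:
          cpu: 23m
          memory: 183500800     # equals target: no min/max constraints set
        upperBound:
          cpu: 100m
          memory: 734003200     # 700Mi
```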
Memory values are in bytes. Quick conversions:

- `104857600` = 100Mi
- `268435456` = 256Mi
- `536870912` = 512Mi
- `1073741824` = 1Gi
- `1610612736` = 1.5Gi
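If you'd rather not memorize those, a one-line shell helper (a convenience sketch, not part of the repo's scripts) does the conversion:

```shell
# Convert raw VPA memory bytes to Mi (1Mi = 1048576 bytes, integer division)
bytes_to_mi() { echo "$(( $1 / 1048576 ))Mi"; }

bytes_to_mi 104857600    # prints 100Mi
bytes_to_mi 1610612736   # prints 1536Mi (= 1.5Gi)
```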
## Decision Matrix

| Situation | Action | Why |
|---|---|---|
| Current request < lowerBound | INCREASE NOW | Pod is being throttled or OOM-killed |
| Current request < target | INCREASE | Under-provisioned, degraded performance |
| Current request within 20% of target | KEEP | Already well-tuned |
| Current request > 1.5x target | DECREASE | Over-provisioned, wasting resources |
| Current request > 5x target | DECREASE | Heavily over-provisioned |
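The matrix can be expressed as a small shell function. This is a hypothetical helper (not part of the repo) that takes the current request, VPA lowerBound, and VPA target as millicore integers:

```shell
# Classify a current CPU request against VPA lowerBound and target (millicores).
# Thresholds mirror the decision matrix above.
classify_request() {
  local current=$1 lower=$2 target=$3
  if   (( current < lower )); then
    echo "INCREASE NOW"    # below lowerBound: throttled / OOM risk
  elif (( current * 100 >= target * 80 && current * 100 <= target * 120 )); then
    echo "KEEP"            # within 20% of target: well-tuned
  elif (( current < target )); then
    echo "INCREASE"        # under-provisioned
  elif (( current * 10 > target * 15 )); then
    echo "DECREASE"        # more than 1.5x target: over-provisioned
  else
    echo "KEEP"
  fi
}

classify_request 500 1021 2048   # prints INCREASE NOW (below lowerBound)
classify_request 5000 12 23      # prints DECREASE (heavily over-provisioned)
```

The within-20% check runs before the under/over checks so that a request slightly below target still counts as well-tuned.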
## Timing

- Wait at least 7 days before trusting VPA numbers. Initial recommendations are noisy.
- Review weekly, not daily. Over-correcting defeats the purpose.
- Re-check after major changes (new features, traffic spikes, version upgrades). VPA is backward-looking.
- Upper bounds stabilize over ~14 days. They’ll be very wide initially.
## Applying Changes (GitOps Workflow)

### For Infrastructure Namespaces (updateMode: Off)

1. Read the VPA recommendation (`./scripts/vpa-report.sh` or the Grafana dashboard)
2. Update the app's `values.yaml` with new resource requests
3. Add a comment documenting the VPA data and reasoning:

   ```yaml
   # VPA-optimized (YYYY-MM-DD)
   # VPA target: cpu Xm, memory Y
   # Previous: cpu Am (reason for change)
   resources:
     requests:
       cpu: Xm       # Match VPA target
       memory: Y     # Match VPA target + buffer
     limits:
       cpu: 2Xm      # 2x request for burst
       memory: 2Y    # 2x request for spikes
   ```

4. Git commit and push — ArgoCD applies via GitOps
### For User App Namespaces (updateMode: Initial)

No manual action needed. VPA automatically:
- Watches pod resource usage and calculates optimal requests
- When a pod is recreated (deploy, scale, rollout), the admission controller sets optimal resources
- Running pods are not modified — changes take effect on next restart
## Setting Requests vs Limits

| Field | Rule of Thumb |
|---|---|
| `requests.cpu` | VPA target (or 1.1-1.2x for buffer) |
| `requests.memory` | VPA target (or 1.2-1.5x — memory OOM is fatal, CPU throttling is not) |
| `limits.cpu` | 2-4x request (allows burst). Or omit entirely to let pods burst freely. |
| `limits.memory` | 2-4x request (or match VPA upperBound if spikes are expected) |
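Applying those rules of thumb to a sample VPA target of cpu 23m / memory 175Mi gives a block like this (illustrative values, not a real recommendation):

```yaml
resources:
  requests:
    cpu: 25m         # ~1.1x VPA target (23m)
    memory: 256Mi    # ~1.5x VPA target (175Mi): OOM is fatal, so pad memory
  limits:
    cpu: 100m        # 4x request for burst (or omit to let the pod burst freely)
    memory: 512Mi    # 2x request to absorb spikes
```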
## Common Workload Patterns

### CPU-Bound (Helm rendering, image processing)

High CPU target, low memory target. Increase CPU generously, keep memory modest.

```
Example: argocd-repo-server
  VPA target: cpu 2975m, memory 523Mi
  Action:     cpu 3000m request, memory 768Mi request
```

### Memory-Bound (Databases, caches)
Low CPU target, high memory target. Increase memory, keep CPU low.

```
Example: Redis
  VPA target: cpu 23m, memory 100Mi
  Action:     cpu 50m request, memory 128Mi request
```

### Idle/Lightweight (UI servers, webhooks)
Both CPU and memory very low. Set modest requests with generous limits for occasional spikes.

```
Example: argocd-server
  VPA target: cpu 23m, memory 175Mi
  Action:     cpu 50m request, memory 256Mi request
```

### GPU Workloads
VPA only tracks CPU/memory, not GPU. Recommendations will show low CPU/memory because the heavy compute happens on the GPU, not the CPU. Set CPU/memory based on data-loading needs, not inference. GPU workloads automatically get `updateMode: "Off"` via the Kyverno policy.
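A pod spec for such a workload might look like the sketch below. The container name and numbers are hypothetical; `runtimeClassName: nvidia` is the field the Kyverno policy matches on:

```yaml
spec:
  runtimeClassName: nvidia         # triggers updateMode: "Off" in the policy
  containers:
    - name: inference              # hypothetical container
      resources:
        requests:
          cpu: 500m                # sized for data loading, not inference
          memory: 2Gi
        limits:
          nvidia.com/gpu: 1        # the GPU itself, invisible to VPA
```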
## Real-World Example: ArgoCD Optimization

### Before (manual guesswork)

```
controller:     cpu: 1000m, memory: 1Gi    # UNDER-PROVISIONED (below lowerBound!)
repo-server:    cpu: 1000m, memory: 1Gi    # UNDER-PROVISIONED 3x
server:         cpu: 500m,  memory: 512Mi  # OVER-PROVISIONED 20x
applicationSet: cpu: 250m,  memory: 256Mi  # OVER-PROVISIONED 5x
redis:          cpu: 100m,  memory: 128Mi  # OVER-PROVISIONED 4x

Total: 2.85 CPU, 2.9Gi memory
```

### VPA Said

```
controller:     target: 2048m CPU, 1.25Gi memory (lowerBound: 1021m > current 1000m!)
repo-server:    target: 2975m CPU, 523Mi memory
server:         target: 23m CPU,   175Mi memory
applicationSet: target: 49m CPU,   100Mi memory
redis:          target: 23m CPU,   100Mi memory
```

### After (VPA-optimized)

```
controller:     cpu: 2000m, memory: 4Gi    # DOUBLED CPU, quadrupled memory
repo-server:    cpu: 3000m, memory: 768Mi  # TRIPLED CPU, halved memory
server:         cpu: 50m,   memory: 512Mi  # REDUCED 10x CPU
applicationSet: cpu: 100m,  memory: 128Mi  # REDUCED 2.5x
redis:          cpu: 50m,   memory: 128Mi  # REDUCED 2x

Total: 5.2 CPU, 5.5Gi memory
```

Result: +2.35 CPU where it was needed (controller/repo-server), memory properly sized, no more throttling.
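The totals above can be sanity-checked with plain shell arithmetic (CPU in millicores, memory in Mi):

```shell
# "After" CPU requests: controller + repo-server + server + applicationSet + redis
echo "$(( 2000 + 3000 + 50 + 100 + 50 ))m"    # prints 5200m (= 5.2 CPU)

# "After" memory requests in Mi (4Gi = 4096Mi)
echo "$(( 4096 + 768 + 512 + 128 + 128 ))Mi"  # prints 5632Mi (= 5.5Gi)
```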
See `infrastructure/controllers/argocd/values.yaml` for the actual implementation with inline VPA documentation.
## Initial Mode (K8s 1.35)

This cluster runs K8s v1.35.4, where In-Place Pod Resize is GA. VPA uses `updateMode: "Initial"`, which sets optimal resources at pod creation time only — it does not resize or evict running pods.
### How It Works
Section titled “How It Works”- VPA Admission Controller intercepts new pod creation
- Sets resource requests based on current VPA recommendation
- Pod starts with right-sized resources from the beginning
- Running pods are NOT modified — changes apply on next pod restart/recreation
### Namespace Strategy

The Kyverno vpa-auto-generate policy sets update modes automatically:

- Infrastructure/monitoring: `Off` — changes go through GitOps review
- User apps: `Initial` — resources set at pod creation
- GPU workloads: `Off` — VPA can't manage GPU resources
## Kyverno VPA Policy

### How It Works

The vpa-auto-generate ClusterPolicy (`infrastructure/controllers/kyverno/policies/vpa-auto-generate.yaml`) watches for Deployment, StatefulSet, and DaemonSet resources and generates a matching VPA.
Three rules:

- generate-vpa-infra-off: infrastructure/monitoring namespaces get `updateMode: "Off"`
- generate-vpa-gpu-off: GPU workloads (`runtimeClassName: nvidia`) get `updateMode: "Off"`
- generate-vpa-auto-tune: everything else gets `updateMode: "Initial"`
Generated VPAs have ownerReferences set to the parent workload, so they’re automatically cleaned up when the workload is deleted.
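For readers unfamiliar with Kyverno generate rules, here is a heavily simplified sketch of what the auto-tune rule can look like. The real policy lives at the path above and also handles the namespace routing, GPU matching, and ownerReference wiring, so treat every detail below as illustrative:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: vpa-auto-generate
spec:
  rules:
    - name: generate-vpa-auto-tune
      match:
        any:
          - resources:
              kinds: [Deployment, StatefulSet, DaemonSet]
      exclude:
        any:
          - resources:
              namespaces: [kube-system, kyverno, volsync-system]
      generate:
        apiVersion: autoscaling.k8s.io/v1
        kind: VerticalPodAutoscaler
        name: "{{request.object.metadata.name}}"
        namespace: "{{request.object.metadata.namespace}}"
        synchronize: true
        data:
          spec:
            targetRef:
              apiVersion: apps/v1
              kind: "{{request.object.kind}}"
              name: "{{request.object.metadata.name}}"
            updatePolicy:
              updateMode: "Initial"
```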
### Excluded Namespaces

- `kube-system` — excluded from all rules
- `kyverno` — excluded from all rules (prevents circular dependency)
- `volsync-system` — excluded from all rules (transient mover jobs)
### Checking Generated VPAs

```shell
# See all Kyverno-managed VPAs
kubectl get vpa -A -l app.kubernetes.io/managed-by=kyverno

# Check a specific VPA's update mode
kubectl get vpa -n immich -o jsonpath='{.items[0].spec.updatePolicy.updateMode}'
# Expected: Initial (resources set at pod creation)

kubectl get vpa -n argocd -o jsonpath='{.items[0].spec.updatePolicy.updateMode}'
# Expected: Off
```

## Troubleshooting
Section titled “Troubleshooting”No recommendations showing
Section titled “No recommendations showing”- VPA needs ~5-10 minutes for initial data, 24+ hours for accuracy
- Check metrics-server:
kubectl top nodes(should return data) - Check VPA recommender:
kubectl logs -n vertical-pod-autoscaler -l app.kubernetes.io/component=recommender
### VPAs not being created

- Check the Kyverno background controller: `kubectl get pods -n kyverno`
- Check Kyverno logs: `kubectl logs -n kyverno -l app.kubernetes.io/component=background-controller`
- Verify the policy is ready: `kubectl get clusterpolicy vpa-auto-generate`
- Check that the VPA CRDs are installed: `kubectl get crd verticalpodautoscalers.autoscaling.k8s.io`
### VPA recommendations seem too high/low

- Not enough data — wait 7-14 days
- Workload changed recently — VPA is backward-looking
- Check `upperBound` for peak-usage context
- Batch/cron workloads have spiky usage — use `upperBound` for limits
### Pods OOMKilled after applying VPA

- VPA target reflects steady-state usage, not initialization spikes
- Set `limits.memory` well above `requests.memory` (2-4x)
- Check startup memory with `kubectl top pod` during pod init
### Duplicate VPA resources

- Kyverno is the sole VPA creator — if you see duplicates, check for manually created VPAs
- Remove any hand-crafted VPA manifests from Git and let Kyverno manage them
## Grafana Dashboard

A community VPA dashboard is auto-provisioned in Grafana under the Infrastructure folder:
| Dashboard | Grafana.com ID | What It Shows |
|---|---|---|
| K8s Autoscaling VPA | 22168 | Cluster overview with drill-down to pod-level VPA details (target, lower/upper bounds) |
URL: https://grafana.vanillax.me → search for “VPA”
This dashboard reads VPA metrics exposed by kube-state-metrics Custom Resource State (`kube_customresource_verticalpodautoscaler_*`). Combined with `vpa-report.sh`, you have two ways to view VPA data:
- Grafana VPA dashboard — time-series graphs and historical trends
- CLI — `./scripts/vpa-report.sh` for quick terminal output
## Quick Reference

```shell
# Human-readable VPA report
./scripts/vpa-report.sh
./scripts/vpa-report.sh <namespace>

# All VPA recommendations (cluster-wide)
kubectl get vpa -A -o custom-columns=\
NS:.metadata.namespace,\
NAME:.metadata.name,\
CPU:.status.recommendation.containerRecommendations[0].target.cpu,\
MEM:.status.recommendation.containerRecommendations[0].target.memory

# Current resource usage vs requests
kubectl top pods -n <namespace>

# Compare current requests vs VPA target
kubectl get deploy <name> -n <ns> -o jsonpath='{.spec.template.spec.containers[0].resources}'
kubectl get vpa <name> -n <ns> -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'

# Check Kyverno VPA policy
kubectl get clusterpolicy vpa-auto-generate
kubectl describe clusterpolicy vpa-auto-generate

# Check VPA recommender
kubectl logs -n vertical-pod-autoscaler -l app.kubernetes.io/component=recommender

# List Kyverno-managed VPAs
kubectl get vpa -A -l app.kubernetes.io/managed-by=kyverno

# Monitor VPA auto-resize events
kubectl get events -A --field-selector reason=VpaUpdated
```

## Related Docs
Section titled “Related Docs”- Monitoring README — metrics-server vs Prometheus pipelines
- VPA component README
- Kyverno VPA policy
Last Updated: 2026-04-17 · Cluster: talos-prod-cluster (K8s v1.35.4, Talos v1.13.0-rc.0)