Project Nomad: Kubernetes + OpenAI-Compatible LLM Provider
Date: 2026-03-16
Status: Draft
Scope: Full. Covers fork changes (LLMProvider abstraction + k8s manifests + CI) and this repo (ExternalSecret + deployment updates)
Overview
Add OpenAI-compatible LLM backend support to mitchross/project-nomad so it works with llama-cpp (and any OpenAI-compatible server). Add Kubernetes manifests with Kustomize for modular deployment of the full Nomad stack (all 9 services). Every optional service supports BYO (bring your own: point Nomad at an existing instance via URL) or deploy (leave its entry enabled in kustomization.yaml). Update the existing deployment in this repo to use ExternalSecrets and the new configuration.
Two Repositories, Two Workstreams
| Repo | Changes | Branch |
|---|---|---|
| mitchross/project-nomad | LLMProvider abstraction, k8s manifests, GitHub Actions | feature/openai-k8s |
| talos-argocd-proxmox | ExternalSecret, updated configmap, image reference | claude/install-project-nomad-XmxFL |
Part 1: Fork Changes (mitchross/project-nomad)
1A. LLMProvider Abstraction
Goal: Replace direct Ollama SDK usage with a provider interface. Two implementations: OllamaProvider (existing behavior) and OpenAIProvider (llama-cpp compatible).
New Files

    admin/app/services/llm/
    ├── llm_provider.ts      # Interface definition
    ├── ollama_provider.ts   # Wraps existing Ollama SDK logic
    ├── openai_provider.ts   # OpenAI-compatible HTTP client
    └── provider_factory.ts  # Creates provider based on env config

Interface Design

    export interface ChatMessage {
      role: 'system' | 'user' | 'assistant'
      content: string
    }

    export interface ChatRequest {
      model: string
      messages: ChatMessage[]
      stream?: boolean
      options?: {
        temperature?: number
        num_ctx?: number
        num_predict?: number
      }
    }

    export interface ChatResponseChunk {
      content: string
      done: boolean
      thinking?: string // For models that support thinking
    }

    export interface EmbeddingResult {
      embeddings: number[][]
    }

    export interface ModelInfo {
      name: string
      size?: number
      modified_at?: string
      details?: Record<string, unknown>
    }

    export interface LLMProvider {
      // Core capabilities (required)
      chat(request: ChatRequest): Promise<string>
      chatStream(request: ChatRequest): AsyncGenerator<ChatResponseChunk>
      embed(model: string, input: string | string[]): Promise<EmbeddingResult>
      listModels(): Promise<ModelInfo[]>

      // Model management (optional, Ollama only)
      supportsModelManagement(): boolean
      pullModel?(
        name: string,
        onProgress?: (status: string, completed?: number, total?: number) => void
      ): Promise<void>
      deleteModel?(name: string): Promise<void>
      showModel?(name: string): Promise<Record<string, unknown> | null>

      // Provider info
      readonly providerName: string
    }

Provider Factory

    export function createLLMProvider(): LLMProvider {
      const provider = env.get('LLM_PROVIDER', 'ollama') // 'ollama' | 'openai'
      const host = env.get('LLM_HOST')

      switch (provider) {
        case 'openai':
          return new OpenAIProvider({
            baseURL: host, // e.g., http://llama-cpp:8080/v1
            apiKey: env.get('LLM_API_KEY', 'unused'),
            embeddingModel: env.get('EMBEDDING_MODEL', 'nomic-embed-text:v1.5'),
            embeddingDimensions: parseInt(env.get('EMBEDDING_DIMENSIONS', '768')),
          })
        case 'ollama':
        default:
          return new OllamaProvider({ host })
      }
    }

OpenAI Provider Implementation

    // Uses fetch(), so no extra npm dependency is needed
    // Targets /v1/chat/completions, /v1/embeddings, /v1/models
    export class OpenAIProvider implements LLMProvider {
      readonly providerName = 'openai'

      constructor(private config: {
        baseURL: string
        apiKey: string
        embeddingModel: string
        embeddingDimensions: number
      }) {}

      async chat(request: ChatRequest): Promise<string> {
        const response = await fetch(`${this.config.baseURL}/chat/completions`, {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${this.config.apiKey}`,
          },
          body: JSON.stringify({
            model: request.model,
            messages: request.messages,
            temperature: request.options?.temperature,
            max_tokens: request.options?.num_predict,
            stream: false,
          }),
        })
        // Fail loudly on HTTP errors rather than on a missing `choices` field
        if (!response.ok) throw new Error(`chat request failed: HTTP ${response.status}`)
        const data = await response.json()
        return data.choices[0].message.content
      }

      async *chatStream(request: ChatRequest): AsyncGenerator<ChatResponseChunk> {
        // SSE streaming via fetch + ReadableStream
        // Parse "data: {...}" lines, yield { content, done }
      }

      async embed(model: string, input: string | string[]): Promise<EmbeddingResult> {
        const response = await fetch(`${this.config.baseURL}/embeddings`, {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${this.config.apiKey}`,
          },
          body: JSON.stringify({
            model: model || this.config.embeddingModel,
            input: Array.isArray(input) ? input : [input],
          }),
        })
        if (!response.ok) throw new Error(`embeddings request failed: HTTP ${response.status}`)
        const data = await response.json()
        return { embeddings: data.data.map((d: any) => d.embedding) }
      }

      async listModels(): Promise<ModelInfo[]> {
        const response = await fetch(`${this.config.baseURL}/models`, {
          headers: { 'Authorization': `Bearer ${this.config.apiKey}` },
        })
        if (!response.ok) throw new Error(`models request failed: HTTP ${response.status}`)
        const data = await response.json()
        return data.data.map((m: any) => ({ name: m.id, size: 0 }))
      }

      // OpenAI-compatible servers expose no pull/delete/show endpoints
      supportsModelManagement(): boolean {
        return false
      }
    }

Refactoring Existing Services
OllamaService → LLMService:
- Move Ollama-specific logic into OllamaProvider
- LLMService holds an LLMProvider instance from the factory
- Docker-based service discovery moves into OllamaProvider only (env-based fallback)
- LLM_HOST env var bypasses Docker discovery entirely
RagService changes:
- Replace this.ollamaService.ollama.embed(...) with this.llmService.provider.embed(...)
- Make EMBEDDING_MODEL and EMBEDDING_DIMENSION configurable via env vars
- Keep Nomic-specific prefixes (search_document:, search_query:) as defaults, configurable via env
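A minimal sketch of how the configurable embedding settings above might be read and applied. The env var names EMBEDDING_DOCUMENT_PREFIX and EMBEDDING_QUERY_PREFIX, the trailing space after each default prefix, and the helpers loadEmbeddingConfig, prefixInputs and assertDimensions are all assumptions made for illustration; none of them come from the existing codebase.

```typescript
// Hypothetical helpers; names and env vars below are illustrative assumptions.
type Env = Record<string, string | undefined>

interface EmbeddingConfig {
  model: string
  dimensions: number
  documentPrefix: string
  queryPrefix: string
}

// Read embedding settings from env, falling back to the Nomic defaults named in this document.
function loadEmbeddingConfig(env: Env): EmbeddingConfig {
  const dimensions = Number.parseInt(env.EMBEDDING_DIMENSIONS ?? '768', 10)
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error(`EMBEDDING_DIMENSIONS must be a positive integer, got "${env.EMBEDDING_DIMENSIONS}"`)
  }
  return {
    model: env.EMBEDDING_MODEL || 'nomic-embed-text:v1.5',
    dimensions,
    // Assumed env var names; an empty string disables the prefix.
    documentPrefix: env.EMBEDDING_DOCUMENT_PREFIX ?? 'search_document: ',
    queryPrefix: env.EMBEDDING_QUERY_PREFIX ?? 'search_query: ',
  }
}

// Prepend the configured prefix before sending texts to provider.embed(...).
function prefixInputs(texts: string[], kind: 'document' | 'query', config: EmbeddingConfig): string[] {
  const prefix = kind === 'document' ? config.documentPrefix : config.queryPrefix
  return texts.map((text) => `${prefix}${text}`)
}

// Reject vectors whose length differs from the configured dimension,
// since they would not fit the existing Qdrant collection.
function assertDimensions(vectors: number[][], config: EmbeddingConfig): void {
  for (const vector of vectors) {
    if (vector.length !== config.dimensions) {
      throw new Error(`Embedding has ${vector.length} dimensions, expected ${config.dimensions}`)
    }
  }
}
```

Checking the vector length right after each embed call would surface a model/collection mismatch at write time instead of as a failed or meaningless search later.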
OllamaController changes:
- Rename to ChatController (or keep for backwards compat)
- Use LLMService instead of direct OllamaService
- Streaming handler adapts to unified ChatResponseChunk format
- Model management endpoints: return 501 if !provider.supportsModelManagement()
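The 501 rule for model management endpoints could be sketched as a small guard like the one below. It is framework-agnostic on purpose: ManagedProvider is a cut-down stand-in for the LLMProvider interface, and HttpResult and modelManagementGuard are hypothetical names, not part of the existing controller.

```typescript
// Cut-down stand-in for LLMProvider: only the members this guard needs.
interface ManagedProvider {
  readonly providerName: string
  supportsModelManagement(): boolean
}

// Hypothetical response shape; a real controller would use its framework's response object.
interface HttpResult {
  status: number
  body: Record<string, unknown>
}

// Returns a 501 result when the active provider cannot manage models,
// or null when the request may proceed to pull/delete/show.
function modelManagementGuard(provider: ManagedProvider): HttpResult | null {
  if (provider.supportsModelManagement()) return null
  return {
    status: 501,
    body: {
      error: `Model management is not supported by the "${provider.providerName}" provider`,
    },
  }
}
```

A pull, delete or show action would call the guard first and return its result whenever it is non-null, so the provider's optional methods are only reached when they exist.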
BenchmarkService changes:
- Use LLMService for inference calls
- Model detection (size parsing from name) stays as-is
New Environment Variables

    # LLM Provider Configuration
    LLM_PROVIDER=openai                    # 'ollama' or 'openai' (default: ollama)
    LLM_HOST=http://llama-cpp:8080/v1      # Base URL for LLM API
    LLM_API_KEY=unused                     # API key (unused for local servers)

    # Embedding Configuration
    EMBEDDING_MODEL=nomic-embed-text:v1.5  # Model name for embeddings
    EMBEDDING_DIMENSIONS=768               # Vector dimensions (must match Qdrant collection)

    # Legacy (still works, maps to LLM_HOST if LLM_PROVIDER not set)
    OLLAMA_HOST=http://ollama:11434        # Backwards compatible

    # Companion Service URLs (default = in-cluster, override for BYO)
    KIWIX_URL=http://kiwix:8080            # Override with external URL for BYO
    KOLIBRI_URL=http://kolibri:8080        # Override with external URL for BYO
    PROTOMAPS_URL=http://protomaps:8080
    CYBERCHEF_URL=http://cyberchef:8080
    FLATNOTES_URL=http://flatnotes:8080

BYO pattern: If the URL env var is set, Nomad’s UI links/iframes point to that external URL. If empty and the service is deployed in-cluster, it auto-resolves to <service>.project-nomad.svc.cluster.local.
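The BYO resolution rule could look roughly like the sketch below. The function name resolveServiceUrl, the http scheme, and the default port of 8080 (taken from the companion service defaults above) are assumptions for illustration only.

```typescript
type Env = Record<string, string | undefined>

// Companion services and their URL env vars, as listed in this document.
const COMPANION_SERVICES = {
  kiwix: 'KIWIX_URL',
  kolibri: 'KOLIBRI_URL',
  protomaps: 'PROTOMAPS_URL',
  cyberchef: 'CYBERCHEF_URL',
  flatnotes: 'FLATNOTES_URL',
} as const

type CompanionService = keyof typeof COMPANION_SERVICES

// Hypothetical resolver: an explicit URL wins (BYO), otherwise fall back
// to the in-cluster DNS name <service>.<namespace>.svc.cluster.local.
function resolveServiceUrl(
  service: CompanionService,
  env: Env,
  namespace = 'project-nomad',
  port = 8080
): string {
  const override = env[COMPANION_SERVICES[service]]?.trim()
  if (override) {
    // Drop trailing slashes so callers can append paths consistently
    return override.replace(/\/+$/, '')
  }
  return `http://${service}.${namespace}.svc.cluster.local:${port}`
}
```

Treating a whitespace-only value the same as an unset one keeps an accidentally blank ConfigMap entry from producing a broken link.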
Backwards compatibility: If LLM_PROVIDER is not set but OLLAMA_HOST is, default to Ollama provider with that host.
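One way to express this fallback is a small resolution function that runs before the factory. The name resolveLLMConfig and the tie-break when both OLLAMA_HOST and LLM_HOST are set without LLM_PROVIDER are assumptions, not decisions recorded in this spec.

```typescript
type Env = Record<string, string | undefined>

interface ResolvedLLMConfig {
  provider: 'ollama' | 'openai'
  host: string | undefined
}

// Hypothetical resolution order:
// 1. LLM_PROVIDER unset: legacy behaviour, Ollama with OLLAMA_HOST
//    (assumed tie-break: OLLAMA_HOST is preferred over LLM_HOST).
// 2. LLM_PROVIDER set: use it with LLM_HOST; as in the factory's switch,
//    anything other than 'openai' falls back to Ollama.
// An undefined host leaves discovery to the provider itself.
function resolveLLMConfig(env: Env): ResolvedLLMConfig {
  const requested = env.LLM_PROVIDER?.trim().toLowerCase()
  if (!requested) {
    return { provider: 'ollama', host: env.OLLAMA_HOST || env.LLM_HOST || undefined }
  }
  const provider = requested === 'openai' ? 'openai' : 'ollama'
  return { provider, host: env.LLM_HOST || undefined }
}
```

With this in place, an existing deployment that only sets OLLAMA_HOST keeps working unchanged, while setting LLM_PROVIDER opts in to the new variables.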
1B. Kubernetes Manifests
Location: k8s/ directory in the fork
Full Service Matrix
All 9 services are in scope and will be deployed. Each supports BYO (set URL env var to use an external instance instead of deploying).
| Service | Image | BYO Config | Ports | Storage |
|---|---|---|---|---|
| Nomad (admin) | ghcr.io/mitchross/project-nomad:main | — | 8080 | PVC 10Gi (uploads/ZIM) |
| MySQL | mysql:8.0 | DB_HOST, DB_PORT, DB_USER | 3306 | PVC 10Gi |
| Redis | redis:7-alpine | REDIS_HOST, REDIS_PORT | 6379 | — |
| Qdrant | qdrant/qdrant:latest | QDRANT_HOST | 6333, 6334 | PVC 5Gi |
| Kiwix | ghcr.io/kiwix/kiwix-serve:3.8.1 | KIWIX_URL | 8080 | PVC (shares Nomad’s ZIM dir) |
| Kolibri | learningequality/kolibri:latest | KOLIBRI_URL | 8080 | PVC 10Gi |
| ProtoMaps | protomaps/go-pmtiles:latest | PROTOMAPS_URL | 8080 | PVC (map tiles) |
| CyberChef | ghcr.io/gchq/cyberchef:latest | CYBERCHEF_URL | 8080 | — |
| FlatNotes | dullage/flatnotes:latest | FLATNOTES_URL | 8080 | PVC 1Gi |
BYO vs Deploy Pattern
For every service, the user has two choices:
- BYO — Already have it running? Set the URL env var in the ConfigMap. Don’t include the service in kustomization.yaml.
- Deploy — Don’t have it? Include the service directory in kustomization.yaml. The ConfigMap defaults point to the in-cluster service.
The ConfigMap always has the URL vars. The Nomad app always reads them. The only question is: does the URL point to an external service or an in-cluster one?

    # configmap.yaml: service URLs section (all default to in-cluster services)
    data:
      # Core services
      DB_HOST: "mysql"                          # Override for BYO MySQL
      REDIS_HOST: "redis"                       # Override for BYO Redis
      QDRANT_HOST: "http://qdrant:6333"         # Override for BYO Qdrant

      # Companion services (all deployed by default, override URL for BYO)
      KIWIX_URL: "http://kiwix:8080"            # Override for BYO Kiwix
      KOLIBRI_URL: "http://kolibri:8080"        # Override for BYO Kolibri
      PROTOMAPS_URL: "http://protomaps:8080"    # Override for BYO ProtoMaps
      CYBERCHEF_URL: "http://cyberchef:8080"    # Override for BYO CyberChef
      FLATNOTES_URL: "http://flatnotes:8080"    # Override for BYO FlatNotes

Directory Structure

    k8s/
    ├── base/
    │   ├── nomad/
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   ├── configmap.yaml        # All env vars including service URLs
    │   │   └── kustomization.yaml
    │   ├── mysql/
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   ├── pvc.yaml
    │   │   └── kustomization.yaml
    │   ├── redis/
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   └── kustomization.yaml
    │   ├── qdrant/
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   ├── pvc.yaml
    │   │   └── kustomization.yaml
    │   ├── kiwix/                    # Optional: offline Wikipedia
    │   │   ├── deployment.yaml       # Serves ZIM files on port 8080
    │   │   ├── service.yaml
    │   │   └── kustomization.yaml    # NOTE: shares Nomad's storage PVC for ZIM files
    │   ├── kolibri/                  # Optional: Khan Academy
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   ├── pvc.yaml
    │   │   └── kustomization.yaml
    │   ├── protomaps/                # Optional: offline maps
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   ├── pvc.yaml
    │   │   └── kustomization.yaml
    │   ├── cyberchef/                # Optional: data tools (stateless)
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   └── kustomization.yaml
    │   ├── flatnotes/                # Optional: note-taking
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   ├── pvc.yaml
    │   │   └── kustomization.yaml
    │   ├── namespace.yaml
    │   └── kustomization.yaml        # Toggle services here
    └── overlays/
        └── production/               # Example production overlay
            ├── kustomization.yaml    # Patches for production
            └── patches/
                └── nomad-config.yaml # Override service URLs for BYO

Design principles:
- Each service is a separate Kustomize component in base/
- All 9 services are included by default; comment out + set BYO URL to use external
- ConfigMap always has URL vars; deploying a service just means the URL points to the in-cluster name
- BYO = set URL in overlay patch, comment out the service directory
- base/ uses generic defaults (no cluster-specific values)
- overlays/production/ shows how to customize for a specific cluster
- No Helm: pure Kustomize as requested
Base kustomization.yaml (full stack — all services deployed):

    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    namespace: project-nomad

    resources:
      - namespace.yaml
      # Core
      - nomad/
      - mysql/       # Comment out + set DB_HOST for BYO
      - redis/       # Comment out + set REDIS_HOST for BYO
      - qdrant/      # Comment out + set QDRANT_HOST for BYO
      # Companion services
      - kiwix/       # Comment out + set KIWIX_URL for BYO
      - kolibri/     # Comment out + set KOLIBRI_URL for BYO
      - protomaps/   # Comment out + set PROTOMAPS_URL for BYO
      - cyberchef/   # Comment out + set CYBERCHEF_URL for BYO
      - flatnotes/   # Comment out + set FLATNOTES_URL for BYO

Example BYO overlay (user has external Kiwix + Redis, deploys everything else):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: project-nomad-config
    data:
      KIWIX_URL: "http://192.168.10.50:8080"                       # BYO Kiwix on LAN
      REDIS_HOST: "redis.my-other-namespace.svc.cluster.local"    # BYO Redis

Qdrant deployment (new; currently not in Docker compose for management, but Nomad uses it for RAG):
- Image: qdrant/qdrant:latest
- Port: 6333 (HTTP) + 6334 (gRPC)
- PVC: 5Gi for vector storage
- No GPU needed
1C. GitHub Actions

    .github/workflows/
    ├── build.yaml   # Build Docker image on push/PR; push to GHCR on non-PR events
    └── test.yaml    # Run tests (existing or new)

build.yaml:

    name: Build and Push Docker Image

    on:
      push:
        branches: [main, develop, 'feature/**']
        tags: ['v*']
      pull_request:
        branches: [main]

    env:
      REGISTRY: ghcr.io
      IMAGE_NAME: ${{ github.repository }}

    jobs:
      build:
        runs-on: ubuntu-latest
        permissions:
          contents: read
          packages: write

        steps:
          - uses: actions/checkout@v4

          - uses: docker/setup-buildx-action@v3

          - uses: docker/login-action@v3
            with:
              registry: ${{ env.REGISTRY }}
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}

          - uses: docker/metadata-action@v5
            id: meta
            with:
              images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
              tags: |
                type=ref,event=branch
                type=ref,event=pr
                type=semver,pattern={{version}}
                type=sha

          - uses: docker/build-push-action@v5
            with:
              context: ./admin
              push: ${{ github.event_name != 'pull_request' }}
              tags: ${{ steps.meta.outputs.tags }}
              labels: ${{ steps.meta.outputs.labels }}
              cache-from: type=gha
              cache-to: type=gha,mode=max

Result: Images published to ghcr.io/mitchross/project-nomad:<tag>
Part 2: This Repo (talos-argocd-proxmox)
2A. ExternalSecret Migration
Replace my-apps/home/project-nomad/secret.yaml with externalsecret.yaml:

    apiVersion: external-secrets.io/v1
    kind: ExternalSecret
    metadata:
      name: project-nomad-secrets
      namespace: project-nomad
    spec:
      refreshInterval: "1h"
      secretStoreRef:
        kind: ClusterSecretStore
        name: 1password
      target:
        name: project-nomad-secrets
        creationPolicy: Owner
      data:
        - secretKey: APP_KEY
          remoteRef:
            key: project-nomad       # 1Password item name
            property: app_key
        - secretKey: DB_PASSWORD
          remoteRef:
            key: project-nomad
            property: db_password
        - secretKey: MYSQL_ROOT_PASSWORD
          remoteRef:
            key: project-nomad
            property: db_password    # Same value for both

1Password item to create: project-nomad in homelab-prod vault with fields:
- app_key: random 32+ char string
- db_password: MySQL root password
2B. Updated ConfigMap

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: project-nomad-config
      namespace: project-nomad
    data:
      PORT: "8080"
      HOST: "0.0.0.0"
      LOG_LEVEL: "info"
      NODE_ENV: "production"
      SESSION_DRIVER: "cookie"
      DB_HOST: "mysql"
      DB_PORT: "3306"
      DB_USER: "root"
      DB_DATABASE: "nomad"
      DB_NAME: "nomad"
      DB_SSL: "false"
      REDIS_HOST: "redis-master.redis-instance.svc.cluster.local"
      REDIS_PORT: "6379"
      NOMAD_STORAGE_PATH: "/opt/project-nomad/storage"
      URL: "https://nomad.vanillax.me"
      # LLM Provider Configuration
      LLM_PROVIDER: "openai"
      LLM_HOST: "http://llama-cpp-service.llama-cpp.svc.cluster.local:8080/v1"
      LLM_API_KEY: "unused"
      # Embedding (uses llama-cpp server)
      EMBEDDING_MODEL: ""              # Use server default
      EMBEDDING_DIMENSIONS: "768"
      # Qdrant
      QDRANT_HOST: "http://qdrant.project-nomad.svc.cluster.local:6333"
      # Companion services (all deployed in-cluster, override for BYO)
      KIWIX_URL: "http://kiwix.project-nomad.svc.cluster.local:8080"
      KOLIBRI_URL: "http://kolibri.project-nomad.svc.cluster.local:8080"
      PROTOMAPS_URL: "http://protomaps.project-nomad.svc.cluster.local:8080"
      CYBERCHEF_URL: "http://cyberchef.project-nomad.svc.cluster.local:8080"
      FLATNOTES_URL: "http://flatnotes.project-nomad.svc.cluster.local:8080"

2C. Updated Deployment
- Change image from ghcr.io/crosstalk-solutions/project-nomad:latest to ghcr.io/mitchross/project-nomad:main
- Remove OLLAMA_HOST from configmap (replaced by LLM_HOST)
- Add QDRANT_HOST env var
2D. Kustomization Update

    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    namespace: project-nomad

    resources:
      - namespace.yaml
      - externalsecret.yaml    # Was: secret.yaml
      - configmap.yaml
      # Core
      - pvc.yaml
      - deployment.yaml
      - service.yaml
      - httproute.yaml
      # MySQL
      - mysql-deployment.yaml
      - mysql-service.yaml
      - mysql-pvc.yaml
      # Qdrant
      - qdrant-deployment.yaml
      - qdrant-service.yaml
      - qdrant-pvc.yaml
      # Redis
      - redis-deployment.yaml
      - redis-service.yaml
      # Kiwix
      - kiwix-deployment.yaml
      - kiwix-service.yaml
      # Kolibri
      - kolibri-deployment.yaml
      - kolibri-service.yaml
      - kolibri-pvc.yaml
      # ProtoMaps
      - protomaps-deployment.yaml
      - protomaps-service.yaml
      - protomaps-pvc.yaml
      # CyberChef
      - cyberchef-deployment.yaml
      - cyberchef-service.yaml
      # FlatNotes
      - flatnotes-deployment.yaml
      - flatnotes-service.yaml
      - flatnotes-pvc.yaml

Implementation Order
Phase 1: Fork — LLMProvider Abstraction
- Create admin/app/services/llm/ directory with interface + factory
- Implement OllamaProvider (extract from existing OllamaService)
- Implement OpenAIProvider (new, fetch-based)
- Refactor OllamaService → LLMService to use provider
- Update RagService to use LLMService + configurable embedding model
- Update OllamaController streaming to use unified format
- Update BenchmarkService to use LLMService
- Add new env vars to .env.example
- Update Docker service discovery to be optional (env-based fallback)
Phase 2: Fork — K8s Manifests + CI
- Create k8s/base/ core services: nomad, mysql, redis, qdrant (+ configmap with all service URLs)
- Create k8s/base/ optional services: kiwix, kolibri, protomaps, cyberchef, flatnotes
- Create k8s/overlays/production/ example with BYO patch
- Add Dockerfile improvements if needed (multi-stage, etc.)
- Add .github/workflows/build.yaml for GHCR publishing
- Test image build
Phase 3: This Repo — Deployment Updates
- Create 1Password item project-nomad
- Replace secret.yaml with externalsecret.yaml
- Update configmap.yaml with LLM env vars + all service URLs (pointing to in-cluster defaults)
- Update deployment.yaml image to ghcr.io/mitchross/project-nomad:main
- Update kustomization.yaml
- Commit and push
Risks & Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Embedding model mismatch | Knowledge base vectors incompatible | Make embedding model configurable; document that changing models requires re-embedding |
| llama-cpp doesn’t serve embeddings | RAG broken | llama-cpp supports /v1/embeddings — verify model is loaded with embedding support |
| Streaming format differences | Chat broken | Unified ChatResponseChunk type + thorough testing of both providers |
| Ollama-specific features (thinking, model show) | Feature regression | Graceful degradation — return null/empty when provider doesn’t support |
| Qdrant not deployed | RAG broken | Include Qdrant in k8s base manifests; make RAG optional if Qdrant not reachable |
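For the streaming-format risk, the conversion from OpenAI-style server-sent events to the unified ChatResponseChunk could be sketched as below. It assumes the standard OpenAI streaming shape (lines of the form `data: {json}` carrying `choices[0].delta.content`, ended by `data: [DONE]`); parseOpenAISseLine and SseChunkBuffer are hypothetical names, and the `thinking` field is left unset because this spec does not say how an OpenAI-compatible server would report it.

```typescript
// Mirrors the ChatResponseChunk interface from the Interface Design section.
interface ChatResponseChunk {
  content: string
  done: boolean
  thinking?: string
}

// Hypothetical helper: convert one SSE line into a unified chunk.
// Returns null for lines that carry no chunk (blank lines, comments, malformed JSON).
function parseOpenAISseLine(line: string): ChatResponseChunk | null {
  const trimmed = line.trim()
  if (!trimmed.startsWith('data:')) return null
  const payload = trimmed.slice('data:'.length).trim()
  if (payload === '[DONE]') return { content: '', done: true }
  let parsed: any
  try {
    parsed = JSON.parse(payload)
  } catch {
    return null
  }
  const choice = parsed?.choices?.[0]
  if (!choice) return null
  const content = typeof choice.delta?.content === 'string' ? choice.delta.content : ''
  return { content, done: false }
}

// Buffers partial lines across network reads and returns the chunks completed so far.
class SseChunkBuffer {
  private pending = ''

  push(text: string): ChatResponseChunk[] {
    this.pending += text
    const lines = this.pending.split('\n')
    // The last element is an incomplete line (or empty); keep it for the next read
    this.pending = lines.pop() ?? ''
    const chunks: ChatResponseChunk[] = []
    for (const line of lines) {
      const chunk = parseOpenAISseLine(line)
      if (chunk) chunks.push(chunk)
    }
    return chunks
  }
}
```

A chatStream implementation could decode each ReadableStream read to text, pass it to push(), and yield the returned chunks, which keeps the parsing testable for both providers without a live server.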
Not In Scope
- Changing the upstream project-nomad (only fork changes)
- CNPG migration for MySQL (stays as simple deployment for now)
- GPU support for Nomad itself (it calls external LLM services)
- Automated vector migration tooling (manual re-embed if model changes)
- Modifying Nomad’s Docker-based service management code to use K8s APIs (we bypass it entirely via env vars)