Skip to content

Project Nomad: Kubernetes + OpenAI-Compatible LLM Provider

Date: 2026-03-16 Status: Draft Scope: Full — fork changes (LLMProvider abstraction + k8s manifests + CI) + this repo (ExternalSecret + deployment updates)


Add OpenAI-compatible LLM backend support to mitchross/project-nomad so it works with llama-cpp (and any OpenAI-compatible server). Add Kubernetes manifests with Kustomize for modular deployment of the full Nomad stack (all 9 services). Every optional service supports BYO (bring-your-own via URL) or deploy (uncomment in kustomization.yaml). Update the existing deployment in this repo to use ExternalSecrets and the new configuration.

RepoChangesBranch
mitchross/project-nomadLLMProvider abstraction, k8s manifests, GitHub Actionsfeature/openai-k8s
talos-argocd-proxmoxExternalSecret, updated configmap, image referenceclaude/install-project-nomad-XmxFL

Part 1: Fork Changes (mitchross/project-nomad)

Section titled “Part 1: Fork Changes (mitchross/project-nomad)”

Goal: Replace direct Ollama SDK usage with a provider interface. Two implementations: OllamaProvider (existing behavior) and OpenAIProvider (llama-cpp compatible).

admin/app/services/llm/
├── llm_provider.ts # Interface definition
├── ollama_provider.ts # Wraps existing Ollama SDK logic
├── openai_provider.ts # OpenAI-compatible HTTP client
└── provider_factory.ts # Creates provider based on env config
admin/app/services/llm/llm_provider.ts
export interface ChatMessage {
role: 'system' | 'user' | 'assistant'
content: string
}
export interface ChatRequest {
model: string
messages: ChatMessage[]
stream?: boolean
options?: {
temperature?: number
num_ctx?: number
num_predict?: number
}
}
export interface ChatResponseChunk {
content: string
done: boolean
thinking?: string // For models that support thinking
}
export interface EmbeddingResult {
embeddings: number[][]
}
export interface ModelInfo {
name: string
size?: number
modified_at?: string
details?: Record<string, unknown>
}
export interface LLMProvider {
// Core capabilities (required)
chat(request: ChatRequest): Promise<string>
chatStream(request: ChatRequest): AsyncGenerator<ChatResponseChunk>
embed(model: string, input: string | string[]): Promise<EmbeddingResult>
listModels(): Promise<ModelInfo[]>
// Model management (optional — Ollama only)
supportsModelManagement(): boolean
pullModel?(name: string, onProgress?: (status: string, completed?: number, total?: number) => void): Promise<void>
deleteModel?(name: string): Promise<void>
showModel?(name: string): Promise<Record<string, unknown> | null>
// Provider info
readonly providerName: string
}
admin/app/services/llm/provider_factory.ts
export function createLLMProvider(): LLMProvider {
const provider = env.get('LLM_PROVIDER', 'ollama') // 'ollama' | 'openai'
const host = env.get('LLM_HOST')
switch (provider) {
case 'openai':
return new OpenAIProvider({
baseURL: host, // e.g., http://llama-cpp:8080/v1
apiKey: env.get('LLM_API_KEY', 'unused'),
embeddingModel: env.get('EMBEDDING_MODEL', 'nomic-embed-text:v1.5'),
embeddingDimensions: parseInt(env.get('EMBEDDING_DIMENSIONS', '768')),
})
case 'ollama':
default:
return new OllamaProvider({ host })
}
}
admin/app/services/llm/openai_provider.ts
// Uses fetch() — no extra npm dependency needed
// Targets /v1/chat/completions, /v1/embeddings, /v1/models
export class OpenAIProvider implements LLMProvider {
readonly providerName = 'openai'
constructor(private config: {
baseURL: string
apiKey: string
embeddingModel: string
embeddingDimensions: number
}) {}
async chat(request: ChatRequest): Promise<string> {
const response = await fetch(`${this.config.baseURL}/chat/completions`, {
method: 'POST',
headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${this.config.apiKey}` },
body: JSON.stringify({
model: request.model,
messages: request.messages,
temperature: request.options?.temperature,
max_tokens: request.options?.num_predict,
stream: false,
}),
})
const data = await response.json()
return data.choices[0].message.content
}
async *chatStream(request: ChatRequest): AsyncGenerator<ChatResponseChunk> {
// SSE streaming via fetch + ReadableStream
// Parse "data: {...}" lines, yield { content, done }
}
async embed(model: string, input: string | string[]): Promise<EmbeddingResult> {
const response = await fetch(`${this.config.baseURL}/embeddings`, {
method: 'POST',
headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${this.config.apiKey}` },
body: JSON.stringify({
model: model || this.config.embeddingModel,
input: Array.isArray(input) ? input : [input],
}),
})
const data = await response.json()
return { embeddings: data.data.map((d: any) => d.embedding) }
}
async listModels(): Promise<ModelInfo[]> {
const response = await fetch(`${this.config.baseURL}/models`, {
headers: { 'Authorization': `Bearer ${this.config.apiKey}` },
})
const data = await response.json()
return data.data.map((m: any) => ({ name: m.id, size: 0 }))
}
supportsModelManagement(): boolean { return false }
}

OllamaServiceLLMService:

  • Move Ollama-specific logic into OllamaProvider
  • LLMService holds a LLMProvider instance from factory
  • Docker-based service discovery moves into OllamaProvider only (env-based fallback)
  • LLM_HOST env var bypasses Docker discovery entirely

RagService changes:

  • Replace this.ollamaService.ollama.embed(...) with this.llmService.provider.embed(...)
  • Make EMBEDDING_MODEL and EMBEDDING_DIMENSION configurable via env vars
  • Keep Nomic-specific prefixes (search_document:, search_query:) as defaults, configurable via env

OllamaController changes:

  • Rename to ChatController (or keep for backwards compat)
  • Use LLMService instead of direct OllamaService
  • Streaming handler adapts to unified ChatResponseChunk format
  • Model management endpoints: return 501 if !provider.supportsModelManagement()

BenchmarkService changes:

  • Use LLMService for inference calls
  • Model detection (size parsing from name) stays as-is
# LLM Provider Configuration
LLM_PROVIDER=openai # 'ollama' or 'openai' (default: ollama)
LLM_HOST=http://llama-cpp:8080/v1 # Base URL for LLM API
LLM_API_KEY=unused # API key (unused for local servers)
# Embedding Configuration
EMBEDDING_MODEL=nomic-embed-text:v1.5 # Model name for embeddings
EMBEDDING_DIMENSIONS=768 # Vector dimensions (must match Qdrant collection)
# Legacy (still works, maps to LLM_HOST if LLM_PROVIDER not set)
OLLAMA_HOST=http://ollama:11434 # Backwards compatible
# Companion Service URLs (default = in-cluster, override for BYO)
KIWIX_URL=http://kiwix:8080 # Override with external URL for BYO
KOLIBRI_URL=http://kolibri:8080 # Override with external URL for BYO
PROTOMAPS_URL=http://protomaps:8080
CYBERCHEF_URL=http://cyberchef:8080
FLATNOTES_URL=http://flatnotes:8080

BYO pattern: If the URL env var is set, Nomad’s UI links/iframes point to that external URL. If empty and the service is deployed in-cluster, it auto-resolves to <service>.project-nomad.svc.cluster.local.

Backwards compatibility: If LLM_PROVIDER is not set but OLLAMA_HOST is, default to Ollama provider with that host.

Location: k8s/ directory in the fork

All 9 services are in scope and will be deployed. Each supports BYO (set URL env var to use an external instance instead of deploying).

ServiceImageBYO ConfigPortsStorage
Nomad (admin)ghcr.io/mitchross/project-nomad:main8080PVC 10Gi (uploads/ZIM)
MySQLmysql:8.0DB_HOST, DB_PORT, DB_USER3306PVC 10Gi
Redisredis:7-alpineREDIS_HOST, REDIS_PORT6379
Qdrantqdrant/qdrant:latestQDRANT_HOST6333, 6334PVC 5Gi
Kiwixghcr.io/kiwix/kiwix-serve:3.8.1KIWIX_URL8080PVC (shares Nomad’s ZIM dir)
Kolibrilearningequality/kolibri:latestKOLIBRI_URL8080PVC 10Gi
ProtoMapsprotomaps/go-pmtiles:latestPROTOMAPS_URL8080PVC (map tiles)
CyberChefghcr.io/gchq/cyberchef:latestCYBERCHEF_URL8080
FlatNotesdullage/flatnotes:latestFLATNOTES_URL8080PVC 1Gi

For every service, the user has two choices:

  1. BYO — Already have it running? Set the URL env var in the ConfigMap. Don’t include the service in kustomization.yaml.
  2. Deploy — Don’t have it? Include the service directory in kustomization.yaml. The ConfigMap defaults point to the in-cluster service.

The ConfigMap always has the URL vars. The Nomad app always reads them. The only question is: does the URL point to an external service or an in-cluster one?

# configmap.yaml — service URLs section (all default to in-cluster services)
data:
# Core services
DB_HOST: "mysql" # Override for BYO MySQL
REDIS_HOST: "redis" # Override for BYO Redis
QDRANT_HOST: "http://qdrant:6333" # Override for BYO Qdrant
# Companion services (all deployed by default, override URL for BYO)
KIWIX_URL: "http://kiwix:8080" # Override for BYO Kiwix
KOLIBRI_URL: "http://kolibri:8080" # Override for BYO Kolibri
PROTOMAPS_URL: "http://protomaps:8080" # Override for BYO ProtoMaps
CYBERCHEF_URL: "http://cyberchef:8080" # Override for BYO CyberChef
FLATNOTES_URL: "http://flatnotes:8080" # Override for BYO FlatNotes
k8s/
├── base/
│ ├── nomad/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── configmap.yaml # All env vars including service URLs
│ │ └── kustomization.yaml
│ ├── mysql/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── pvc.yaml
│ │ └── kustomization.yaml
│ ├── redis/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ └── kustomization.yaml
│ ├── qdrant/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── pvc.yaml
│ │ └── kustomization.yaml
│ ├── kiwix/ # Optional — offline Wikipedia
│ │ ├── deployment.yaml # Serves ZIM files on port 8080
│ │ ├── service.yaml
│ │ └── kustomization.yaml # NOTE: shares Nomad's storage PVC for ZIM files
│ ├── kolibri/ # Optional — Khan Academy
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── pvc.yaml
│ │ └── kustomization.yaml
│ ├── protomaps/ # Optional — offline maps
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── pvc.yaml
│ │ └── kustomization.yaml
│ ├── cyberchef/ # Optional — data tools (stateless)
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ └── kustomization.yaml
│ ├── flatnotes/ # Optional — note-taking
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── pvc.yaml
│ │ └── kustomization.yaml
│ ├── namespace.yaml
│ └── kustomization.yaml # Toggle services here
└── overlays/
└── production/ # Example production overlay
├── kustomization.yaml # Patches for production
└── patches/
└── nomad-config.yaml # Override service URLs for BYO

Design principles:

  • Each service is a separate Kustomize component in base/
  • All 9 services included by default — comment out + set BYO URL to use external
  • ConfigMap always has URL vars — deploying a service just means the URL points to the in-cluster name
  • BYO = set URL in overlay patch, comment out the service directory
  • base/ uses generic defaults (no cluster-specific values)
  • overlays/production/ shows how to customize for a specific cluster
  • No Helm — pure Kustomize as requested

Base kustomization.yaml (full stack — all services deployed):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: project-nomad
resources:
- namespace.yaml
# Core
- nomad/
- mysql/ # Comment out + set DB_HOST for BYO
- redis/ # Comment out + set REDIS_HOST for BYO
- qdrant/ # Comment out + set QDRANT_HOST for BYO
# Companion services
- kiwix/ # Comment out + set KIWIX_URL for BYO
- kolibri/ # Comment out + set KOLIBRI_URL for BYO
- protomaps/ # Comment out + set PROTOMAPS_URL for BYO
- cyberchef/ # Comment out + set CYBERCHEF_URL for BYO
- flatnotes/ # Comment out + set FLATNOTES_URL for BYO

Example BYO overlay (user has external Kiwix + Redis, deploys everything else):

overlays/production/patches/nomad-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: project-nomad-config
data:
KIWIX_URL: "http://192.168.10.50:8080" # BYO Kiwix on LAN
REDIS_HOST: "redis.my-other-namespace.svc.cluster.local" # BYO Redis

Qdrant deployment (new — currently not in Docker compose for management, but Nomad uses it for RAG):

  • Image: qdrant/qdrant:latest
  • Port: 6333 (HTTP) + 6334 (gRPC)
  • PVC: 5Gi for vector storage
  • No GPU needed
.github/workflows/
├── build.yaml # Build + push Docker image on push/PR
└── test.yaml # Run tests (existing or new)

build.yaml:

name: Build and Push Docker Image
on:
push:
branches: [main, develop, 'feature/**']
tags: ['v*']
pull_request:
branches: [main]
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
build:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v4
- uses: docker/setup-buildx-action@v3
- uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- uses: docker/metadata-action@v5
id: meta
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=ref,event=pr
type=semver,pattern={{version}}
type=sha
- uses: docker/build-push-action@v5
with:
context: ./admin
push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max

Result: Images published to ghcr.io/mitchross/project-nomad:<tag>


Replace my-apps/home/project-nomad/secret.yaml with externalsecret.yaml:

my-apps/home/project-nomad/externalsecret.yaml
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
name: project-nomad-secrets
namespace: project-nomad
spec:
refreshInterval: "1h"
secretStoreRef:
kind: ClusterSecretStore
name: 1password
target:
name: project-nomad-secrets
creationPolicy: Owner
data:
- secretKey: APP_KEY
remoteRef:
key: project-nomad # 1Password item name
property: app_key
- secretKey: DB_PASSWORD
remoteRef:
key: project-nomad
property: db_password
- secretKey: MYSQL_ROOT_PASSWORD
remoteRef:
key: project-nomad
property: db_password # Same value for both

1Password item to create: project-nomad in homelab-prod vault with fields:

  • app_key — random 32+ char string
  • db_password — MySQL root password
my-apps/home/project-nomad/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: project-nomad-config
namespace: project-nomad
data:
PORT: "8080"
HOST: "0.0.0.0"
LOG_LEVEL: "info"
NODE_ENV: "production"
SESSION_DRIVER: "cookie"
DB_HOST: "mysql"
DB_PORT: "3306"
DB_USER: "root"
DB_DATABASE: "nomad"
DB_NAME: "nomad"
DB_SSL: "false"
REDIS_HOST: "redis-master.redis-instance.svc.cluster.local"
REDIS_PORT: "6379"
NOMAD_STORAGE_PATH: "/opt/project-nomad/storage"
URL: "https://nomad.vanillax.me"
# LLM Provider Configuration
LLM_PROVIDER: "openai"
LLM_HOST: "http://llama-cpp-service.llama-cpp.svc.cluster.local:8080/v1"
LLM_API_KEY: "unused"
# Embedding (uses llama-cpp server)
EMBEDDING_MODEL: "" # Use server default
EMBEDDING_DIMENSIONS: "768"
# Qdrant
QDRANT_HOST: "http://qdrant.project-nomad.svc.cluster.local:6333"
# Companion services (all deployed in-cluster, override for BYO)
KIWIX_URL: "http://kiwix.project-nomad.svc.cluster.local:8080"
KOLIBRI_URL: "http://kolibri.project-nomad.svc.cluster.local:8080"
PROTOMAPS_URL: "http://protomaps.project-nomad.svc.cluster.local:8080"
CYBERCHEF_URL: "http://cyberchef.project-nomad.svc.cluster.local:8080"
FLATNOTES_URL: "http://flatnotes.project-nomad.svc.cluster.local:8080"
  • Change image from ghcr.io/crosstalk-solutions/project-nomad:latest to ghcr.io/mitchross/project-nomad:main
  • Remove OLLAMA_HOST from configmap (replaced by LLM_HOST)
  • Add QDRANT_HOST env var
my-apps/home/project-nomad/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: project-nomad
resources:
- namespace.yaml
- externalsecret.yaml # Was: secret.yaml
- configmap.yaml
# Core
- pvc.yaml
- deployment.yaml
- service.yaml
- httproute.yaml
# MySQL
- mysql-deployment.yaml
- mysql-service.yaml
- mysql-pvc.yaml
# Qdrant
- qdrant-deployment.yaml
- qdrant-service.yaml
- qdrant-pvc.yaml
# Redis
- redis-deployment.yaml
- redis-service.yaml
# Kiwix
- kiwix-deployment.yaml
- kiwix-service.yaml
# Kolibri
- kolibri-deployment.yaml
- kolibri-service.yaml
- kolibri-pvc.yaml
# ProtoMaps
- protomaps-deployment.yaml
- protomaps-service.yaml
- protomaps-pvc.yaml
# CyberChef
- cyberchef-deployment.yaml
- cyberchef-service.yaml
# FlatNotes
- flatnotes-deployment.yaml
- flatnotes-service.yaml
- flatnotes-pvc.yaml

  1. Create admin/app/services/llm/ directory with interface + factory
  2. Implement OllamaProvider (extract from existing OllamaService)
  3. Implement OpenAIProvider (new, fetch-based)
  4. Refactor OllamaServiceLLMService to use provider
  5. Update RagService to use LLMService + configurable embedding model
  6. Update OllamaController streaming to use unified format
  7. Update BenchmarkService to use LLMService
  8. Add new env vars to .env.example
  9. Update Docker service discovery to be optional (env-based fallback)
  1. Create k8s/base/ core services: nomad, mysql, redis, qdrant (+ configmap with all service URLs)
  2. Create k8s/base/ optional services: kiwix, kolibri, protomaps, cyberchef, flatnotes
  3. Create k8s/overlays/production/ example with BYO patch
  4. Add Dockerfile improvements if needed (multi-stage, etc.)
  5. Add .github/workflows/build.yaml for GHCR publishing
  6. Test image build
  1. Create 1Password item project-nomad
  2. Replace secret.yaml with externalsecret.yaml
  3. Update configmap.yaml with LLM env vars + all service URLs (pointing to in-cluster defaults)
  4. Update deployment.yaml image to ghcr.io/mitchross/project-nomad:main
  5. Update kustomization.yaml
  6. Commit and push

RiskImpactMitigation
Embedding model mismatchKnowledge base vectors incompatibleMake embedding model configurable; document that changing models requires re-embedding
llama-cpp doesn’t serve embeddingsRAG brokenllama-cpp supports /v1/embeddings — verify model is loaded with embedding support
Streaming format differencesChat brokenUnified ChatResponseChunk type + thorough testing of both providers
Ollama-specific features (thinking, model show)Feature regressionGraceful degradation — return null/empty when provider doesn’t support
Qdrant not deployedRAG brokenInclude Qdrant in k8s base manifests; make RAG optional if Qdrant not reachable

  • Changing the upstream project-nomad (only fork changes)
  • CNPG migration for MySQL (stays as simple deployment for now)
  • GPU support for Nomad itself (it calls external LLM services)
  • Automated vector migration tooling (manual re-embed if model changes)
  • Modifying Nomad’s Docker-based service management code to use K8s APIs (we bypass it entirely via env vars)