T

Abiba (pi) 7b6c6aabe1 Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

- Complexity-based routing (MoE default, Dense heavy, Gemma light)
- Per-agent API keys with metrics tracking
- Time-series usage graphs (24h/7d/30d)
- Streaming support (SSE passthrough)
- Unicode cleanup (ASCII-only output)
- Vision support (gemma-4-E4B)
- Tier enforcement (starter/professional/enterprise)
- GPU health monitoring via sidecar polling
- Unified dashboard with line graph

2026-05-16 18:51:50 +00:00

dashboard

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

nginx

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

router

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

.gitignore

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

docker-compose.yml

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

litellm_config.yaml

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

README.md

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

README.md

syslog-harness — Inference API Harness

CT 116 Docker stack for routing local GPU models through a unified OpenAI-compatible API.

Architecture

nginx :80 → router :9000 → GPU backends
                ├─ qwen3.6-35B-A3B (MoE) @ 192.168.68.15:8080
                ├─ qwen3.6-27B-code (Dense) @ 192.168.68.8:8080
                └─ gemma-4-E4B (Light) @ 192.168.68.110:8080

LiteLLM :8081 (fallback) | Dashboard :3000 | Redis :6379 (local)

Deploy

cd /opt/inference-harness
docker compose up -d

Endpoints

URL	Purpose
`/v1/chat/completions`	Inference API (OpenAI-compatible)
`/v1/models`	Available models
`/`	Dashboard (GPU health, routing, agents, timeseries)

Agent API Keys

Agent	Key
Abiba	`sk-syslog-abiba`
Mumuni	`sk-syslog-mumuni`
Tanko	`sk-syslog-tanko`
Koby	`sk-syslog-koby`
Kagenz0	`sk-syslog-kagenz0`