654cdff718bfb53848bfd7dd9767fbf910f73511
syslog-harness — Inference API Harness
CT 116 Docker stack for routing local GPU models through a unified OpenAI-compatible API.
Architecture
nginx :80 → router :9000 → GPU backends
├─ qwen3.6-35B-A3B (MoE) @ 192.168.68.15:8080
├─ qwen3.6-27B-code (Dense) @ 192.168.68.8:8080
└─ gemma-4-E4B (Light) @ 192.168.68.110:8080
LiteLLM :8081 (fallback) | Dashboard :3000 | Redis :6379 (local)
Deploy
cd /opt/inference-harness
docker compose up -d
Endpoints
| URL | Purpose |
|---|---|
/v1/chat/completions |
Inference API (OpenAI-compatible) |
/v1/models |
Available models |
/ |
Dashboard (GPU health, routing, agents, timeseries) |
Agent API Keys
| Agent | Key |
|---|---|
| Abiba | sk-syslog-abiba |
| Mumuni | sk-syslog-mumuni |
| Tanko | sk-syslog-tanko |
| Koby | sk-syslog-koby |
| Kagenz0 | sk-syslog-kagenz0 |
Description
Syslog Operational Agent Harness — Nginx routing, Redis queue, circuit breaker, monitoring, Docker migration
Languages
Python
99.2%
Dockerfile
0.8%