SyslogSolution/syslog-harness

T

Abiba (pi) 654cdff718 Dashboard: GPU slot indicators show active/max concurrent requests. Koonimo API key added. Real-time queuing visibility.

2026-05-16 20:43:22 +00:00

Dashboard: rename to SyslogAI Harness, GPU bar now shows utilization instead of VRAM

2026-05-16 19:26:46 +00:00

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

Add queue service

2026-05-15 21:07:05 +00:00

Dashboard: GPU slot indicators show active/max concurrent requests. Koonimo API key added. Real-time queuing visibility.

2026-05-16 20:43:22 +00:00

.env.example

Add env example

2026-05-15 21:07:34 +00:00

.gitignore

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

docker-compose.yml

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

Dockerfile.dashboard

Add Dockerfile.dashboard

2026-05-15 21:34:52 +00:00

Dockerfile.queue

Add Dockerfile.queue

2026-05-15 21:34:49 +00:00

gpu-router-docker.conf

Update Nginx Docker config

2026-05-15 21:35:13 +00:00

gpu-router.conf

Add Nginx router config

2026-05-15 21:07:33 +00:00

litellm_config.yaml

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

MIGRATION_PLAN.md

Add migration plan

2026-05-15 21:07:32 +00:00

README.md

Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

2026-05-16 18:51:50 +00:00

README.md

syslog-harness — Inference API Harness

CT 116 Docker stack for routing local GPU models through a unified OpenAI-compatible API.

Architecture

nginx :80 → router :9000 → GPU backends
                ├─ qwen3.6-35B-A3B (MoE) @ 192.168.68.15:8080
                ├─ qwen3.6-27B-code (Dense) @ 192.168.68.8:8080
                └─ gemma-4-E4B (Light) @ 192.168.68.110:8080

LiteLLM :8081 (fallback) | Dashboard :3000 | Redis :6379 (local)

Deploy

cd /opt/inference-harness
docker compose up -d

Endpoints

URL	Purpose
`/v1/chat/completions`	Inference API (OpenAI-compatible)
`/v1/models`	Available models
`/`	Dashboard (GPU health, routing, agents, timeseries)

Agent API Keys

Agent	Key
Abiba	`sk-syslog-abiba`
Mumuni	`sk-syslog-mumuni`
Tanko	`sk-syslog-tanko`
Koby	`sk-syslog-koby`
Kagenz0	`sk-syslog-kagenz0`