Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis

- Complexity-based routing (MoE default, Dense heavy, Gemma light)
- Per-agent API keys with metrics tracking
- Time-series usage graphs (24h/7d/30d)
- Streaming support (SSE passthrough)
- Unicode cleanup (ASCII-only output)
- Vision support (gemma-4-E4B)
- Tier enforcement (starter/professional/enterprise)
- GPU health monitoring via sidecar polling
- Unified dashboard with line graph
This commit is contained in:
Abiba (pi)
2026-05-16 18:51:50 +00:00
commit 7b6c6aabe1
11 changed files with 749 additions and 0 deletions
+3
View File
@@ -0,0 +1,3 @@
flask==3.1.*
redis==5.2.*
requests==2.32.*