Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis
- Complexity-based routing (MoE default, Dense heavy, Gemma light) - Per-agent API keys with metrics tracking - Time-series usage graphs (24h/7d/30d) - Streaming support (SSE passthrough) - Unicode cleanup (ASCII-only output) - Vision support (gemma-4-E4B) - Tier enforcement (starter/professional/enterprise) - GPU health monitoring via sidecar polling - Unified dashboard with line graph
This commit is contained in:
@@ -0,0 +1,5 @@
|
||||
__pycache__/
|
||||
*.pyc
|
||||
.env
|
||||
redis-data/
|
||||
ssl/
|
||||
Reference in New Issue
Block a user