Abiba
|
9c31b5d622
|
May 19, 2026: Full harness update
- Model migration: gemma-4-E4B → qwen3.5-9b-vlm
- Dashboard reorder: Usage Over Time + GPU Metrics to top
- Router counter leak fix (gpu_decr in except handler)
- VLM slot upgrade 1→2
- Redis stale key cleanup
- Automated maintenance cron job
- LiteLLM config update
- GPU router config update
- README update
|
2026-05-19 15:03:34 +00:00 |
|
Abiba (pi)
|
808c9d3d13
|
Router: 300s timeout, gpu_decr bugfix. Dashboard: Bootstrap 5 modern redesign with KPI stats, equal-height cards, queue ring. Nginx: 600s timeout.
|
2026-05-16 22:12:21 +00:00 |
|
Abiba (pi)
|
9817fe2ef2
|
Dashboard: clean rebuild with Queue Status ring chart, GPU slot indicators, organized layout (GPU/Queue+Model+Agent/Usage/Live)
|
2026-05-16 21:05:19 +00:00 |
|
Abiba (pi)
|
2db2796e53
|
Dashboard: rename to SyslogAI Harness, GPU bar now shows utilization instead of VRAM
|
2026-05-16 19:26:46 +00:00 |
|
Abiba (pi)
|
7b6c6aabe1
|
Initial commit: CT 116 inference harness — nginx, LiteLLM, router, dashboard, Redis
- Complexity-based routing (MoE default, Dense heavy, Gemma light)
- Per-agent API keys with metrics tracking
- Time-series usage graphs (24h/7d/30d)
- Streaming support (SSE passthrough)
- Unicode cleanup (ASCII-only output)
- Vision support (gemma-4-E4B)
- Tier enforcement (starter/professional/enterprise)
- GPU health monitoring via sidecar polling
- Unified dashboard with line graph
|
2026-05-16 18:51:50 +00:00 |
|