2 Commits

Author SHA1 Message Date
Abiba f42747d721 feat: performance analytics panel on dashboard
dashboard/dashboard.py (+61 lines):
- New /api/performance endpoint that proxies to the router's metrics/performance
- Performance Analytics row with 4 panels:
  - Latency distribution (p50/p95/p99 per model) with stacked bars
  - Throughput comparison (avg + p50 tokens/sec per model)
  - Routing effectiveness table by reason
  - Agent performance bars with latency
- 1h/24h window toggle, auto-refresh every 15s
- Color-coded per model (purple=MoE, amber=Dense, green=VLM)
2026-05-25 16:58:15 +00:00
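
For illustration only, the per-model p50/p95/p99 figures shown in the latency panel could be derived roughly as follows. This is a minimal sketch, not the code in dashboard/dashboard.py: the function names, the (model, latency_ms) sample shape, and the nearest-rank percentile method are all assumptions, since the commit does not say how the router computes its metrics.

```python
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list.

    Integer arithmetic avoids float rounding when computing the rank.
    """
    rank = max(1, (pct * len(sorted_values) + 99) // 100)  # ceil(pct * n / 100)
    return sorted_values[rank - 1]


def latency_summary(samples):
    """Group (model, latency_ms) pairs by model and return p50/p95/p99.

    The sample shape is hypothetical; the real router payload may differ.
    """
    by_model = defaultdict(list)
    for model, latency_ms in samples:
        by_model[model].append(latency_ms)

    summary = {}
    for model, values in by_model.items():
        values.sort()
        summary[model] = {
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
            "p99": percentile(values, 99),
        }
    return summary
```

Calling latency_summary with a list of (model, latency_ms) tuples returns one dict of percentiles per model, which a dashboard could then render as bars.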
Abiba 28fc57c5c7 May 19, 2026: Full harness update
- Model migration: gemma-4-E4B → qwen3.5-9b-vlm
- Dashboard reorder: Usage Over Time + GPU Metrics to top
- Router counter leak fix (gpu_decr now called in the except handler)
- VLM slot upgrade 1→2
- Automated maintenance cron job
- LiteLLM config update
2026-05-19 15:03:47 +00:00
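
As a hedged illustration of the router counter leak fix listed in the commit above: if an in-flight counter is incremented before a request and only decremented on success, every failed request leaves the counter one too high. The sketch below shows the general pattern of decrementing in the except handler. Only the name gpu_decr comes from the commit message; GpuCounter, gpu_incr, and route_request are hypothetical names chosen for this example, and the actual router code may look quite different.

```python
import threading


class GpuCounter:
    """Hypothetical thread-safe counter of in-flight GPU requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self.in_flight = 0

    def gpu_incr(self):
        with self._lock:
            self.in_flight += 1

    def gpu_decr(self):
        with self._lock:
            self.in_flight -= 1


def route_request(counter, handler):
    """Run handler() while tracking it in the in-flight counter."""
    counter.gpu_incr()
    try:
        result = handler()
    except Exception:
        # Release the slot on failure too; without this the counter leaks.
        counter.gpu_decr()
        raise
    counter.gpu_decr()  # normal release on success
    return result
```

A try/finally block would achieve the same effect with a single decrement call; the except-handler form is shown here only because it mirrors the wording of the commit message.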