inference-harness/router
Abiba f47c3f3304 feat: latency vs prompt size scatter plot on dashboard
Router: new /metrics/scatter endpoint returns individual data points
(prompt_tokens, inference_ms, model, agent, reason, stream)
for scatter visualization.
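
A minimal sketch of how the per-request points for /metrics/scatter could be assembled, assuming the router keeps request logs as dictionaries. The function name `scatter_points`, the optional `model` filter, and the `limit` cap are hypothetical and not taken from the router code; only the six field names come from this commit.

```python
from typing import Iterable, Optional

# Field names as listed in the commit message.
FIELDS = ("prompt_tokens", "inference_ms", "model", "agent", "reason", "stream")


def scatter_points(records: Iterable[dict],
                   model: Optional[str] = None,
                   limit: int = 1000) -> list:
    """Return one dict per logged request, restricted to the scatter fields.

    Assumptions (hypothetical): records arrive oldest first, and a record
    without prompt_tokens or inference_ms cannot be plotted, so it is skipped.
    """
    points = []
    for rec in records:
        if rec.get("prompt_tokens") is None or rec.get("inference_ms") is None:
            continue
        if model is not None and rec.get("model") != model:
            continue
        # Copy only the six plotted fields; anything else in the log is dropped.
        points.append({name: rec.get(name) for name in FIELDS})
    # Keep the most recent `limit` points so the payload stays bounded.
    return points[-limit:] if limit > 0 else []
```

Returning raw points rather than aggregates leaves binning and scaling to the dashboard, at the cost of a larger response; the cap is one way to bound that.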

Dashboard: new panel showing latency vs prompt size by model.
- Log-scale X axis (prompt tokens) with model color coding
- Dropdown to filter by individual model or view all
- Hover tooltips with details per point
- Auto-refresh every 30s
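
The panel's data preparation can be sketched as follows, in Python for brevity even though the dashboard itself may be written in another language. The function name `group_for_scatter` and the `"all"` sentinel for the dropdown are assumptions made for illustration.

```python
import math
from collections import defaultdict


def group_for_scatter(points, selected_model="all"):
    """Group points by model and map prompt_tokens onto a log10 X axis.

    Returns {model: [(log10(prompt_tokens), inference_ms), ...]} so that
    each model can be drawn as its own colored series.
    """
    series = defaultdict(list)
    for p in points:
        # Dropdown filter: "all" (hypothetical sentinel) keeps every model.
        if selected_model != "all" and p["model"] != selected_model:
            continue
        # A log-scale axis cannot place zero or negative token counts.
        if p["prompt_tokens"] <= 0:
            continue
        series[p["model"]].append(
            (math.log10(p["prompt_tokens"]), p["inference_ms"])
        )
    return dict(series)
```

Most charting libraries can apply the log scale themselves; computing it by hand here only makes the transformation explicit.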

Enables direct observation of the relationship between context length
and latency, which helps validate routing tier decisions.
2026-05-26 12:18:31 +00:00