inference-harness

Abiba f47c3f3304 feat: latency vs prompt size scatter plot on dashboard

Router: new /metrics/scatter endpoint returns individual data points
(prompt_tokens, inference_ms, model, agent, reason, stream)
for scatter visualization.
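
As a rough illustration of what one data point from /metrics/scatter might
look like, the sketch below models the six fields named above. The field
types, the `ScatterPoint` and `to_payload` names, and the `"points"` wrapper
key are assumptions made for illustration; the commit does not specify the
response schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ScatterPoint:
    # Field names come from the commit message; the types are assumed.
    prompt_tokens: int
    inference_ms: float
    model: str
    agent: str
    reason: str
    stream: bool


def to_payload(points):
    """Serialize points as JSON under a hypothetical "points" key."""
    return json.dumps({"points": [asdict(p) for p in points]})


# Made-up sample values, only to show the shape of one record.
example = [ScatterPoint(1200, 850.0, "model-a", "agent-x", "default", True)]
payload = to_payload(example)
```

Returning individual records rather than aggregates keeps every request
visible as its own point, which is what a scatter plot needs.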

Dashboard: new panel showing latency vs prompt size by model.
- Log-scale X axis (prompt tokens) with model color coding
- Dropdown to filter by individual model or view all
- Hover tooltips with details per point
- Auto-refresh every 30s
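
The filtering and log-scale behaviour listed above could be sketched roughly
as follows. This is not the dashboard's actual code: it is written in Python
for illustration only, and `series_by_model` and the dict-shaped points are
hypothetical. It groups points per model (one colour per series), optionally
keeps a single model as the dropdown would, and converts prompt tokens to
log10 for the X axis.

```python
import math
from collections import defaultdict


def series_by_model(points, model=None):
    """Group points into per-model series of (log10(prompt_tokens), inference_ms).

    `model=None` corresponds to the "view all" dropdown choice.
    """
    series = defaultdict(list)
    for p in points:
        if model is not None and p["model"] != model:
            continue  # dropdown filter: keep only the selected model
        if p["prompt_tokens"] <= 0:
            continue  # a log-scale axis cannot place non-positive values
        x = math.log10(p["prompt_tokens"])
        series[p["model"]].append((x, p["inference_ms"]))
    return dict(series)


# Made-up sample points, only to exercise the function.
sample = [
    {"model": "model-a", "prompt_tokens": 100, "inference_ms": 900.0},
    {"model": "model-b", "prompt_tokens": 10, "inference_ms": 120.0},
    {"model": "model-a", "prompt_tokens": 0, "inference_ms": 50.0},
]
all_series = series_by_model(sample)
only_a = series_by_model(sample, model="model-a")
```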

Enables direct observation of the relationship between context length
and latency, which helps validate routing tier decisions.

2026-05-26 12:18:31 +00:00

dashboard

feat: latency vs prompt size scatter plot on dashboard

2026-05-26 12:18:31 +00:00

nginx

feat: per-request performance tracking + /metrics/performance endpoint

2026-05-25 16:50:45 +00:00

router

feat: latency vs prompt size scatter plot on dashboard

2026-05-26 12:18:31 +00:00

.gitignore

May 19, 2026: Full harness update