7 Commits

Author SHA1 Message Date
Abiba 911fdc9f3f fix: routing priority — MoE first, VLM second, Dense last 2026-05-19 17:38:29 +00:00
Abiba d9d2c213f6 fix: routing — remove turn limit from default tier, no gaps 2026-05-19 17:24:41 +00:00
Abiba 6625892908 feat: redesigned routing tiers — VLM handles more traffic 2026-05-19 17:01:58 +00:00
Abiba fcb99a26c8 revert: remove Ollama endpoints 2026-05-19 16:57:05 +00:00
Abiba 2234d03079 fix: add /v1/props and /v1/models/<id> endpoints 2026-05-19 16:08:58 +00:00
Abiba 5b99b16712 feat: add request queuing to router (replaces hard 503) 2026-05-19 15:55:13 +00:00
Abiba 28fc57c5c7 May 19, 2026: Full harness update 2026-05-19 15:03:47 +00:00
    - Model migration: gemma-4-E4B → qwen3.5-9b-vlm
    - Dashboard reorder: Usage Over Time + GPU Metrics to top
    - Router counter leak fix (gpu_decr in except handler)
    - VLM slot upgrade 1→2
    - Automated maintenance cron job
    - LiteLLM config update