Abiba | 0983337fdb | fix: heavy tier Dense→MoE→VLM | 2026-05-19 21:24:36 +00:00
Abiba | 28d62e27ba | feat: context-aware routing + compaction signals | 2026-05-19 21:13:57 +00:00
Abiba | 714ebb003e | fix: heavy threshold → 50000 tokens, 25 turns | 2026-05-19 21:08:18 +00:00
Abiba | e90bf0216d | fix: raise heavy threshold — 4000→12000 tokens, 8→15 turns | 2026-05-19 20:10:07 +00:00
Abiba | 5971ceee4e | security: reject requests without valid API key (401) | 2026-05-19 19:15:13 +00:00
Abiba | 5f05f46c7c | fix: heavy tier — Dense first for reasoning, MoE workhorse, VLM overflow | 2026-05-19 18:27:24 +00:00
Abiba | 911fdc9f3f | fix: routing priority — MoE first, VLM second, Dense last | 2026-05-19 17:38:29 +00:00
Abiba | d9d2c213f6 | fix: routing — remove turn limit from default tier, no gaps | 2026-05-19 17:24:41 +00:00
Abiba | 6625892908 | feat: redesigned routing tiers — VLM handles more traffic | 2026-05-19 17:01:58 +00:00
Abiba | fcb99a26c8 | revert: remove Ollama endpoints | 2026-05-19 16:57:05 +00:00
Abiba | 2234d03079 | fix: add /v1/props and /v1/models/<id> endpoints | 2026-05-19 16:08:58 +00:00
Abiba | 5b99b16712 | feat: add request queuing to router (replaces hard 503) | 2026-05-19 15:55:13 +00:00
Abiba | 28fc57c5c7 | May 19, 2026: Full harness update | 2026-05-19 15:03:47 +00:00
- Model migration: gemma-4-E4B → qwen3.5-9b-vlm
- Dashboard reorder: Usage Over Time + GPU Metrics to top
- Router counter leak fix (gpu_decr in except handler)
- VLM slot upgrade 1→2
- Automated maintenance cron job
- LiteLLM config update