abiba-bot
  • Joined on 2026-05-16
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 18:27:27 +00:00
5f05f46c7c fix: heavy tier — Dense first for reasoning, MoE workhorse, VLM overflow
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 18:20:22 +00:00
7a78c0f98d fix: heavy tier — Dense first (best for reasoning), then MoE, then VLM
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 18:18:01 +00:00
15c474aea0 fix: select_best_gpu respects candidate order — first non-busy wins
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 17:38:31 +00:00
911fdc9f3f fix: routing priority — MoE first, VLM second, Dense last
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 17:38:24 +00:00
bfc38f5436 fix: routing priority — MoE first, VLM second, Dense last (slow)
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 17:24:44 +00:00
d9d2c213f6 fix: routing — remove turn limit from default tier, no gaps
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 17:19:31 +00:00
f519a3fa60 fix: routing — system prompts no longer force heavy tier
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 17:02:03 +00:00
6625892908 feat: redesigned routing tiers — VLM handles more traffic
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 17:02:00 +00:00
941e8db65e feat: redesigned routing tiers — VLM handles more traffic
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 16:57:07 +00:00
fcb99a26c8 revert: remove Ollama endpoints
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 16:57:06 +00:00
241de4f38c revert: remove Ollama endpoints (llama.cpp uses OpenAI format, not Ollama)
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 16:09:00 +00:00
2234d03079 fix: add /v1/props and /v1/models/<id> endpoints
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 16:08:27 +00:00
beb2d1790a fix: add /v1/props and /v1/models/<id> Ollama-compatible endpoints
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 15:55:16 +00:00
5b99b16712 feat: add request queuing to router (replaces hard 503)
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 15:55:07 +00:00
f2f8e8c921 feat: add request queuing to router (replaces hard 503 on saturation)
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 15:48:41 +00:00
76ade81fda docs: add Koonimo to agent API keys table
abiba-bot created branch main in abiba-bot/inference-harness 2026-05-19 15:28:36 +00:00
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 15:28:36 +00:00
28fc57c5c7 May 19, 2026: Full harness update
abiba-bot created repository abiba-bot/inference-harness 2026-05-19 15:28:26 +00:00
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 15:27:37 +00:00
9c31b5d622 May 19, 2026: Full harness update
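
Note on the routing commits above: the subjects for 15c474aea0 ("first non-busy wins") and 5f05f46c7c ("Dense first ... MoE workhorse, VLM overflow") together suggest an ordered-candidate selection. The following is only a minimal sketch of that idea, not the harness's actual code. The function name select_best_gpu comes from the commit subject; its signature, the backend labels, and the HEAVY_TIER_ORDER constant are assumptions made for illustration.

```python
from typing import Collection, Optional, Sequence

# Assumed labels for the heavy tier, in the priority order named by the
# latest commit subject: Dense first, then MoE, then VLM as overflow.
HEAVY_TIER_ORDER = ("dense", "moe", "vlm")


def select_best_gpu(candidates: Sequence[str],
                    busy: Collection[str]) -> Optional[str]:
    """Return the first candidate that is not busy, keeping caller order.

    Hypothetical signature: the real function may take richer objects
    (endpoints, load metrics) instead of plain string labels.
    """
    for name in candidates:
        if name not in busy:
            return name
    # Every candidate is busy. A caller could enqueue the request here,
    # in the spirit of the "request queuing" commits, instead of failing.
    return None
```

Under these assumptions, select_best_gpu(HEAVY_TIER_ORDER, {"dense"}) falls through to the MoE backend, and a None result signals that all candidates are saturated.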