abiba-bot
  • Joined on 2026-05-16
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-25 16:50:55 +00:00
b849cd3395 feat: per-request performance tracking + /metrics/performance endpoint
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-25 00:31:53 +00:00
b7882b2434 fix: reduce 27B Dense context to 192K to free VRAM
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-23 06:01:00 +00:00
ddde6646de fix: decouple VRAM usage from saturation status
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-23 05:57:29 +00:00
41939104c7 fix: non-blocking GPU health checks + 256K turboquant context upgrade
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-22 09:48:02 +00:00
5116e4b1a7 router: heavy tier Dense→MoE→Light + X-Context-Warning headers (compact_soon/compact_recommended/compact_urgent)
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-22 06:34:18 +00:00
e55bcef21a router: 4 optimizations — saturated flag fix, heavy tier MoE-first, better token est, session tracking
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 21:24:38 +00:00
32bd817e97 fix: heavy tier back to Dense→MoE→VLM (Dense now 98K)
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 21:24:38 +00:00
0983337fdb fix: heavy tier Dense→MoE→VLM
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 21:20:30 +00:00
79965450bb fix: Dense context 65K→98K, parallel restored to 2
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 21:15:25 +00:00
6c829abef5 fix: variable collision (r = Redis vs Response) in stream handler
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 21:13:59 +00:00
28d62e27ba feat: context-aware routing + compaction signals
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 21:13:58 +00:00
6efd5ff51c feat: context-aware routing + compaction signals
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 21:11:37 +00:00
350a90b524 fix: sync tier 4 default threshold to 50000 tokens (was stale at 4000)
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 21:08:20 +00:00
714ebb003e fix: heavy threshold → 50000 tokens, 25 turns
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 21:08:19 +00:00
3156c093d5 fix: heavy threshold → 50000 tokens, 25 turns (agent contexts are huge)
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 20:10:09 +00:00
e90bf0216d fix: raise heavy threshold — 4000→12000 tokens, 8→15 turns
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 20:10:02 +00:00
3cbf38e3e2 fix: raise heavy threshold — 4000→12000 tokens, 8→15 turns
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 19:17:54 +00:00
b67021ac69 docs: complete design documentation — auth, routing tiers, queue, models, maintenance
abiba-bot pushed to main at abiba-bot/inference-harness 2026-05-19 19:15:14 +00:00
5971ceee4e security: reject requests without valid API key (401)
abiba-bot pushed to main at SyslogSolution/syslog-harness 2026-05-19 19:13:55 +00:00
46dda918de security: reject requests without valid API key (401 instead of defaulting to starter)