Abiba 34fb7516e1 fix: cross-agent GPU spreading prevents hotspot hammering
OLD: checked only if CURRENT agent was on a GPU
  Tanko→MoE, Mumuni also→MoE (didnt see Tanko)

NEW: checks if ANY agent is on a GPU (cross-agent awareness)
  Pass 1: prefer GPUs with 0 agents
  Pass 2: prefer GPU this agent is not already on
  Pass 3: any non-busy GPU

Prevents Tanko+Mumuni piling onto same GPU simultaneously
even when both slots are free. Combined with MoE=1 slot,
guarantees overflow goes to idle Dense.
2026-05-30 12:55:29 +00:00
2026-05-19 15:03:47 +00:00
S
Description
SyslogAI Inference Harness — 3-GPU router, dashboard, LiteLLM proxy
371 KiB
Languages
Python 97.6%
Shell 1.9%
Dockerfile 0.5%