34fb7516e13947405f05b16719f194573f7e1334
OLD: only checked whether the CURRENT agent was already on a GPU.
  Tanko → MoE, Mumuni also → MoE (didn't see Tanko).
NEW: checks whether ANY agent is on a GPU (cross-agent awareness).
  Pass 1: prefer GPUs with 0 agents
  Pass 2: prefer a GPU this agent is not already on
  Pass 3: any non-busy GPU
Prevents Tanko and Mumuni from piling onto the same GPU simultaneously, even when both slots are free. Combined with MoE=1 slot, this guarantees that overflow goes to an idle Dense.
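The three-pass selection described in this commit message can be sketched roughly as follows. This is an illustration only, not the repository's actual code: the `Gpu` class, the `pick_gpu` function, the GPU names (`moe`, `dense-0`, `dense-1`), and the slot counts for the Dense GPUs are all assumptions made for the example. Only the pass order, the agent names, and the MoE=1 slot setting come from the message.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Gpu:
    """Hypothetical GPU record; field names are assumptions, not the repo's."""
    name: str
    slots: int                                       # max concurrent requests
    agents: list[str] = field(default_factory=list)  # agents currently running here

    @property
    def busy(self) -> bool:
        # A GPU is busy once every slot is taken.
        return len(self.agents) >= self.slots


def pick_gpu(agent: str, gpus: list[Gpu]) -> Optional[Gpu]:
    """Choose a GPU for `agent`, looking at ALL agents' placements."""
    free = [g for g in gpus if not g.busy]

    # Pass 1: prefer GPUs with 0 agents (nobody at all is running there).
    for g in free:
        if not g.agents:
            return g

    # Pass 2: prefer a GPU this agent is not already on.
    for g in free:
        if agent not in g.agents:
            return g

    # Pass 3: any non-busy GPU, or None if everything is full.
    return free[0] if free else None


if __name__ == "__main__":
    # Assumed pool: one MoE GPU with a single slot, two Dense GPUs with two.
    gpus = [Gpu("moe", slots=1), Gpu("dense-0", slots=2), Gpu("dense-1", slots=2)]
    for agent in ("Tanko", "Mumuni"):
        chosen = pick_gpu(agent, gpus)
        if chosen is not None:
            chosen.agents.append(agent)
            print(f"{agent} -> {chosen.name}")
    # Tanko lands on moe; Mumuni sees Tanko there and goes to dense-0.
```

Because Pass 1 inspects every agent's placement rather than only the caller's, the second agent never lands on a GPU that the first one just took while an empty GPU is still available.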
Description
SyslogAI Inference Harness — 3-GPU router, dashboard, LiteLLM proxy
Languages
Python 97.6%, Shell 1.9%, Dockerfile 0.5%