Cross-agent GPU awareness ensures that Tanko and Mumuni never hit the MoE GPU at the same time: whichever agent arrives second always overflows to Dense or VLM. Because distinct agents never pile onto MoE, it can safely use its extra VRAM to run 2 slots.
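The routing rule described in this commit can be illustrated with a minimal sketch. This is not the harness's actual code: the `Backend` and `Router` classes, the `acquire`/`release` methods, the dense-then-VLM overflow order, and the single slot on each overflow backend are all assumptions made for illustration. Only the 2 MoE slots and the "second agent overflows" rule come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical model of one GPU backend with a fixed number of slots."""
    name: str
    slots: int
    # Maps agent name -> number of in-flight requests on this backend.
    active: dict = field(default_factory=dict)

    def has_free_slot(self) -> bool:
        return sum(self.active.values()) < self.slots


class Router:
    """Hypothetical cross-agent router: at most one distinct agent on MoE."""

    def __init__(self) -> None:
        self.moe = Backend("moe", slots=2)  # 2 slots, per the commit message
        # Overflow order and slot counts are assumptions.
        self.overflow = [Backend("dense", slots=1), Backend("vlm", slots=1)]
        self._by_name = {b.name: b for b in [self.moe, *self.overflow]}

    def acquire(self, agent: str) -> str:
        # MoE is off limits while a *different* agent has requests in flight.
        other_agent_on_moe = any(a != agent for a in self.moe.active)
        if not other_agent_on_moe and self.moe.has_free_slot():
            chosen = self.moe
        else:
            free = [b for b in self.overflow if b.has_free_slot()]
            if not free:
                raise RuntimeError("no free slot on any overflow backend")
            chosen = free[0]
        chosen.active[agent] = chosen.active.get(agent, 0) + 1
        return chosen.name

    def release(self, agent: str, backend_name: str) -> None:
        backend = self._by_name[backend_name]
        backend.active[agent] -= 1
        if backend.active[agent] == 0:
            # Drop the entry so the agent no longer blocks others from MoE.
            del backend.active[agent]


if __name__ == "__main__":
    router = Router()
    print(router.acquire("Tanko"))   # first agent lands on MoE
    print(router.acquire("Mumuni"))  # second agent overflows
    print(router.acquire("Tanko"))   # same agent may take the second MoE slot
```

In this sketch both MoE slots can only ever be held by the same agent, which is one way to read why the extra VRAM for a second slot is safe to use.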
Description
SyslogAI Inference Harness — 3-GPU router, dashboard, LiteLLM proxy
Languages
Python 97.6%
Shell 1.9%
Dockerfile 0.5%