Abiba c4ea5e3a98 fix: flip Tier 4 (Heavy) to Dense-first for thermal safety
Route Dense → MoE → VLM instead of MoE → Dense → VLM.
With MoE limited to 1 concurrent slot, Dense absorbs all
primary traffic, and MoE activates only when Dense is saturated.
This prevents Strix Halo from hitting the 94 °C thermal limit.
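
The fallback order in this commit can be illustrated with a minimal sketch. It is not the harness's actual router: the `Backend` class, the `pick_backend` and `release` helpers, and the Dense and VLM slot counts are hypothetical names and values chosen for illustration. Only the Dense → MoE → VLM order and the single MoE slot come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A hypothetical inference backend with a fixed number of concurrent slots."""
    name: str
    max_slots: int
    in_flight: int = 0

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_slots


def pick_backend(order):
    """Return the first backend in priority order with a free slot, or None.

    The chosen backend's in-flight counter is incremented, so the caller
    must call release() once the request finishes.
    """
    for backend in order:
        if backend.has_capacity():
            backend.in_flight += 1
            return backend
    return None


def release(backend):
    """Free one slot on a backend after its request completes."""
    if backend.in_flight > 0:
        backend.in_flight -= 1


# Tier 4 (Heavy) order after this commit: Dense first, then MoE, then VLM.
# The Dense and VLM slot counts are assumptions; MoE's single slot is from
# the commit message.
TIER4_ORDER = [
    Backend("dense", max_slots=4),
    Backend("moe", max_slots=1),
    Backend("vlm", max_slots=2),
]
```

Under these assumed slot counts, the first four concurrent requests land on Dense, the fifth spills over to MoE, and later ones fall through to VLM until slots are released. Putting Dense first means the MoE backend sees load only during spillover, which is the behavior the commit relies on for thermal safety.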
2026-05-27 00:01:33 +00:00
2026-05-19 15:03:47 +00:00
Description
SyslogAI Inference Harness — 3-GPU router, dashboard, LiteLLM proxy
371 KiB
Languages
Python 97.6%
Shell 1.9%
Dockerfile 0.5%