inference-harness/router
Abiba c4ea5e3a98 fix: flip Tier 4 (Heavy) to Dense-first for thermal safety
Dense → MoE → VLM instead of MoE → Dense → VLM.
With MoE capped at 1 concurrent slot, Dense absorbs all
primary traffic. MoE only activates when Dense is saturated.
Prevents Strix Halo from hitting the 94 °C thermal limit.
2026-05-27 00:01:33 +00:00
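
The fallback order described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the router's actual code: the names `Backend`, `pick_backend`, and `TIER4_ORDER` are hypothetical, the Dense and VLM slot counts are assumed, and "saturated" is taken to mean that every slot of a backend is in use. Only the Dense → MoE → VLM order and the single MoE slot come from the commit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Backend:
    """Hypothetical model backend with a fixed number of concurrent slots."""
    name: str
    max_slots: int
    in_flight: int = 0

    def has_capacity(self) -> bool:
        # A backend counts as saturated once every slot is in use (assumption).
        return self.in_flight < self.max_slots


def pick_backend(order: List[str], backends: Dict[str, Backend]) -> Optional[Backend]:
    """Return the first backend in `order` with a free slot, or None if all are full."""
    for name in order:
        backend = backends[name]
        if backend.has_capacity():
            return backend
    return None


# Tier 4 (Heavy) order after this commit: Dense first, then MoE, then VLM.
TIER4_ORDER = ["dense", "moe", "vlm"]

backends = {
    "dense": Backend("dense", max_slots=2),  # slot count is an assumption
    "moe": Backend("moe", max_slots=1),      # 1 concurrent slot, per the commit
    "vlm": Backend("vlm", max_slots=1),      # slot count is an assumption
}
```

With this ordering, `pick_backend(TIER4_ORDER, backends)` keeps returning the Dense backend until its slots are full, and only then falls through to MoE, which is the behavior the commit relies on to keep the heavier MoE path mostly idle.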