inference-harness/router
Abiba fb1d51b93b restructure: routing prioritized by reasoning requirements
Tier 1 (Lightweight): VLM → Dense → MoE     ≤500 tok, 1 turn
Tier 2 (Simple):      VLM → Dense → MoE     ≤15K tok, ≤12 turns (was 10K/10)
Tier 3 (Medium):      Dense → VLM → MoE     ≤25K tok
Tier 4 (Heavy):       MoE → Dense → VLM     >25K tok (MoE PRIMARY workhorse)
Tier 5 (Default):     MoE → Dense → VLM     fallback tier, MoE primary

Target: MoE ~50% (heavy primary), VLM ~25% (raised Simple limits + fallback),
        Dense ~25% (medium primary + heavy fallback)

Removed the turn limit from the Medium tier; the Simple tier now handles
conversational requests of up to 12 turns.
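
A minimal sketch of how the tier table above could be evaluated. This is
not the router's actual code. The names Request, classify and route are
hypothetical, and so are the assumptions that tiers are checked in order,
that the limits are inclusive, and that the Default tier applies when the
token count is unknown.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Backend preference order per tier, copied from the tier table.
TIER_BACKENDS = {
    "lightweight": ("VLM", "Dense", "MoE"),
    "simple": ("VLM", "Dense", "MoE"),
    "medium": ("Dense", "VLM", "MoE"),
    "heavy": ("MoE", "Dense", "VLM"),
    "default": ("MoE", "Dense", "VLM"),
}


@dataclass
class Request:
    # Hypothetical request shape: token count (None if unknown) and turn count.
    tokens: Optional[int]
    turns: int = 1


def classify(req: Request) -> str:
    """Return the tier name for a request, checking tiers from lightest up."""
    if req.tokens is None:
        # Assumption: Default is used when the request cannot be sized.
        return "default"
    if req.tokens <= 500 and req.turns <= 1:
        return "lightweight"
    if req.tokens <= 15_000 and req.turns <= 12:
        return "simple"
    if req.tokens <= 25_000:
        # Medium has no turn limit after this change.
        return "medium"
    return "heavy"


def route(req: Request) -> Tuple[str, ...]:
    """Return the backends to try, in order, for a request."""
    return TIER_BACKENDS[classify(req)]
```

Under these assumptions, a 10K-token request with 13 turns exceeds the
Simple turn limit and falls through to Medium, so Dense is tried first.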
2026-05-27 07:22:30 +00:00