941e8db65e
New 4-tier routing:

- TIER 1 (Lightweight): ≤100 words, single-turn → VLM first, fallback Dense
- TIER 2 (Simple Conv): ≤1000 tokens, ≤4 turns → VLM preferred, fallback Dense
- TIER 3 (Heavy): >4000 tokens, system prompts, >8 turns → Dense→MoE→VLM cascade
- TIER 4 (Default): Medium tasks → Dense preferred, MoE default, VLM overflow

VLM gets more utilization for simple conversations instead of defaulting everything to MoE.
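
The tier rules above can be sketched roughly as follows. This is an illustration only, not the code from this commit: `Request`, `classify_tier`, `BACKEND_ORDER`, and `pick_backend` are hypothetical names. It assumes the tiers are checked in the order listed, that the Tier 1 and Tier 2 conditions must all hold, that any one Tier 3 condition is enough, and that "Dense preferred, MoE default, VLM overflow" means trying Dense, then MoE, then VLM.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Request:
    """Hypothetical request summary; field names are assumptions."""
    words: int
    tokens: int
    turns: int
    has_system_prompt: bool = False


def classify_tier(req: Request) -> int:
    """Return the routing tier (1-4), checking tiers in listed order (assumed)."""
    # TIER 1 (Lightweight): <=100 words and single-turn.
    if req.words <= 100 and req.turns == 1:
        return 1
    # TIER 2 (Simple Conv): <=1000 tokens and <=4 turns.
    if req.tokens <= 1000 and req.turns <= 4:
        return 2
    # TIER 3 (Heavy): any one heavy signal is assumed to be sufficient.
    if req.tokens > 4000 or req.has_system_prompt or req.turns > 8:
        return 3
    # TIER 4 (Default): everything else is treated as a medium task.
    return 4


# Backend preference per tier, most preferred first.
# The Tier 4 ordering is one reading of the commit message (assumption).
BACKEND_ORDER = {
    1: ["vlm", "dense"],
    2: ["vlm", "dense"],
    3: ["dense", "moe", "vlm"],
    4: ["dense", "moe", "vlm"],
}


def pick_backend(req: Request, available: Set[str]) -> Optional[str]:
    """Pick the first backend in the tier's preference list that is available."""
    order: List[str] = BACKEND_ORDER[classify_tier(req)]
    for name in order:
        if name in available:
            return name
    return None  # no listed backend is available for this tier


if __name__ == "__main__":
    short = Request(words=50, tokens=70, turns=1)
    print(classify_tier(short), pick_backend(short, {"vlm", "dense", "moe"}))
```

Under these assumptions a short single-turn request lands on the VLM whenever it is available and reaches Dense only as a fallback, which matches the stated goal of not defaulting everything to MoE.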