6efd5ff51c
- Added GPU_CONTEXT map (MoE 131K, VLM 131K, Dense 65K)
- Heavy tier now prefers MoE/VLM (131K) over Dense (65K) for large requests
- Response headers: X-Context-Remaining, X-Context-Model
- Routing data includes context_remaining field
- Agents can use this to trigger compaction when nearing limits
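
A rough sketch of how the pieces listed here might fit together, not the actual implementation. Only `GPU_CONTEXT`, the two header names, and the `context_remaining` idea come from this commit. The exact token counts (131K as 131072, 65K as 65536), the lowercase model keys, the preference order, the helper names, and the 20% compaction threshold are all assumptions made for illustration.

```python
# Hypothetical sketch. Token counts assume 131K = 131072 and 65K = 65536.
GPU_CONTEXT = {"moe": 131_072, "vlm": 131_072, "dense": 65_536}

# Assumed preference order for the heavy tier: larger context windows first.
HEAVY_PREFERENCE = ("moe", "vlm", "dense")


def pick_heavy_model(prompt_tokens: int) -> str:
    """Return the first preferred model whose context window fits the request."""
    for model in HEAVY_PREFERENCE:
        if prompt_tokens <= GPU_CONTEXT[model]:
            return model
    raise ValueError("request exceeds every known context window")


def context_headers(model: str, used_tokens: int) -> dict[str, str]:
    """Build the response headers named in this commit."""
    remaining = max(GPU_CONTEXT[model] - used_tokens, 0)
    return {
        "X-Context-Remaining": str(remaining),
        "X-Context-Model": model,
    }


def should_compact(headers: dict[str, str], threshold: float = 0.2) -> bool:
    """Agent-side check: compact once less than `threshold` of the window is left."""
    model = headers["X-Context-Model"]
    remaining = int(headers["X-Context-Remaining"])
    return remaining < threshold * GPU_CONTEXT[model]
```

An agent could call `should_compact` on each response and summarize or drop older turns when it returns true. The threshold is a tuning knob, not something this commit specifies.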