15c474aea0
Previously it picked the least-loaded GPU globally, ignoring priority order. Now it tries candidates in order: MoE → VLM → Dense. Only falls back to least-loaded when ALL candidates are busy.