b7882b2434f2d558acf3fc3906772920089fa3aa
RTX 3090 was at 94.9% VRAM at 262K context. Reduced to 192K (196608), freeing ~2.4GB. VRAM now at 85% with room for active inference.
Description
SyslogAI Inference Harness — 3-GPU router, dashboard, LiteLLM proxy
Languages
Python
97.6%
Shell
1.9%
Dockerfile
0.5%