fix: reduce 27B Dense context to 192K to free VRAM
RTX 3090 was at 94.9% VRAM at 262K context. Reduced to 192K (196608), freeing ~2.4GB. VRAM now at 85% with room for active inference.
This commit is contained in:
+1
-1
@@ -27,7 +27,7 @@ GPU_MAX_CONCURRENT = {
|
||||
# Context window sizes (tokens) — used for compaction signals
|
||||
GPU_CONTEXT = {
|
||||
"qwen3.6-35B-A3B": 262144,
|
||||
"qwen3.6-27B-code": 262144,
|
||||
"qwen3.6-27B-code": 196608,
|
||||
"qwen3.5-9b-vlm": 262144,
|
||||
}
|
||||
|
||||
|
||||
Reference in New Issue
Block a user