Target A100 40GB runtime (AWQ Int4 thinker, no offload) 4cd0c53 verified STBack23 committed 4 days ago
L4 24GB: max_memory CPU offload + sdpa default for model load 0344903 verified STBack23 committed 4 days ago
Fix UnboundLocalError in attn_candidates dedup (transformers_qwen.py) 9388174 verified STBack23 committed 4 days ago