Fix hidden_size: 4096 -> 3584 to match Qwen2.5-Coder-7B-Instruct 691fc84 Faaz committed on 30 days ago
Fix Phase 2: fusion layer processes text-only via learnable residual gate for gradient flow 4e9835e Faaz committed on about 1 month ago
Fix: register LLM as nn.Module submodule so optimizer finds LoRA params cdc806e Faaz committed on about 1 month ago
Add GPU diagnostic script, fix architecture loading with low_cpu_mem_usage and sync 5fb9ec3 Faaz committed on about 1 month ago