Replace dual 8B with single 30B-A3B FP8 vision model 706520f KinetoLabs Claude Opus 4.5 committed 2 days ago
Reduce thinking model max_new_tokens to fix slow inference 0699c5f KinetoLabs Claude Opus 4.5 committed 2 days ago
Fix multi-GPU compatibility issues (6 locations) d1901ae KinetoLabs Claude Opus 4.5 committed 2 days ago
Fix multi-GPU support in vendored Qwen3-VL scripts c4bfdfa KinetoLabs Claude Opus 4.5 committed 2 days ago
Fix embedding/reranker loading with official Qwen3-VL classes 455c786 KinetoLabs Claude Opus 4.5 committed 2 days ago