Base model: https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled · Tool: llama.cpp
Mixed quantization with BF16 attention weights (Q/K/V, QKV, output) and embeddings for maximum reasoning fidelity, while applying Q4_K_M to the FFN and SSM layers for efficient compression → ~24 GB at 7.60 BPW.
A good fit for an RTX 5090 with 32 GB of VRAM.
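As a rough sanity check, the ~24 GB figure above follows directly from the parameter count and the average bits-per-weight. A minimal sketch, assuming ~27B parameters (the exact tensor breakdown will differ slightly):

```python
# Back-of-the-envelope size check for the quant: size = params * BPW / 8.
# Assumes ~27B parameters and the 7.60 BPW average reported in the card.
params = 27e9          # approximate parameter count (assumption)
bpw = 7.60             # average bits per weight from the card
size_bytes = params * bpw / 8
print(f"{size_bytes / 2**30:.1f} GiB")  # ~23.9 GiB, matching the ~24 GB claim
```

This also explains why the file comfortably fits in 32 GB of VRAM with room left for the KV cache and activations.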
Downloads last month: 713