# Qwen3.5-27B for hipfire
Pre-quantized Qwen3.5-27B (DeltaNet hybrid) for hipfire, a Rust-native LLM inference engine for AMD RDNA GPUs.
Quantized from Qwen/Qwen3.5-27B.
## Files
| File | Quant | Size | Min VRAM | Speed (5700 XT) |
|---|---|---|---|---|
| qwen3.5-27b.q4.hfq | HFQ4 | 14.3GB | 16GB | TBD |
| qwen3.5-27b.hfq6.hfq | HFQ6 | 21.4GB | 24GB | TBD |
## GPU Compatibility
| GPU | VRAM | HFQ4 | HFQ6 |
|---|---|---|---|
| RX 5700 XT | 8GB | No | No |
| RX 6800 XT | 16GB | Yes | No |
| RX 7900 XTX | 24GB | Yes | Yes |
| RX 9070 | 16GB | Yes | No |
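
The compatibility table reduces to two VRAM thresholds. A minimal sketch of the selection logic, assuming the minimum-VRAM figures above (16 GB for HFQ4, 24 GB for HFQ6); `pick_quant` is a hypothetical helper for illustration, not part of the hipfire CLI:

```rust
/// Pick the largest quant that fits a given VRAM budget, per the
/// compatibility table (hypothetical helper, not a hipfire API).
fn pick_quant(vram_gb: u32) -> Option<&'static str> {
    if vram_gb >= 24 {
        Some("HFQ6") // best quality
    } else if vram_gb >= 16 {
        Some("HFQ4") // best speed
    } else {
        None // 8 GB cards cannot fit either quant
    }
}

fn main() {
    assert_eq!(pick_quant(24), Some("HFQ6")); // RX 7900 XTX
    assert_eq!(pick_quant(16), Some("HFQ4")); // RX 6800 XT / RX 9070
    assert_eq!(pick_quant(8), None);
}
```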
## Usage
```bash
# Install hipfire
curl -L https://raw.githubusercontent.com/Kaden-Schutt/hipfire/master/scripts/install.sh | bash

# Pull and run
hipfire pull qwen3.5:27b
hipfire run qwen3.5:27b "Hello"
```
## Quantization Formats
- HFQ4: 4-bit, 256-weight quantization groups (0.53 bytes/weight). Best speed.
- HFQ6: 6-bit, 256-weight quantization groups (0.78 bytes/weight). Best quality, ~15% slower.
Both include embedded tokenizer and model config.
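
The bytes-per-weight figures above predict the file sizes. A minimal sketch, assuming sizes scale as parameter count × bytes per weight in decimal GB (the small surplus in the published HFQ6 file presumably comes from the embedded tokenizer and config):

```rust
/// Approximate file size from a bytes-per-weight figure (decimal GB).
fn approx_size_gb(params: f64, bytes_per_weight: f64) -> f64 {
    params * bytes_per_weight / 1e9
}

fn main() {
    let params = 27e9; // Qwen3.5-27B
    println!("HFQ4: {:.1} GB", approx_size_gb(params, 0.53)); // ~14.3 GB
    println!("HFQ6: {:.1} GB", approx_size_gb(params, 0.78)); // ~21.1 GB before metadata
}
```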
## About hipfire
Rust + HIP inference engine for AMD consumer GPUs (RDNA1–RDNA4). No Python in the hot path. Up to 9x faster than llama.cpp with ROCm on the same hardware.
- GitHub: Kaden-Schutt/hipfire
- All models: docs/MODELS.md
## License
The model weights are subject to the original Qwen license; the hipfire engine is MIT-licensed.