# Qwen3.5-27B for hipfire

Pre-quantized Qwen3.5-27B (DeltaNet hybrid) for hipfire, a Rust-native LLM inference engine for AMD RDNA GPUs.

Quantized from Qwen/Qwen3.5-27B.

## Files

| File | Quant | Size | Min VRAM | Speed (RX 5700 XT) |
|------|-------|------|----------|--------------------|
| qwen3.5-27b.q4.hfq | HFQ4 | 14.3 GB | 16 GB | TBD |
| qwen3.5-27b.hfq6.hfq | HFQ6 | 21.4 GB | 24 GB | TBD |

## GPU Compatibility

| GPU | VRAM | HFQ4 | HFQ6 |
|-----|------|------|------|
| RX 5700 XT | 8 GB | No | No |
| RX 6800 XT | 16 GB | Yes | No |
| RX 7900 XTX | 24 GB | Yes | Yes |
| RX 9070 | 16 GB | Yes | No |
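The table above reduces to a simple fit rule: the quantized file must fit in VRAM with some headroom left for KV cache and activations. A minimal sketch of that rule, assuming a ~1.7 GB headroom figure (an illustrative assumption, not a number published by hipfire):

```python
# Rough VRAM-fit check mirroring the compatibility table above.
# HEADROOM_GB (space for KV cache + activations) is an assumed value
# chosen for illustration; hipfire's real overhead may differ.
HEADROOM_GB = 1.7

# Quantized file sizes in GB, taken from the Files table.
QUANTS = {"HFQ4": 14.3, "HFQ6": 21.4}

def fits(vram_gb: float, quant: str) -> bool:
    """True if the quantized model plus headroom fits in VRAM."""
    return QUANTS[quant] + HEADROOM_GB <= vram_gb

GPUS = {"RX 5700 XT": 8, "RX 6800 XT": 16, "RX 7900 XTX": 24, "RX 9070": 16}

for gpu, vram in GPUS.items():
    support = {q: fits(vram, q) for q in QUANTS}
    print(f"{gpu:12} {support}")
```

With these numbers the output reproduces the Yes/No pattern in the table: only the 24 GB RX 7900 XTX clears the HFQ6 bar.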

## Usage

```bash
# Install hipfire
curl -L https://raw.githubusercontent.com/Kaden-Schutt/hipfire/master/scripts/install.sh | bash

# Pull and run
hipfire pull qwen3.5:27b
hipfire run qwen3.5:27b "Hello"
```

## Quantization Formats

- **HFQ4**: 4-bit weights, 256-weight groups (~0.53 bytes/weight). Best speed.
- **HFQ6**: 6-bit weights, 256-weight groups (~0.78 bytes/weight). Best quality; ~15% slower.

Both include embedded tokenizer and model config.
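The bytes-per-weight figures are consistent with packing n-bit weights plus 8 bytes of per-group metadata amortized over each 256-weight group. The "two f32 values (scale + minimum) per group" layout below is an assumption used to reproduce the arithmetic, not a documented description of the HFQ format:

```python
# Bytes per weight for an n-bit group-quantization scheme:
# packed weight bits plus per-group metadata, amortized over the group.
GROUP_SIZE = 256
GROUP_META_BYTES = 8  # assumed: two f32 values (scale + min) per group

def bytes_per_weight(bits: int) -> float:
    """Amortized storage cost of one quantized weight, in bytes."""
    return bits / 8 + GROUP_META_BYTES / GROUP_SIZE

print(f"HFQ4: {bytes_per_weight(4):.2f} B/w")  # 0.53
print(f"HFQ6: {bytes_per_weight(6):.2f} B/w")  # 0.78
```

At 0.53 B/w, 27B parameters come out around 14.3 GB, matching the HFQ4 file size above (the HFQ6 file is slightly larger than this estimate, plausibly due to tensors kept at higher precision).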

## About hipfire

A Rust + HIP inference engine for AMD consumer GPUs (RDNA1–RDNA4), with no Python in the hot path. 9x faster than llama.cpp with ROCm on the same hardware.

## License

Model weights are subject to the original Qwen license. The hipfire engine is MIT-licensed.
