Qwen3-ForcedAligner-0.6B — MLX bf16

Full-precision conversion (bf16 → float16) of Qwen/Qwen3-ForcedAligner-0.6B for Apple Silicon inference via MLX.

No quantization applied — maximum accuracy for word-level timestamp prediction.

Model Details

| Detail | Value |
| --- | --- |
| Audio encoder | 24 layers, 1024 dim, 16 heads, float16 |
| Text decoder | 28 layers, 1024 hidden, 16 query / 8 KV heads, float16 |
| Classify head | Linear(1024, 5000), float16 |
| Timestamp resolution | 80 ms per class |
| Total size | ~1.8 GB |
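The 80 ms resolution means each of the classify head's 5000 output classes corresponds to one fixed time step. A minimal sketch of that mapping, assuming class `k` encodes the offset `k × 80 ms` (the function name is illustrative, not part of the model's API):

```python
# Sketch: mapping classify-head indices to timestamps, assuming each of the
# 5000 classes represents one 80 ms step (class k -> k * 0.080 s).
# Illustrative only; not the model's actual API.

FRAME_SECONDS = 0.080   # 80 ms per class, from the model details table
NUM_CLASSES = 5000      # Linear(1024, 5000) classify head

def class_to_seconds(class_index: int) -> float:
    """Convert a predicted timestamp class to an offset in seconds."""
    if not 0 <= class_index < NUM_CLASSES:
        raise ValueError(f"class index must be in [0, {NUM_CLASSES})")
    return class_index * FRAME_SECONDS

# Under this assumption, the head can address offsets up to:
max_span = (NUM_CLASSES - 1) * FRAME_SECONDS  # just under 400 s (~6.7 min)
```

Under this assumption, a single pass can place word boundaries anywhere in roughly the first 6.7 minutes of audio.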

Usage

```swift
let aligner = try await Qwen3ForcedAligner.fromPretrained(
    modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-bf16"
)
let aligned = aligner.align(
    audio: samples, text: "Hello world", sampleRate: 24000
)
```

Variants

| Variant | Size | Model ID |
| --- | --- | --- |
| 4-bit | ~979 MB | aufklarer/Qwen3-ForcedAligner-0.6B-4bit |
| 8-bit | ~1.4 GB | aufklarer/Qwen3-ForcedAligner-0.6B-8bit |
| bf16 | ~1.8 GB | aufklarer/Qwen3-ForcedAligner-0.6B-bf16 |
