halley-ai/Qwen3-Next-80B-A3B-Instruct-MLX-5bit-gs32
Text Generation • 80B • Updated • 21 • 1
Text Generation & Chat Assistants; Model Compression & Quantization (Q4/Q6/Q8, gs32); Inference & Serving (on-prem, low-latency); RAG / Retrieval; Agents & Tool Use; Distillation / LoRA / Fine-tuning