Qwopus3.5-9B-Coder-MLX-8bit

8-bit MLX conversion of Jackrong/Qwopus3.5-9B-Coder for Apple Silicon.

Original upstream weights, converted to MLX and quantized to 8-bit. No additional fine-tuning or architecture changes.
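As a rough guide to how much unified memory the weights need, the sketch below estimates the on-disk size of an 8-bit grouped quantization. It assumes roughly 9 billion parameters (taken from the model name), a group size of 64, and a 16-bit scale plus a 16-bit bias stored per group. These conversion settings are assumptions, not something this card states, and `quantized_weight_gib` is a hypothetical helper name.

```python
def quantized_weight_gib(
    n_params: float,
    bits: int = 8,
    group_size: int = 64,              # ASSUMPTION: not stated on this card
    overhead_bits_per_group: int = 32, # ASSUMPTION: 16-bit scale + 16-bit bias
) -> float:
    """Rough size of the quantized weights in GiB.

    Treats every parameter as quantized, which slightly understates the
    real size if some tensors are kept in higher precision.
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return n_params * bits_per_weight / 8 / 2**30


# About 9 billion parameters at an effective 8.5 bits per weight.
print(f"{quantized_weight_gib(9e9):.1f} GiB")
```

Under these assumptions the weights alone come to a little under 9 GiB; the KV cache and activations need additional memory on top of that.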

Use with mlx-lm

pip install -U mlx-lm

from mlx_lm import load, generate

# Download (on first use) and load the quantized weights and tokenizer.
model, tokenizer = load("snsnc/Qwopus3.5-9B-Coder-MLX-8bit")

prompt = "Write a short Python function that returns the factorial of a number."

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=False,
    )

# verbose=True prints tokens as they are generated, plus timing statistics.
response = generate(model, tokenizer, prompt=prompt, verbose=True)
print(response)

Use with mlx-serve

pip install -U mlx-serve

mlx-serve \
  --model hf://snsnc/Qwopus3.5-9B-Coder-MLX-8bit \
  --serve \
  --host 0.0.0.0 \
  --port 11234
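Once the server is running, a client can talk to it over HTTP. The sketch below assumes the server exposes an OpenAI-style `/v1/chat/completions` endpoint and accepts the repository ID as the model name; neither is confirmed by this card, so check the mlx-serve documentation for the actual route and payload. `build_chat_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    host: str = "127.0.0.1",
    port: int = 11234,
    model: str = "snsnc/Qwopus3.5-9B-Coder-MLX-8bit",
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    # ASSUMPTION: OpenAI-compatible route and payload shape.
    url = f"http://{host}:{port}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_chat_request("Write a short Python factorial function.")
    print(req.full_url)  # call send(req) once the server is up
```

Because the server is started with `--host 0.0.0.0`, it listens on all network interfaces; a client on the same machine can connect through `127.0.0.1`, as the default above does.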

Upstream

Jackrong/Qwopus3.5-9B-Coder
