Trace-Inverter-4B-MLX-8bit

This is an 8-bit MLX conversion of Jackrong/Trace-Inverter-4B, a Qwen3-based trace inversion model.

The model is intended to reconstruct a detailed synthetic reasoning trace from:

Problem + Model final answer + Reasoning Bubbles

The original weights are BF16. This MLX version was converted with mlx-lm using 8-bit affine quantization with group size 64.
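
As a rough illustration of what group-wise affine quantization means, the sketch below quantizes a flat list of floats in groups of 64, storing one scale and one bias per group. It is a simplified pure-Python model, not the mlx-lm implementation: the function names are hypothetical, and the rounding and packing details in MLX may differ.

```python
def quantize_group(weights, bits=8):
    """Affine-quantize one group of floats to unsigned integer codes."""
    levels = (1 << bits) - 1              # 255 distinct steps for 8 bits
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / levels
    if scale == 0.0:                      # constant group: any non-zero scale works
        scale = 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min            # w_min acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats: w ~= code * scale + bias."""
    return [c * scale + bias for c in codes]


def quantize(weights, group_size=64, bits=8):
    """Split a flat weight list into groups and quantize each one."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]
```

Each group keeps its own scale and bias, so the reconstruction error for any weight is bounded by about half of that group's scale.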

Use With MLX

Install mlx-lm:

pip install mlx-lm

Then load the model and generate in Python:

from mlx_lm import load, generate

model, tokenizer = load("omercelik/Trace-Inverter-4B-MLX-8bit")

messages = [
    {
        "role": "system",
        "content": (
            "You are a trace inversion model. Given a problem, a final answer, "
            "and several compressed reasoning bubbles, reconstruct a detailed "
            "reasoning trace that could plausibly lead to the final answer."
        ),
    },
    {
        "role": "user",
        "content": """Problem:
If a pizza needs 10 cups of water, 16 cups of flour, and salt equal to half the flour amount, what is the combined total?

Model final answer:
34 cups.

Reasoning Bubbles:
I need to calculate the salt first because it is defined as half of the flour amount. Then I should add water, flour, and salt together to get the combined total.

Reconstruct the full reasoning trace.""",
    },
]

prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=False,
)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,
    verbose=True,
)
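
When generating many traces, it can help to build the user message programmatically. The helper below is a hypothetical convenience function (not part of mlx-lm or the source model) that reproduces the Problem / Model final answer / Reasoning Bubbles layout used in the example above.

```python
def build_user_prompt(problem, final_answer, bubbles):
    """Format the three inputs in the layout shown in the example prompt."""
    return (
        f"Problem:\n{problem.strip()}\n\n"
        f"Model final answer:\n{final_answer.strip()}\n\n"
        f"Reasoning Bubbles:\n{bubbles.strip()}\n\n"
        "Reconstruct the full reasoning trace."
    )
```

The returned string can be used as the "content" of the user message in the messages list.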

Notes

The source checkpoint stores PEFT-style LoRA-wrapped tensors inside the safetensors files. For MLX compatibility, the LoRA tensors were merged into plain model weights before conversion. The inferred LoRA scale used for the merge was 1.0.
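
For readers unfamiliar with LoRA merging, the sketch below shows the usual arithmetic: the low-rank update B·A is multiplied by a scale and added to the base weight. It is a pure-Python illustration with hypothetical function names and matrices stored as lists of rows, not the script used for this conversion. In PEFT the scale is normally lora_alpha / r; here it defaults to the 1.0 reported above.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base, lora_a, lora_b, scale=1.0):
    """Return base + scale * (lora_b @ lora_a).

    base:   (out_features, in_features)
    lora_a: (rank, in_features)
    lora_b: (out_features, rank)
    """
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]
```

After the merge, the adapter tensors are no longer needed and the result can be stored as an ordinary weight matrix.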

The source model card notes that outputs may occasionally include stray tool tags such as <tool_call>. Post-processing is recommended when generating datasets.
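
A minimal post-processing sketch is shown below. It assumes the only stray tags are opening or closing <tool_call> markers and simply deletes them; other tag names, or tags with payloads that should also be removed, would need their own patterns.

```python
import re

# Assumption: only <tool_call> and </tool_call> markers need to be removed.
STRAY_TAG = re.compile(r"</?tool_call>")


def strip_stray_tags(text):
    """Delete stray tool tags and tidy the whitespace they leave behind."""
    cleaned = STRAY_TAG.sub("", text)
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned)  # collapse doubled spaces
    return cleaned.strip()
```

For example, strip_stray_tags(response) could be applied to each generated trace before it is written to a dataset.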

Generated traces are synthetic reasoning traces. They should not be treated as recovered hidden chain-of-thought from any closed model.
