---
license: apache-2.0
language:
- en
library_name: mlx
pipeline_tag: text-generation
base_model: Qwen/Qwen3.5-9B
base_model_relation: adapter
tags:
- mlx
- qwen
- qwen3.5
- lora
- adapter
- sft
- unity
- documentation
- downftuner
---

# Qwen3.5-9b-UnityEngine

A LoRA adapter for [`Qwen/Qwen3.5-9B`](https://huggingface.co/Qwen/Qwen3.5-9B)
fine-tuned with SFT on **Unity Engine** documentation. The base model is
unchanged — this repo contains only the adapter weights, so you load the
base separately and apply the adapter at inference time.

## What this model does

This adapter specialises Qwen/Qwen3.5-9B for Unity Engine questions. It is
trained to quote API identifiers, configuration keys, file paths, and
version-specific details verbatim from the official documentation. It is not
a general chat model; for free-form conversation, the unmodified base model
is the better choice.

## How it was built

Trained using **DownFTuner**, a custom local fine-tuning platform built
by [jokernifty](https://huggingface.co/jokernifty).

DownFTuner is currently jokernifty's private internal tool. If you'd like
access or want to discuss the pipeline, open a discussion on this model.

## Usage

### With MLX (Apple Silicon, recommended)

```python
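from huggingface_hub import snapshot_download
from mlx_lm import load, generate

# Download the adapter first; mlx_lm expects adapter_path to be a local directory.
adapter_dir = snapshot_download("jokernifty/Qwen3.5-9b-UnityEngine")

model, tokenizer = load(
    "mlx-community/Qwen3.5-9B-MLX-4bit",
    adapter_path=adapter_dir,
)

# Wrap the question in the model's chat template before generating.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "<your Unity Engine question here>"}],
    add_generation_prompt=True, tokenize=False,
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=400))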
```

### With transformers + PEFT (any platform)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B", dtype=torch.bfloat16, device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")
model = PeftModel.from_pretrained(base, "jokernifty/Qwen3.5-9b-UnityEngine")

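# return_dict=True yields input_ids and attention_mask regardless of library defaults.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "<your Unity Engine question here>"}],
    add_generation_prompt=True, return_dict=True, return_tensors="pt",
).to(model.device)
prompt_len = inputs["input_ids"].shape[1]
out = model.generate(**inputs, max_new_tokens=400)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))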
```

### As a fused checkpoint

If you'd rather have a single self-contained model:

```bash
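# Download the adapter first; mlx_lm expects --adapter-path to be a local directory.
ADAPTER_DIR=$(python -c "from huggingface_hub import snapshot_download; print(snapshot_download('jokernifty/Qwen3.5-9b-UnityEngine'))")

python -m mlx_lm.fuse \
  --model mlx-community/Qwen3.5-9B-MLX-4bit \
  --adapter-path "$ADAPTER_DIR" \
  --save-path ./Qwen3.5-9b-UnityEngine-fused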
```
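
Once fused, the checkpoint loads like any other local MLX model, with no
`adapter_path`. The sketch below assumes the `--save-path` from the command
above; `build_messages` and `ask_fused` are hypothetical helper names used
only for illustration, not part of `mlx_lm`.

```python
from pathlib import Path

# Assumed location: the --save-path used in the fuse command above.
FUSED_DIR = Path("./Qwen3.5-9b-UnityEngine-fused")


def build_messages(question: str) -> list[dict]:
    """Wrap a single question as a one-turn chat."""
    return [{"role": "user", "content": question}]


def ask_fused(question: str, model_dir: Path = FUSED_DIR, max_tokens: int = 400) -> str:
    """Answer one question with the fused checkpoint (hypothetical helper)."""
    # Imported here so build_messages stays usable on machines without MLX.
    from mlx_lm import load, generate

    # No adapter_path: the LoRA weights are already merged into the checkpoint.
    model, tokenizer = load(str(model_dir))
    prompt = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, tokenize=False,
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Call `ask_fused("<your Unity Engine question here>")` on an Apple Silicon
machine after the fuse step has finished.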

## Limitations

- Knowledge is bounded by the documentation snapshot used for training.
  API additions or removals made after that snapshot are not reflected.
- Like the base model, this adapter can confabulate confidently. Always
  verify code examples against the current upstream docs before shipping.
- This is a LoRA adapter trained only on Unity Engine documentation. For
  tasks outside Unity Engine, expect no improvement (and possibly slight
  regression) versus the base.

## License

Apache 2.0, inherited from the Qwen/Qwen3.5-9B base. Built by
[jokernifty](https://huggingface.co/jokernifty) using DownFTuner. Please
credit the base model and this adapter when you use it.