---
license: apache-2.0
language:
- en
library_name: mlx
pipeline_tag: text-generation
base_model: Qwen/Qwen3.5-9B
base_model_relation: adapter
tags:
- mlx
- qwen
- qwen3.5
- lora
- adapter
- sft
- unity
- documentation
- downftuner
---
# Qwen3.5-9b-UnityEngine
A LoRA adapter for [`Qwen/Qwen3.5-9B`](https://huggingface.co/Qwen/Qwen3.5-9B)
fine-tuned with SFT on **Unity Engine** documentation. The base model is
unchanged — this repo contains only the adapter weights, so you load the
base separately and apply the adapter at inference time.
## What this model does
This adapter specialises Qwen/Qwen3.5-9B for Unity Engine-specific questions,
quoting API identifiers, configuration keys, file paths, and version-specific
details verbatim from the official documentation. It is not a general chat
model; for free-form conversation, the unmodified base model is the better choice.
## How it was built
Trained using **DownFTuner**, a custom local fine-tuning platform built
by [jokernifty](https://huggingface.co/jokernifty).
DownFTuner is currently a private, internal tool. If you'd like access or
want to discuss the pipeline, open a discussion on this model.
## Usage
### With MLX (Apple Silicon, recommended)
```python
from mlx_lm import load, generate

model, tokenizer = load(
    "mlx-community/Qwen3.5-9B-MLX-4bit",
    adapter_path="jokernifty/Qwen3.5-9b-UnityEngine",
)
print(generate(
    model, tokenizer,
    prompt="<your Unity Engine question here>",
    max_tokens=400,
))
```
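Two details of the MLX snippet may need adjusting, depending on your `mlx_lm` version. Some versions appear to treat `adapter_path` as a local directory rather than a Hub repo id, and Qwen chat models generally expect the question wrapped in the chat template rather than passed as a raw string. The sketch below covers both cases. `resolve_adapter`, `build_messages`, and `ask` are hypothetical helper names, not part of `mlx_lm`, and the local-path behaviour is an assumption to check against your installed version.

```python
import os


def resolve_adapter(path_or_repo: str) -> str:
    """Return a local adapter directory, downloading from the Hub if needed."""
    if os.path.isdir(path_or_repo):
        return path_or_repo
    # Lazy import: only needed when the adapter is not already on disk.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=path_or_repo)


def build_messages(question: str) -> list:
    """Wrap a single user question in the chat-message structure."""
    return [{"role": "user", "content": question}]


def ask(question: str, max_tokens: int = 400) -> str:
    """Load base + adapter, format the question as a chat prompt, generate."""
    from mlx_lm import load, generate  # requires Apple Silicon + mlx_lm

    model, tokenizer = load(
        "mlx-community/Qwen3.5-9B-MLX-4bit",
        adapter_path=resolve_adapter("jokernifty/Qwen3.5-9b-UnityEngine"),
    )
    # tokenize=False returns the formatted prompt as a plain string.
    prompt = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, tokenize=False,
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Usage would then be `print(ask("<your Unity Engine question here>"))`. Loading inside `ask` keeps the sketch short; for repeated questions you would load once and reuse the model and tokenizer.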
### With transformers + PEFT (any platform)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the unchanged base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B", dtype=torch.bfloat16, device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")
model = PeftModel.from_pretrained(base, "jokernifty/Qwen3.5-9b-UnityEngine")

# return_dict=True gives a mapping with input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "<your Unity Engine question here>"}],
    add_generation_prompt=True, return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=400)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
```
### As a fused checkpoint
If you'd rather have a single self-contained model:
```bash
python -m mlx_lm.fuse \
    --model mlx-community/Qwen3.5-9B-MLX-4bit \
    --adapter-path jokernifty/Qwen3.5-9b-UnityEngine \
    --save-path ./Qwen3.5-9b-UnityEngine-fused
```
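Conceptually, fusing folds the adapter's low-rank update into the base weights, so the fused model gives the same outputs as the base plus adapter. The toy sketch below illustrates this with the standard LoRA formulation, `W' = W + scale * B A`. The matrices, the `scale` value, and the helper names are made up for illustration; they are not the real adapter's shapes or `mlx_lm`'s implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add_scaled(w, delta, scale):
    """Return w + scale * delta, element-wise."""
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy sizes: 3 inputs, 2 outputs, rank-1 adapter (all values illustrative).
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]      # frozen base weight, shape (out=2, in=3)
A = [[0.5, -1.0, 0.0]]     # LoRA down-projection, shape (r=1, in=3)
B = [[2.0],
     [1.0]]                # LoRA up-projection, shape (out=2, r=1)
scale = 0.5                # stands in for alpha / r
x = [[1.0], [2.0], [3.0]]  # one input as a column vector

# Adapter applied at inference time: y = W x + scale * B (A x)
y_adapter = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Fused checkpoint: fold the update into the weight once, W' = W + scale * B A
W_fused = add_scaled(W, matmul(B, A), scale)
y_fused = matmul(W_fused, x)

print(y_adapter)  # [[5.5], [1.25]]
print(y_fused)    # [[5.5], [1.25]]
```

Both paths give the same result; the fused form only saves the extra adapter matmuls and the need to ship two sets of weights.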
## Limitations
- Knowledge is bounded by the documentation snapshot used for training.
  API additions or removals made after that snapshot are not reflected.
- Like the base model, this adapter can confabulate confidently. Always
  verify code examples against the current upstream docs before shipping.
- The adapter is LoRA only; for tasks outside Unity Engine, expect no
  improvement (and possibly a slight regression) versus the base.
## License
Apache 2.0, inherited from the Qwen/Qwen3.5-9B base. Built by
[jokernifty](https://huggingface.co/jokernifty) using DownFTuner. Please
credit the base model and this adapter when you use it.