	---
	license: apache-2.0
	language:
	- en
	library_name: mlx
	pipeline_tag: text-generation
	base_model: Qwen/Qwen3.5-9B
	base_model_relation: adapter
	tags:
	- mlx
	- qwen
	- qwen3.5
	- lora
	- adapter
	- sft
	- unity
	- documentation
	- downftuner
	---

	# Qwen3.5-9b-UnityEngine

	A LoRA adapter for [`Qwen/Qwen3.5-9B`](https://huggingface.co/Qwen/Qwen3.5-9B)
	fine-tuned with SFT on Unity Engine documentation. The base model is
	unchanged — this repo contains only the adapter weights, so you load the
	base separately and apply the adapter at inference time.

	## What this model does

	This adapter specialises Qwen/Qwen3.5-9B for Unity Engine questions. It is
	trained to quote API identifiers, configuration keys, file paths, and
	version-specific details verbatim from the official documentation. It is
	not a general chat model; for free-form conversation, the base model
	without the adapter handles that better.

	## How it was built

	The adapter was trained with DownFTuner, a custom local fine-tuning
	platform built by [jokernifty](https://huggingface.co/jokernifty).

	DownFTuner is currently a private, internal tool. If you'd like access or
	want to discuss the pipeline, open a discussion on this model.

	## Usage

	### With MLX (Apple Silicon, recommended)

	```python
	from mlx_lm import load, generate

	model, tokenizer = load(
	    "mlx-community/Qwen3.5-9B-MLX-4bit",
	    adapter_path="jokernifty/Qwen3.5-9b-UnityEngine",
	)

	print(generate(
	    model, tokenizer,
	    prompt="<your Unity Engine question here>",
	    max_tokens=400,
	))
	```
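
Chat-tuned Qwen models are normally prompted through their chat template, and some `mlx_lm` releases may expect `adapter_path` to be a local directory rather than a Hub repo id. The sketch below covers both cases under those assumptions. `build_chat_prompt` and `ask` are hypothetical helper names, not part of this repo, and `snapshot_download` from `huggingface_hub` is used to fetch the adapter files first. The heavy imports sit inside `ask` so the helpers can be defined on machines without MLX installed.

```python
def build_chat_prompt(tokenizer, question):
    """Wrap a single user question in the tokenizer's chat template."""
    messages = [{"role": "user", "content": question}]
    # tokenize=False returns the formatted prompt as a string.
    return tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=False
    )


def ask(question, max_tokens=400):
    """Hypothetical helper: download the adapter, load base + adapter, answer."""
    # Imported lazily: both packages are only needed when a question is asked.
    from huggingface_hub import snapshot_download
    from mlx_lm import load, generate

    # Assumption: adapter_path should point at a local directory.
    adapter_dir = snapshot_download("jokernifty/Qwen3.5-9b-UnityEngine")
    model, tokenizer = load(
        "mlx-community/Qwen3.5-9B-MLX-4bit",
        adapter_path=adapter_dir,
    )
    prompt = build_chat_prompt(tokenizer, question)
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Calling `ask("<your Unity Engine question here>")` downloads the base model and the adapter on first use, so expect the first call to take a while.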

	### With transformers + PEFT (any platform)

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	from peft import PeftModel
	import torch

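	# Load the base model, then attach the LoRA adapter on top of it.
	base = AutoModelForCausalLM.from_pretrained(
	    "Qwen/Qwen3.5-9B", dtype=torch.bfloat16, device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")
	model = PeftModel.from_pretrained(base, "jokernifty/Qwen3.5-9b-UnityEngine")

	# return_dict=True yields input_ids plus attention_mask, so generate()
	# does not have to infer the mask.
	inputs = tokenizer.apply_chat_template(
	    [{"role": "user", "content": "<your Unity Engine question here>"}],
	    add_generation_prompt=True, return_tensors="pt", return_dict=True,
	).to(model.device)
	out = model.generate(**inputs, max_new_tokens=400)
	# Decode only the newly generated tokens, skipping the prompt.
	prompt_len = inputs["input_ids"].shape[1]
	print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))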
	```

	### As a fused checkpoint

	If you'd rather have a single self-contained model:

	```bash
	python -m mlx_lm.fuse \
	    --model mlx-community/Qwen3.5-9B-MLX-4bit \
	    --adapter-path jokernifty/Qwen3.5-9b-UnityEngine \
	    --save-path ./Qwen3.5-9b-UnityEngine-fused
	```
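
For intuition about what fusing does: in the standard LoRA formulation an adapted layer computes `W x + s * B (A x)`, while a fused checkpoint stores `W' = W + s * B A` and computes `W' x`. The toy sketch below checks that the two agree. It uses pure Python with made-up 2x3 weights, rank 1, and scale 2.0; these are illustrative values, not this adapter's real shapes or configuration. With a quantized base the match after fusing may be only approximate.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_scaled(m, delta, scale):
    """Return m + scale * delta, element-wise."""
    return [[mi + scale * di for mi, di in zip(mr, dr)] for mr, dr in zip(m, delta)]


# Toy values (assumptions for illustration only).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # frozen base weight, 2x3
B = [[0.5], [-1.0]]                      # LoRA "up" matrix, 2x1
A = [[2.0, 0.0, 1.0]]                    # LoRA "down" matrix, 1x3
scale = 2.0                              # stands in for alpha / rank
x = [[1.0], [2.0], [3.0]]                # input column vector

# Adapter applied at inference time: base output plus the low-rank correction.
y_runtime = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Fused checkpoint: fold the correction into the weight once, then use it alone.
W_fused = add_scaled(W, matmul(B, A), scale)
y_fused = matmul(W_fused, x)

print(y_runtime, y_fused)
```

The fused form removes the extra matrix products at inference time, at the cost of no longer being able to detach the adapter from the base.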

	## Limitations

	- Knowledge is bounded by the documentation snapshot used for training.
	  API additions or removals made after that snapshot are not reflected.
	- Like the base model, this adapter can confabulate confidently. Always
	  verify code examples against the current upstream docs before shipping.
	- The adapter is LoRA only; for tasks outside Unity Engine, expect no
	  improvement (and possibly a slight regression) versus the base.
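
One lightweight way to act on the verification advice above is to pull the API-looking names out of a generated answer and check each one by hand against the current docs. The sketch below is a hypothetical helper, not part of this repo. It assumes identifiers appear either in backticks or as dotted names starting with a capital letter, which will miss some names and catch some false positives.

```python
import re


def identifiers_to_verify(answer: str) -> list[str]:
    """Collect backticked spans and dotted names from a model answer, in order."""
    backticked = re.findall(r"`([^`\n]+)`", answer)
    dotted = re.findall(r"\b[A-Z][A-Za-z0-9]*(?:\.[A-Za-z_][A-Za-z0-9_]*)+", answer)
    seen = []
    for name in backticked + dotted:
        if name not in seen:  # keep first occurrence only
            seen.append(name)
    return seen


sample = (
    "Use `Rigidbody.AddForce` in FixedUpdate; see Physics.Raycast "
    "and `Time.fixedDeltaTime`."
)
checklist = identifiers_to_verify(sample)
for name in checklist:
    print("verify against current docs:", name)
```

The helper only builds a checklist; it does not confirm that any name exists in a given Unity version.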

	## License

	Apache 2.0, inherited from the Qwen/Qwen3.5-9B base. Built by
	[jokernifty](https://huggingface.co/jokernifty) using DownFTuner. Please
	credit the base model and this adapter when you use it.