holodeck-parser-qwen35-ft

Fine-tuned LoRA adapter for the Holodeck VR voice command parser.

Built on top of Qwen/Qwen3.5-4B, this model takes a voice transcript and a structured scene context and outputs a single JSON command for a 3D virtual environment engine.

⚠️ Ollama compatibility note: Qwen3.5-4B uses a hybrid SSM+Transformer architecture with a Multi-Token Prediction (MTP) head. Ollama ≤ 0.24.0 cannot load llama.cpp-converted GGUFs of this model due to a GGUF parsing limitation for hybrid architectures. The trained LoRA weights in this repo are fully valid. For local inference, see the GGUF export note below.

Usage via PEFT

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-4B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B")
model = PeftModel.from_pretrained(base, "trans-realities-lab/holodeck-parser-qwen35-ft")

Input format

Send messages in this structure (system prompt is handled by the tokenizer's chat template):

Transcript: "move the red chair a bit to the left"
User context: {"id": "user_1", "position": {"x": 0, "y": 1.7, "z": 3}, "look_direction": {"x": 0, "y": 0, "z": -1}}
Voice context: {"lockedObjects": [], "fovObjects": [{"nodeId": "server_abc123", "meshName": "Red Chair", "type": "chair", "position": {"x": 2, "y": 0, "z": 1}, "rotation": {"x": 0, "y": 0, "z": 0, "w": 1}, "scale": {"x": 1, "y": 1, "z": 1}}], "raycastHit": null}

Expected output:

{"command": "edit", "id": "server_abc123", "changes": {"position_relative": {"direction": "left", "units": 1}}}

Output schema

Command	Description
`spawn`	Create a new object in the scene
`edit`	Move, rotate, scale, rename, or toggle visibility
`delete`	Remove an object
`none`	No actionable command detected

Edit supports both absolute (position) and relative (position_relative) moves. Relative directions (left, right, forward, back) are relative to the user's look direction; up/down are world-space.

GGUF export for local inference

To convert for use with llama.cpp or Ollama (requires stripping the MTP layer):

# Merge LoRA into base weights first (run inside the finetuning venv)
python scripts/export_gguf.py --model qwen35 --no-gguf  # saves merged safetensors

# Convert with MTP layer stripped so Ollama can load it
python ~/.unsloth/llama.cpp/convert_hf_to_gguf.py models/qwen35-ft-gguf \
    --outtype f16 --outfile models/qwen35-ft-nomtp-f16.gguf --no-mtp

# Quantize
llama-quantize models/qwen35-ft-nomtp-f16.gguf models/qwen35-ft-q4km.gguf Q4_K_M

# Register with Ollama
bash scripts/register_ollama.sh qwen35

Training details

Parameter	Value
Base model	Qwen/Qwen3.5-4B
Method	bf16 LoRA (QLoRA 4-bit not recommended for Qwen3.5 SSM layers)
LoRA rank	16
LoRA alpha	16
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Dataset	270 synthetic examples (Holodeck parser training set)
Epochs	3
Learning rate	2e-4
Hardware	NVIDIA RTX 4080 SUPER (16 GB)
Architecture note	Hybrid SSM+Transformer with MTP head — requires transformers≥5.0

Downloads last month: -

Model tree for trans-realities-lab/holodeck-parser-qwen35-ft

Base model

Qwen/Qwen3.5-4B-Base

Finetuned

Qwen/Qwen3.5-4B

Adapter

(266)

this model

Collection including trans-realities-lab/holodeck-parser-qwen35-ft

Holodeck Parser

Collection

Fine-tuned models for natural language → Unreal Engine command parsing • 3 items • Updated May 18