holodeck-parser-qwen35-ft

Fine-tuned LoRA adapter for the Holodeck VR voice command parser.

Built on top of Qwen/Qwen3.5-4B, this model takes a voice transcript and a structured scene context and outputs a single JSON command for a 3D virtual environment engine.

⚠️ Ollama compatibility note: Qwen3.5-4B uses a hybrid SSM+Transformer architecture with a Multi-Token Prediction (MTP) head. Ollama ≤ 0.24.0 cannot load llama.cpp-converted GGUFs of this model due to a GGUF parsing limitation for hybrid architectures. The trained LoRA weights in this repo are fully valid. For local inference, see the GGUF export note below.

Usage via PEFT

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-4B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B")
model = PeftModel.from_pretrained(base, "trans-realities-lab/holodeck-parser-qwen35-ft")

Input format

Send messages in this structure (system prompt is handled by the tokenizer's chat template):

Transcript: "move the red chair a bit to the left"
User context: {"id": "user_1", "position": {"x": 0, "y": 1.7, "z": 3}, "look_direction": {"x": 0, "y": 0, "z": -1}}
Voice context: {"lockedObjects": [], "fovObjects": [{"nodeId": "server_abc123", "meshName": "Red Chair", "type": "chair", "position": {"x": 2, "y": 0, "z": 1}, "rotation": {"x": 0, "y": 0, "z": 0, "w": 1}, "scale": {"x": 1, "y": 1, "z": 1}}], "raycastHit": null}

Expected output:

{"command": "edit", "id": "server_abc123", "changes": {"position_relative": {"direction": "left", "units": 1}}}

Output schema

Command Description
spawn Create a new object in the scene
edit Move, rotate, scale, rename, or toggle visibility
delete Remove an object
none No actionable command detected

Edit supports both absolute (position) and relative (position_relative) moves. Relative directions (left, right, forward, back) are relative to the user's look direction; up/down are world-space.

GGUF export for local inference

To convert for use with llama.cpp or Ollama (requires stripping the MTP layer):

# Merge LoRA into base weights first (run inside the finetuning venv)
python scripts/export_gguf.py --model qwen35 --no-gguf  # saves merged safetensors

# Convert with MTP layer stripped so Ollama can load it
python ~/.unsloth/llama.cpp/convert_hf_to_gguf.py models/qwen35-ft-gguf \
    --outtype f16 --outfile models/qwen35-ft-nomtp-f16.gguf --no-mtp

# Quantize
llama-quantize models/qwen35-ft-nomtp-f16.gguf models/qwen35-ft-q4km.gguf Q4_K_M

# Register with Ollama
bash scripts/register_ollama.sh qwen35

Training details

Parameter Value
Base model Qwen/Qwen3.5-4B
Method bf16 LoRA (QLoRA 4-bit not recommended for Qwen3.5 SSM layers)
LoRA rank 16
LoRA alpha 16
Target modules q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Dataset 270 synthetic examples (Holodeck parser training set)
Epochs 3
Learning rate 2e-4
Hardware NVIDIA RTX 4080 SUPER (16 GB)
Architecture note Hybrid SSM+Transformer with MTP head — requires transformers≥5.0
Downloads last month
19
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for trans-realities-lab/holodeck-parser-qwen35-ft

Finetuned
Qwen/Qwen3.5-4B
Adapter
(195)
this model

Collection including trans-realities-lab/holodeck-parser-qwen35-ft