YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
Qwen3-VL-4B Robotics Subtask Prediction (v0)
Fine-tuned Qwen/Qwen3-VL-4B-Instruct for embodied robotics next-subtask prediction.
Task
Given an observation image from a robot workspace and a high-level task instruction, predict the specific subtask the robot should perform next.
Input: Image + "Task: Put the radish in the yellow plate\nCompleted subtasks: approach the radish\nWhat specific subtask should the robot perform next?"
Output: "grasp the radish"
Data
- Source images: lucanunz/alldata_14tasks (492 episodes, 14 tasks, LeRobot format)
- Annotations: shivakanthsujit/alldata14_annotations (stage06: subtask decomposition, stage07: steering commands)
- ~6,352 training samples (3 frames per subtask range, ~492 annotated episodes)
- Images are 256ร256 RGB from the main camera
Training
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen3-VL-4B-Instruct |
| Method | SFT with LoRA (r=32, alpha=16) |
| Epochs | 3 |
| Effective batch | 16 (2 ร 8 grad accum) |
| Learning rate | 2e-4 (cosine, 5% warmup) |
| Precision | bf16 |
| Optimizations | LIGER kernel, fused AdamW, gradient checkpointing |
| Hardware | A10G (24GB VRAM) or A100 |
Launch Training
pip install trl transformers datasets peft accelerate bitsandbytes torch torchvision \
trackio huggingface_hub av qwen-vl-utils Pillow liger-kernel
# Login to HF Hub
huggingface-cli login
# Run training
python train_vlm_subtask.py
Or via HF Jobs:
from huggingface_hub import HfApi
api = HfApi()
# Submit as a job on A10G hardware
Versions
- v0: Direct subtask prediction (this version) โ no reasoning traces
- v1 (planned):
<think>reasoning</think><answer>subtask</answer>format using stage08 rationales
Architecture Notes
- Uses Qwen3-VL (Oct 2025) which has explicit 3D grounding and spatial reasoning capabilities โ ideal for embodied robotics
- LoRA targets: q/k/v/o projection + gate/up/down MLP layers
- System prompt frames the task as embodied robot assistant
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support