---
tags:
- Vision-and-Language-Navigation
---

# Qwen2.5-VL-3B-R2R-low-level

**Qwen2.5-VL-3B-R2R-low-level** is a Vision-and-Language Navigation (VLN) model fine-tuned from [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) on the [Room-to-Room (R2R)](https://bringmeaspoon.org/) dataset using the Matterport3D (MP3D) simulator. The model is trained with a low-level action space and perceives the environment through egocentric RGB images at a resolution of 320x240.
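As a rough sketch of how such a model is queried at each navigation step, a single egocentric RGB frame and the language instruction are packaged into one multimodal chat turn. The prompt wording, image path, and action vocabulary below are illustrative assumptions, not documented details of this model; only the message schema (a `role` plus a list of typed `content` items) follows the Qwen2.5-VL chat format.

```python
# Sketch of one low-level VLN query: a single egocentric frame plus the
# instruction, formatted as a Qwen2.5-VL-style multimodal chat message.
# Image path, prompt text, and action names are hypothetical placeholders.

def build_vln_message(image_path: str, instruction: str) -> list[dict]:
    """Package one egocentric RGB frame and a navigation instruction
    into a single-turn chat message for the model's processor."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {
                    "type": "text",
                    "text": (
                        f"Navigation instruction: {instruction}\n"
                        "Given the current egocentric view, output the next "
                        "low-level action (e.g. move forward, turn left, "
                        "turn right, stop)."
                    ),
                },
            ],
        }
    ]

messages = build_vln_message(
    "egocentric_rgb_320x240.png",
    "Walk down the hallway and stop at the stairs.",
)
```

At inference time a message like this would be passed to the model's processor (e.g. via `apply_chat_template`) and the generated text parsed back into a discrete action; the model is re-queried with a fresh frame after each executed action.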