---
tags:
- Vision-and-Language-Navigation
---

# Qwen2.5-VL-3B-R2R-low-level

**Qwen2.5-VL-3B-R2R-low-level** is a Vision-and-Language Navigation (VLN) model fine-tuned from [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) on the [Room-to-Room (R2R)](https://bringmeaspoon.org/) dataset using the Matterport3D (MP3D) simulator. The model is trained with a low-level action space and perceives the environment through egocentric RGB images at a resolution of 320x240.
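As a rough sketch of how such a model is queried at each navigation step, a single egocentric RGB frame and the language instruction are packaged into one multimodal chat turn. The prompt wording, image path, and action vocabulary below are illustrative assumptions, not documented details of this model; only the message schema (a `role` plus a list of typed `content` items) follows the Qwen2.5-VL chat format.

```python
# Sketch of one low-level VLN query: a single egocentric frame plus the
# instruction, formatted as a Qwen2.5-VL-style multimodal chat message.
# Image path, prompt text, and action names are hypothetical placeholders.

def build_vln_message(image_path: str, instruction: str) -> list[dict]:
    """Package one egocentric RGB frame and a navigation instruction
    into a single-turn chat message for the model's processor."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {
                    "type": "text",
                    "text": (
                        f"Navigation instruction: {instruction}\n"
                        "Given the current egocentric view, output the next "
                        "low-level action (e.g. move forward, turn left, "
                        "turn right, stop)."
                    ),
                },
            ],
        }
    ]

messages = build_vln_message(
    "egocentric_rgb_320x240.png",
    "Walk down the hallway and stop at the stairs.",
)
```

At inference time a message like this would be passed to the model's processor (e.g. via `apply_chat_template`) and the generated text parsed back into a discrete action; the model is re-queried with a fresh frame after each executed action.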