VisionNav-3B / README.md
metadata
license: apache-2.0
datasets:
  - Bofeee5675/GUI-Net-1M
language:
  - en
base_model:
  - Qwen/Qwen2.5-VL-3B-Instruct
tags:
  - VLM
  - Computer-Use

TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents

This model was trained on the GUI-Net dataset.

See the details on our Project Page.

Model Details

The base model is Qwen/Qwen2.5-VL-3B-Instruct. We fine-tuned the base model with LoRA.
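
For readers unfamiliar with LoRA, the following is a minimal, dependency-free sketch of the general idea: the base weight `W` stays frozen and a low-rank product `B @ A` is learned and added on top, scaled by `alpha / r`. The matrix shapes, the rank `r`, the scaling `alpha`, and the helper names `matmul` and `lora_merge` are all hypothetical and chosen only for illustration; this card does not state the actual LoRA hyperparameters or target modules used for this model.

```python
# Illustrative LoRA update on a tiny weight matrix (pure Python, no dependencies).
# All shapes and values here are hypothetical, not this model's training settings.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_merge(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the weight after merging a LoRA adapter."""
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> (d_out x d_in)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Frozen base weight, shape 2 x 3.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
r, alpha = 1, 2.0

A = [[0.5, 0.0, 1.0]]      # r x d_in, trainable
B_init = [[0.0], [0.0]]    # d_out x r, zero at start so the adapter is a no-op

# Before any training, the merged weight equals the base weight.
untrained = lora_merge(W, A, B_init, alpha, r)

# After (hypothetical) training, B is non-zero and shifts the weight.
B_trained = [[1.0], [-1.0]]
merged = lora_merge(W, A, B_trained, alpha, r)
print(merged)
```

Only `A` and `B` are updated during fine-tuning, so the number of trainable parameters scales with `r` rather than with the full size of `W`.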