Instructions to use wangpan-ustc/AtlasVA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use wangpan-ustc/AtlasVA with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="wangpan-ustc/AtlasVA")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("wangpan-ustc/AtlasVA", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use wangpan-ustc/AtlasVA with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "wangpan-ustc/AtlasVA" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "wangpan-ustc/AtlasVA", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/wangpan-ustc/AtlasVA
- SGLang
How to use wangpan-ustc/AtlasVA with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "wangpan-ustc/AtlasVA" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "wangpan-ustc/AtlasVA", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "wangpan-ustc/AtlasVA" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "wangpan-ustc/AtlasVA", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use wangpan-ustc/AtlasVA with Docker Model Runner:
docker model run hf.co/wangpan-ustc/AtlasVA
| license: mit | |
| library_name: transformers | |
| pipeline_tag: image-text-to-text | |
| base_model: Qwen/Qwen2.5-VL-3B-Instruct | |
| # AtlasVA: Self-Evolving Visual Skill Memory for Teacher-Free VLM Agents | |
| This repository contains the model weights for **AtlasVA**, as presented in the paper [AtlasVA: Self-Evolving Visual Skill Memory for Teacher-Free VLM Agents](https://huggingface.co/papers/2605.17933). | |
| **AtlasVA** is a teacher-free visual skill memory framework designed for Vision-Language Model (VLM) agents. It organizes memory into three complementary layers: spatial heatmaps, visual exemplars, and symbolic text skills. By evolving danger and affinity atlases directly from trajectory statistics, AtlasVA provides dense, coordinate-aware guidance for reinforcement learning, unifying perception, memory, and optimization without external LLM supervision. | |
| - **Project Page:** [https://wangpan-ustc.github.io/AtlasvaWeb/](https://wangpan-ustc.github.io/AtlasvaWeb/) | |
| - **Repository:** [https://github.com/wangpan-ustc/AtlasVA](https://github.com/wangpan-ustc/AtlasVA) | |
| - **Paper:** [https://huggingface.co/papers/2605.17933](https://huggingface.co/papers/2605.17933) | |
| ## Model Details | |
| - **Base Model**: Qwen2.5-VL-3B-Instruct | |
| - **Task**: Multimodal agentic decision making (Sokoban, FrozenLake, 3D navigation, robotic manipulation). | |
| - **Memory Layers**: Spatial heatmaps, visual exemplars, and symbolic text skills. | |
| ## Citation | |
| ```bibtex | |
| @article{wang2026atlasva, | |
| title={AtlasVA: Self-Evolving Visual Skill Memory for Teacher-Free VLM Agents}, | |
| author={Wang, Pan and Hu, Yihao and Liu, Xiujin and Yang, Jingchu and Wang, Hang and Wen, Zhihao}, | |
| journal={arXiv preprint arXiv:2605.17933}, | |
| year={2026} | |
| } | |
| ``` |