zhanghang

hangzhang-nlp

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

MMSkills: Towards Multimodal Skills for General Visual Agents

Paper • 2605.13527 • Published 10 days ago • 117

upvoted a paper 6 months ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 162

liked 13 models 7 months ago

Qwen/Qwen3-VL-2B-Thinking

Image-Text-to-Text • 2B • Updated Oct 20, 2025 • 60.9k • 114

Qwen/Qwen3-VL-2B-Instruct

Image-Text-to-Text • 2B • Updated Oct 23, 2025 • 61.6M • 411

Qwen/Qwen3-VL-4B-Instruct

Image-Text-to-Text • 4B • Updated Oct 15, 2025 • 3.11M • 389

Qwen/Qwen3-VL-4B-Thinking

Image-Text-to-Text • 4B • Updated Oct 15, 2025 • 1.58M • 110

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 7.64M • 912

Qwen/Qwen3-VL-8B-Thinking

Image-Text-to-Text • 9B • Updated Nov 26, 2025 • 626k • 209

Qwen/Qwen3-VL-30B-A3B-Instruct-FP8

Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 317k • 110

Qwen/Qwen3-VL-30B-A3B-Instruct

Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 1.22M • 574

Qwen/Qwen3-VL-30B-A3B-Thinking

Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 22.1k • 198

Qwen/Qwen3-VL-235B-A22B-Instruct-FP8

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 252k • 44

Qwen/Qwen3-VL-235B-A22B-Thinking-FP8

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 5.64k • 28

Qwen/Qwen3-VL-235B-A22B-Instruct

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 1.66M • 389

Qwen/Qwen3-VL-235B-A22B-Thinking

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 8.82k • 396

liked a Space 11 months ago

VideoRefer x VideoLLaMA3

upvoted a paper 12 months ago

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Paper • 2506.07044 • Published Jun 8, 2025 • 114

upvoted 2 papers about 1 year ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14, 2025 • 309

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Paper • 2406.07476 • Published Jun 11, 2024 • 36

upvoted a paper over 1 year ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19, 2025 • 218