wzh
hg2wzh
AI & ML interests
None yet
Recent Activity
liked a dataset 3 days ago
nvidia/Nemotron-Image-Training-v3
liked a model 5 days ago
jinaai/jina-embeddings-v5-omni-nano-retrieval
upvoted a collection 5 days ago
jina-embeddings-v5-omni
Organizations
None yet
Datasets
Embedding
VLMs
- Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
  Paper • 2409.12191 • Published • 79
- Multimodal Latent Language Modeling with Next-Token Diffusion
  Paper • 2412.08635 • Published • 49
- AIDC-AI/Ovis2-2B
  Image-Text-to-Text • Updated • 1.08k • 60
- DAMO-NLP-SG/VideoLLaMA3-2B
  Video-Text-to-Text • 2B • Updated • 5.49k • 21
Eval-bench
Text-to-Image
Datasets
Reasoning
Embedding
CLIP series
VLMs
LLMs