Yifan Zhang

CyanTransformer

BlueSocksFF

AI & ML interests

None yet

Organizations

upvoted a paper 5 months ago

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

Paper • 2601.10305 • Published Jan 15, 2026 • 37

upvoted 4 papers about 1 year ago

Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better

Paper • 2506.09040 • Published Jun 10, 2025 • 34

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

Paper • 2505.18675 • Published May 24, 2025 • 28

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Paper • 2504.17432 • Published Apr 24, 2025 • 41

Decoupled Global-Local Alignment for Improving Compositional Understanding

Paper • 2504.16801 • Published Apr 23, 2025 • 14