From Pixels to Words -- Towards Native One-Vision Models at Scale
Haiwen Diao
Paranioar
AI & ML interests
Vision-and-Language, Parameter-efficient Transfer Learning, Multi-modal Large Language Model
Recent Activity
liked a dataset about 10 hours ago
agibot-world/AgiBotWorld-Beta
liked a dataset about 10 hours ago
agibot-world/AgiBotWorld-Alpha
authored a paper 6 days ago
Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion