arxiv:2601.22674
Hanxun Yu
JonnyYu828
AI & ML interests
Multimodal LLMs, Spatial Intelligence, Embodied AI
Recent Activity
upvoted a paper about 24 hours ago
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
authored a paper about 1 month ago
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
authored a paper about 1 month ago
StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding