Aditya Kumar Singh

rodo

http://rodosingh.github.io/

AI & ML interests

Multimodal Learning

Recent Activity

upvoted a paper about 2 months ago

DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference

Paper • 2602.18846 • Published Feb 21 • 4

liked a model 9 months ago

amd/Instella-3B

Text Generation • Updated Nov 14, 2025 • 363 • 41

liked a dataset 11 months ago

amd/TTT-Bench

Viewer • Updated Jun 23, 2025 • 412 • 209 • 1

commented on 4 papers about 1 year ago

upvoted 5 papers about 1 year ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14, 2025 • 21

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17, 2025 • 20

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17, 2025 • 30

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 172

published a Space about 1 year ago

Vlm

🐠

env for vlm training

upvoted a collection over 1 year ago

Qwen2.5-Coder

Collection

Code-specific model series based on Qwen2.5 • 38 items • Updated Mar 2 • 363
