
Aditya Kumar Singh

rodo · http://rodosingh.github.io/
  • rodosingh23
  • rodosingh

AI & ML interests

Multimodal Learning

Recent Activity

upvoted a paper 2 days ago
DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
liked a model 7 months ago
amd/Instella-3B
liked a dataset 9 months ago
amd/TTT-Bench

Organizations

  • AMD
  • AIG-GenAI

upvoted a paper 2 days ago

DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference

Paper • 2602.18846 • Published 20 days ago • 4
upvoted 5 papers 12 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14, 2025 • 21

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17, 2025 • 19

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17, 2025 • 30

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 170
upvoted a collection over 1 year ago

Qwen2.5-Coder

Collection
Code-specific model series based on Qwen2.5 • 38 items • Updated 11 days ago • 356