Open to Collab

Md Selim Sarowar

selim-sarowar
  • s-elim
  • selimsarowar

AI & ML interests

Vision Language Action Models, World Models, 5D Robot Manipulation, 3D Computer Vision

Recent Activity

authored a paper 1 day ago
GST-VLA: Structured Gaussian Spatial Tokens for 3D Depth-Aware Vision-Language-Action Models
upvoted a paper 1 day ago
GST-VLA: Structured Gaussian Spatial Tokens for 3D Depth-Aware Vision-Language-Action Models
upvoted a paper 2 days ago
Unified Vision-Language-Action Model

Organizations

None yet

upvoted a paper 1 day ago

GST-VLA: Structured Gaussian Spatial Tokens for 3D Depth-Aware Vision-Language-Action Models

Paper • 2603.09079 • Published 2 days ago • 1
upvoted 3 papers 2 days ago

Unified Vision-Language-Action Model

Paper • 2506.19850 • Published Jun 24, 2025 • 28

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14 • 54

VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

Paper • 2602.10098 • Published 30 days ago • 19