Open to Collab

Md Selim Sarowar

selim-sarowar
  • s-elim
  • selimsarowar

AI & ML interests

Vision Language Action Models, World Models, 5D Robot Manipulation, 3D Computer Vision

Recent Activity

upvoted a paper 4 days ago
Robots Need More than VLA and World Models
published a dataset about 1 month ago
selim-sarowar/SO-101
liked a dataset 2 months ago
RajatDandekar/so101_box_to_bowl_v2

Organizations

None yet

upvoted a paper 4 days ago

Robots Need More than VLA and World Models

Paper • 2606.06556 • Published 9 days ago • 26
upvoted 4 papers 3 months ago

GST-VLA: Structured Gaussian Spatial Tokens for 3D Depth-Aware Vision-Language-Action Models

Paper • 2603.09079 • Published Mar 10 • 1

Unified Vision-Language-Action Model

Paper • 2506.19850 • Published Jun 24, 2025 • 28

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14 • 56

VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

Paper • 2602.10098 • Published Feb 10 • 21