SmolVLA-SewerBot-v0 (Phase 0 Stub)

First Vision-Language-Action (VLA) model for confined-space work, such as cleaning sewer tanks in India.

This is NOT a production model. It is a work in progress; more details are coming soon.

Architecture

  • Base: Stub 4-layer MLP (state -> action), NOT SmolVLA
  • Input: 18-dim proprioceptive state (joint_pos + joint_vel + ee_pos + ee_quat + base_pos)
  • Output: 7-DOF action
  • Parameters: ~105K total
  • Training: 20 steps, MSE loss, Adam optimizer
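
As a rough sanity check on the figures above, the parameter count of a fully connected state -> action network can be worked out by hand. The sketch below assumes four linear layers with hidden widths 256, 256, and 128; those widths are hypothetical (the card does not state them) and were picked only because they land near the reported ~105K. The helper name `mlp_param_count` is likewise illustrative.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected MLP.

    layer_sizes lists the width of every layer, input and output included,
    so N+1 sizes describe N linear layers.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total


STATE_DIM = 18    # from the card: 18-dim proprioceptive state
ACTION_DIM = 7    # from the card: 7-DOF action
HIDDEN = (256, 256, 128)  # ASSUMPTION: hidden widths are not given in the card

sizes = (STATE_DIM, *HIDDEN, ACTION_DIM)  # 5 sizes -> 4 linear layers
n_params = mlp_param_count(sizes)
print(n_params)  # 104455
```

With these assumed widths the total is 104,455, which is in the neighbourhood of the ~105K stated above; a different choice of hidden widths could match it equally well.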

Intended Use

Pipeline validation only. Do not deploy on a real robot.

Source
