This repository contains fine-tuned checkpoints for MME-VLA-Suite.

We train 14 VLA variants in total, all based on the $\pi_{0.5}$ model.

| Name | Memory type | Memory representation | Memory integration | AVG Success | Released |
|---|---|---|---|---|---|
| symbolic-simple-subgoal | symbolic | simple subgoal | language concatenation | 29.00 | βœ… |
| symbolic-grounded-subgoal | symbolic | grounded subgoal | language concatenation | 33.06 | βœ… |
| perceptual-tokendrop-context | perceptual | token dropping | memory-as-context | 34.50 | βœ… |
| perceptual-tokendrop-modul | perceptual | token dropping | memory-as-modulation | 38.04 | βœ… |
| perceptual-tokendrop-expert | perceptual | token dropping | memory-as-expert | 34.86 | βœ… |
| perceptual-framesamp-context | perceptual | frame sampling | memory-as-context | 30.68 | βœ… |
| perceptual-framesamp-modul | perceptual | frame sampling | memory-as-modulation | 44.51 | βœ… |
| perceptual-framesamp-expert | perceptual | frame sampling | memory-as-expert | 36.25 | βœ… |
| recurrent-ttt-context | recurrent | TTT | memory-as-context | 22.28 | βœ… |
| recurrent-ttt-modul | recurrent | TTT | memory-as-modulation | 21.97 | ❌ |
| recurrent-ttt-expert | recurrent | TTT | memory-as-expert | 22.35 | βœ… |
| recurrent-rmt-context | recurrent | RMT | memory-as-context | 19.46 | ❌ |
| recurrent-rmt-modul | recurrent | RMT | memory-as-modulation | 20.17 | ❌ |
| recurrent-rmt-expert | recurrent | RMT | memory-as-expert | 18.15 | ❌ |

We release all symbolic and perceptual memory MME-VLA variants for research use. Because the recurrent memory variants currently underperform, we release only a subset of them; we will release updated recurrent variants once we obtain better results.
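The checkpoint names follow a regular `{memory type}-{representation}-{integration}` scheme, so the released variants can be enumerated programmatically. Below is a minimal sketch; the per-variant subfolder layout and the `allow_patterns` filter are assumptions on our part, so check the repository's file listing before downloading:

```python
# Variant names follow the table above: {memory}-{representation}-{integration}.
SYMBOLIC = ["symbolic-simple-subgoal", "symbolic-grounded-subgoal"]
PERCEPTUAL = [
    f"perceptual-{rep}-{integ}"
    for rep in ("tokendrop", "framesamp")
    for integ in ("context", "modul", "expert")
]
# Only these two recurrent variants are currently released.
RECURRENT_RELEASED = ["recurrent-ttt-context", "recurrent-ttt-expert"]

RELEASED = SYMBOLIC + PERCEPTUAL + RECURRENT_RELEASED  # 10 released variants


def download_variant(name: str) -> str:
    """Fetch one variant's files from the Hub.

    Assumes each variant lives in its own subfolder of the repo
    (hypothetical layout -- verify against the actual file listing).
    """
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(
        repo_id="Yinpei/mme_vla_suite",
        allow_patterns=[f"{name}/*"],
    )
```

For example, `download_variant("perceptual-framesamp-modul")` would fetch the strongest variant in the table (44.51 AVG Success).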
