---
license: apache-2.0
tags:
- robotics
- diffusion-policy
- flow-matching
- lerobot
- rram
---

# FMLP-Policy: Flow Matching MLP for Robotic Control

This project explores RRAM-compatible neural network architectures for robotic manipulation policies. It replaces the UNet with a pure MLP (Linear + ReLU layers only) so the policy can be deployed on analog RRAM accelerators.

## Overview

Diffusion Policy (DP) achieves state-of-the-art results on robotic manipulation but requires 50-100 denoising steps, which is impractical for RRAM deployment because each step needs ADC/DAC conversion. We explore three directions:

1. **Streaming Flow Policy (SFP)** reduces inference to 1-4 integration steps
2. **MLP velocity networks** replace the UNet with an RRAM-friendly architecture
3. **Quantization + noise tolerance** validates INT8 deployment under device variation

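To make the step-count argument concrete, here is a minimal NumPy sketch of a Linear + ReLU velocity network integrated with forward Euler. It is a generic flow-matching-style illustration, not the SFP formulation or the code in this repo: the layer sizes, the random weights, the input layout (action, observation feature, time), and the helper names `init_mlp`, `velocity`, and `integrate` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a Linear + ReLU MLP; sizes = [in, hidden..., out]."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Linear layers with ReLU in between; the final layer stays linear."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def velocity(params, action, obs_feat, t):
    """v(a, t | obs): concatenate the inputs and run the MLP (assumed layout)."""
    return mlp_forward(params, np.concatenate([action, obs_feat, [t]]))

def integrate(params, a0, obs_feat, num_steps):
    """Forward-Euler integration of da/dt = v(a, t | obs) from t=0 to t=1."""
    a, dt = a0.copy(), 1.0 / num_steps
    for k in range(num_steps):
        a = a + dt * velocity(params, a, obs_feat, k * dt)
    return a

ACTION_DIM, OBS_DIM = 2, 16  # hypothetical sizes, not taken from the models
params = init_mlp([ACTION_DIM + OBS_DIM + 1, 64, 64, ACTION_DIM])
obs = rng.normal(size=OBS_DIM)   # stand-in for an image-encoder feature
a0 = rng.normal(size=ACTION_DIM)  # starting point (an assumption)
action = integrate(params, a0, obs, num_steps=4)
print(action)
```

Each Euler step is one forward pass through the MLP, so `num_steps=4` means four network evaluations per action, compared with 50-100 for a standard denoising schedule.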
## Models

| Model | Architecture | Description |
|-------|--------------|-------------|
| [pusht_diffusion_v3](https://huggingface.co/Liyux/pusht_diffusion_v3) | ResNet18 + UNet | DP baseline, 136 episodes |
| [pusht_diffusion_v4](https://huggingface.co/Liyux/pusht_diffusion_v4) | ResNet18 + UNet | DP baseline, 226 episodes |
| [pusht_diffusion_v5](https://huggingface.co/Liyux/pusht_diffusion_v5) | ResNet18 + UNet | DP baseline, 255 episodes (best) |
| [pusht_sfp_v9](https://huggingface.co/Liyux/pusht_sfp_v9) | ResNet18 + UNet | SFP working baseline |
| [pusht_sfp_v14](https://huggingface.co/Liyux/pusht_sfp_v14) | ResNet18 + UNet | SFP with h50/k2/σ1 params |
| [pusht_sfp_v15](https://huggingface.co/Liyux/pusht_sfp_v15) | ResNet18 + MLP | SFP with cond_residual MLP (RRAM-compatible) |

## Dataset

| Dataset | Episodes | Description |
|---------|----------|-------------|
| [pusht_real_merged](https://huggingface.co/datasets/Liyux/pusht_real_merged) | 255 | Real robot Push-T task, SO-101 arm, 320x240 |

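The checkpoints and dataset listed above are hosted on the Hugging Face Hub, so one way to fetch them is with `huggingface_hub.snapshot_download`. This is only a sketch: the dictionary keys and the `download` helper are names chosen for this example, and loading the downloaded files into a policy depends on the code in the project repository, which is not shown here.

```python
# Repo IDs copied from the tables above; the dictionary keys are illustrative.
MODEL_REPOS = {
    "dp_v5": "Liyux/pusht_diffusion_v5",
    "sfp_v14": "Liyux/pusht_sfp_v14",
    "sfp_v15_mlp": "Liyux/pusht_sfp_v15",
}
DATASET_REPO = "Liyux/pusht_real_merged"

def download(repo_id, repo_type="model"):
    """Download a full repo snapshot into the local cache and return its path."""
    # Imported lazily so this file can be read without the dependency installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub
    return snapshot_download(repo_id=repo_id, repo_type=repo_type)
```

For example, `download(MODEL_REPOS["sfp_v15_mlp"])` would fetch the MLP-based SFP checkpoint, and `download(DATASET_REPO, repo_type="dataset")` would fetch the real-robot dataset. Both calls need network access.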
## Key Results

**Sim (2D Push-T):**
- MLP achieves 0.86-0.88 in FP32 vs. 0.74 for UNet
- INT8 quantization: Bottleneck128+Skip achieves 0.86
- Noise tolerance: <6% accuracy drop at 10% multiplicative noise

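As an illustration of what the INT8 and noise-tolerance tests involve, the sketch below applies symmetric per-tensor INT8 quantization to a weight matrix and then perturbs it with multiplicative noise. It assumes Gaussian noise with a 10% relative standard deviation and measures output error on a random layer. The evaluation behind the numbers above may use a different quantization scheme and noise model, and it reports task performance, not this error metric.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: returns integer codes and the scale."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 codes back to floating-point weights."""
    return q.astype(np.float64) * scale

def add_multiplicative_noise(w, rel_std):
    """Perturb each weight as w * (1 + eps), eps ~ N(0, rel_std^2) (assumed model)."""
    return w * (1.0 + rng.normal(0.0, rel_std, size=w.shape))

w = rng.normal(0.0, 0.1, size=(64, 32))  # stand-in for one Linear layer
x = rng.normal(size=64)                   # stand-in input activation

q, scale = quantize_int8(w)
w_q = dequantize(q, scale)
w_noisy = add_multiplicative_noise(w_q, rel_std=0.10)

# Relative output error of the layer against the FP reference.
y_ref = x @ w
rel_err_quant = np.linalg.norm(x @ w_q - y_ref) / np.linalg.norm(y_ref)
rel_err_noisy = np.linalg.norm(x @ w_noisy - y_ref) / np.linalg.norm(y_ref)
print(f"INT8 only: {rel_err_quant:.4f}, INT8 + 10% noise: {rel_err_noisy:.4f}")
```

In this toy setup the device-style noise dominates the quantization error, which is why the two effects are worth testing together.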
**Real Robot:**
- DP v5: >90% success rate
- SFP v9: >70% success rate
- SFP v14/v15: >80% success rate

## Code

- [Liyux3/lerobot_MLP-SFP](https://github.com/Liyux3/lerobot_MLP-SFP)

## Hardware

- Robot: SO-101 (LeRobot compatible)
- Camera: HB Camera, top-down, 320x240 @ 30 fps
- Training: 2x RTX 4090

## Citation

Coming soon.

## Acknowledgments

- [LeRobot](https://github.com/huggingface/lerobot) framework
- [Streaming Flow Policy](https://arxiv.org/abs/2505.21851) paper
- HKU EEE

---

*Part of a final-year project (FYP) at The University of Hong Kong, supervised by Prof. Han Wang.*