---
license: mit
tags:
- computer-vision
- 3d-object-detection
- autonomous-driving
- multimodal-fusion
- collaborative-perception
- lidar
- camera
- opv2v
- v2xset
- dair-v2x
language:
- en
pipeline_tag: object-detection
---
# SiMO: Single-Modal-Operable Multimodal Collaborative Perception
This repository hosts pretrained checkpoints for **SiMO** (Single-Modal-Operable Multimodal Collaborative Perception), a framework for multimodal collaborative 3D object detection in autonomous driving that stays operable when one sensor modality fails.
## 📜 Paper
- **Title**: Single-Modal-Operable Multimodal Collaborative Perception
- **Conference**: ICLR 2026
- **OpenReview**: [Link](https://openreview.net/forum?id=h0iRgjTmVs)
- **arXiv**: [Link](https://arxiv.org/abs/2603.08240)
## 🚀 Key Features
- **Single-Modal Operability**: Maintains functional performance when one modality fails
- **LAMMA Fusion**: Length-Adaptive Multi-Modal Fusion module
- **PAFR Training**: Pretrain-Align-Fuse-Random Drop training strategy
- **Graceful Degradation**: 80.81 AP@30 on OPV2V-H with camera-only input, and 97.32 AP@30 with LiDAR-only input
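Random Drop is named in the list above but not spelled out in this README. As a generic illustration of modality dropout during training (a hypothetical sketch, not the SiMO implementation), the idea is to occasionally hide one modality so the detector learns to work from the other alone. The function name `random_modality_drop`, the `drop_prob` default, and the dict-of-features layout are all assumptions made for this example.

```python
import random

# Modality names used in this README; the dict layout below is an assumption.
MODALITIES = ("lidar", "camera")


def random_modality_drop(features, drop_prob=0.5, rng=random):
    """Return a copy of `features` with at most one modality removed.

    features:  dict mapping a modality name to its feature object.
    drop_prob: chance of dropping one modality for this sample
               (illustrative value, not taken from the paper).
    rng:       any object with random() and choice(), e.g. random.Random(seed).
    """
    kept = dict(features)  # never mutate the caller's dict
    present = [m for m in MODALITIES if m in kept]
    # Only drop when both modalities are present, so one always remains.
    if len(present) == 2 and rng.random() < drop_prob:
        del kept[rng.choice(present)]
    return kept
```

With `drop_prob=0.0` every sample keeps both modalities; with `drop_prob=1.0` every two-modality sample is reduced to exactly one. Passing a seeded `random.Random` as `rng` makes the drops reproducible.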
## 📦 Available Models
| Model | Dataset | Architecture | Checkpoint |
|-------|---------|--------------|------------|
| SiMO-PF | OPV2V-H | Pyramid Fusion + LAMMA | [Download](https://huggingface.co/DempseyWen/SiMO/blob/main/SiMO_PF/net_epoch27.pth) |
| SiMO-AttFuse | OPV2V-H | AttFusion + LAMMA | [Download](https://huggingface.co/DempseyWen/SiMO/blob/main/SiMO_AF/net_epoch21.pth) |
## 📊 Performance
### OPV2V-H (with Random Drop)
| Modality | AP@30 | AP@50 | AP@70 |
|----------|-------|-------|-------|
| LiDAR + Camera | 98.30 | 97.94 | 94.64 |
| LiDAR-only | 97.32 | 97.07 | 94.06 |
| Camera-only | 80.81 | 69.63 | 44.82 |
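To make the degradation in the table concrete, the short script below computes the absolute AP drop of each single-modality setting relative to LiDAR + Camera. The numbers are copied from the table; the `RESULTS` dict and the `ap_drop` helper are names introduced only for this example.

```python
# AP values copied from the OPV2V-H table above.
RESULTS = {
    "LiDAR + Camera": {"AP@30": 98.30, "AP@50": 97.94, "AP@70": 94.64},
    "LiDAR-only": {"AP@30": 97.32, "AP@50": 97.07, "AP@70": 94.06},
    "Camera-only": {"AP@30": 80.81, "AP@50": 69.63, "AP@70": 44.82},
}


def ap_drop(modality, reference="LiDAR + Camera"):
    """Absolute AP drop (in points) of `modality` relative to `reference`."""
    ref = RESULTS[reference]
    return {k: round(ref[k] - RESULTS[modality][k], 2) for k in ref}


for name in ("LiDAR-only", "Camera-only"):
    print(name, ap_drop(name))
```

LiDAR-only stays within one point of the multimodal result at every threshold (0.98, 0.87, and 0.58 points), while camera-only loses 17.49 points at AP@30 and 49.82 points at AP@70.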
## 💻 Usage
### Installation
```bash
git clone https://github.com/dempsey-wen/SiMO.git
cd SiMO
pip install -r requirements.txt
```
### Download Checkpoint
```bash
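# Install the Hugging Face Hub client
pip install huggingface-hub
# Download a specific checkpoint and print its local path. The filename is the path
# inside this model repo: SiMO_PF/net_epoch27.pth or SiMO_AF/net_epoch21.pth
python -c "from huggingface_hub import hf_hub_download; print(hf_hub_download(repo_id='DempseyWen/SiMO', filename='SiMO_PF/net_epoch27.pth'))"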
```
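The same download can be done from a Python script, which is handy when switching between the two checkpoints. This is a minimal sketch: the checkpoint paths come from the Available Models table, while the helper names (`checkpoint_filename`, `download_checkpoint`, `summarize_checkpoint`) are hypothetical, and the assumption that each `.pth` file is a `torch.save()`'d dict such as a state dict is not confirmed by this README.

```python
from pathlib import Path

REPO_ID = "DempseyWen/SiMO"

# Paths inside the model repo, taken from the Available Models table.
CHECKPOINTS = {
    "SiMO-PF": "SiMO_PF/net_epoch27.pth",
    "SiMO-AttFuse": "SiMO_AF/net_epoch21.pth",
}


def checkpoint_filename(model_name):
    """Map a model name from the table to its file path in the repo."""
    try:
        return CHECKPOINTS[model_name]
    except KeyError:
        raise ValueError(
            f"unknown model {model_name!r}; choose from {sorted(CHECKPOINTS)}"
        ) from None


def download_checkpoint(model_name, cache_dir=None):
    """Download one checkpoint and return its local path."""
    # Imported lazily so checkpoint_filename() works without the package.
    from huggingface_hub import hf_hub_download

    local = hf_hub_download(
        repo_id=REPO_ID,
        filename=checkpoint_filename(model_name),
        cache_dir=cache_dir,
    )
    return Path(local)


def summarize_checkpoint(path):
    """Load the file on CPU and return its number of top-level entries.

    Assumes the file holds a dict (e.g. a state dict); adjust if it does not.
    """
    import torch

    state = torch.load(path, map_location="cpu")
    return len(state)
```

Calling `download_checkpoint("SiMO-PF")` returns the cached local path, which can then be passed to `summarize_checkpoint` or to the loading code in the GitHub repository.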
## 📖 Full Documentation
For complete documentation, training scripts, and data preparation instructions, please visit our [GitHub repository](https://github.com/dempsey-wen/SiMO).
## 🏒 Acknowledgements
This work builds upon:
- [OpenCOOD](https://github.com/DerrickXuNu/OpenCOOD)
- [HEAL](https://github.com/yifanlu0227/HEAL)
## 📄 Citation
If you find this work useful, please cite:
```bibtex
@inproceedings{wen2026simo,
title={Single-Modal-Operable Multimodal Collaborative Perception},
author={Wen, Dempsey and Lu, Yifan and others},
booktitle={International Conference on Learning Representations (ICLR)},
year={2026}
}
```
## 📄 License
MIT License - see [LICENSE](https://github.com/dempsey-wen/SiMO/blob/main/LICENSE) for details.