Update README.md
README.md
CHANGED
|
@@ -1,3 +1,550 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
**[CVPR'25] Official PyTorch implementation of "[**MobileMamba: Lightweight Multi-Receptive Visual Mamba Network**](https://arxiv.org/pdf/2411.15941)".**
|
| 6 |
+
---
|
| 7 |
+
[arXiv](https://arxiv.org/abs/2411.15941)
|
| 8 |
+
[GitHub](https://github.com/lewandofskee/MobileMamba)
|
| 9 |
+
|
| 10 |
+
[Haoyang He<sup>1*</sup>](https://scholar.google.com/citations?hl=zh-CN&user=8NfQv1sAAAAJ),
|
| 11 |
+
[Jiangning Zhang<sup>2*</sup>](https://zhangzjn.github.io),
|
| 12 |
+
[Yuxuan Cai<sup>3</sup>](https://scholar.google.com/citations?hl=zh-CN&user=J9lTFAUAAAAJ),
|
| 13 |
+
[Hongxu Chen<sup>1</sup>](https://scholar.google.com/citations?hl=zh-CN&user=uFT3YfMAAAAJ),
|
| 14 |
+
[Xiaobin Hu<sup>2</sup>](https://scholar.google.com/citations?hl=zh-CN&user=3lMuodUAAAAJ),
|
| 15 |
+
|
| 16 |
+
[Zhenye Gan<sup>2</sup>](https://scholar.google.com/citations?user=fa4NkScAAAAJ&hl=zh-CN),
|
| 17 |
+
[Yabiao Wang<sup>2</sup>](https://scholar.google.com/citations?user=xiK4nFUAAAAJ&hl=zh-CN),
|
| 18 |
+
[Chengjie Wang<sup>2</sup>](https://scholar.google.com/citations?hl=zh-CN&user=fqte5H4AAAAJ),
|
| 19 |
+
Yunsheng Wu<sup>2</sup>,
|
| 20 |
+
[Lei Xie<sup>1†</sup>](https://scholar.google.com/citations?hl=zh-CN&user=7ZZ_-m0AAAAJ)
|
| 21 |
+
|
| 22 |
+
<sup>1</sup>College of Control Science and Engineering, Zhejiang University,
|
| 23 |
+
<sup>2</sup>Youtu Lab, Tencent,
|
| 24 |
+
<sup>3</sup>Huazhong University of Science and Technology
|
| 25 |
+
> **Abstract:** Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs. CNNs, with their local receptive fields, struggle to capture long-range dependencies, while Transformers, despite their global modeling capabilities, are limited by quadratic computational complexity in high-resolution scenarios. Recently, state-space models have gained popularity in the visual domain due to their linear computational complexity. Despite their low FLOPs, current lightweight Mamba-based models exhibit suboptimal throughput.
|
| 26 |
+
In this work, we propose the MobileMamba framework, which balances efficiency and performance. We design a three-stage network to enhance inference speed significantly. At a fine-grained level, we introduce the Multi-Receptive Field Feature Interaction (MRFFI) module, comprising the Long-Range Wavelet Transform-Enhanced Mamba (WTE-Mamba), Efficient Multi-Kernel Depthwise Convolution (MK-DWConv), and Eliminate Redundant Identity components. This module integrates multi-receptive field information and enhances high-frequency detail extraction. Additionally, we employ training and testing strategies to further improve performance and efficiency.
|
| 27 |
+
MobileMamba achieves up to 83.6% Top-1 accuracy, surpassing existing state-of-the-art methods, and is up to 21× faster than LocalVim on GPU. Extensive experiments on high-resolution downstream tasks demonstrate that MobileMamba surpasses current efficient models, achieving an optimal balance between speed and accuracy.
|
| 28 |
+
|
| 29 |
+
|
| 30 |
+
------
|
| 31 |
+
# Classification Results
|
| 32 |
+
### Image Classification for [ImageNet-1K](https://www.image-net.org):
|
| 33 |
+
| Model | FLOPs | #Params | Resolution | Top-1 | Cfg | Log | Model |
|
| 34 |
+
|--------------------------|:-----:|:-------:|:----------:|:-----:|:---------------------------------------------:|:--------------------------------------------------:|:----------------------------------------------------:|
|
| 35 |
+
| MobileMamba-T2 | 255M | 8.8M | 192 x 192 | 73.6 | [cfg](configs/mobilemamba/mobilemamba_t2.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T2/mobilemamba_t2.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T2/mobilemamba_t2.pth) |
|
| 36 |
+
| MobileMamba-T2† | 255M | 8.8M | 192 x 192 | 76.9 | [cfg](configs/mobilemamba/mobilemamba_t2s.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T2s/mobilemamba_t2s.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T2s/mobilemamba_t2s.pth) |
|
| 37 |
+
| MobileMamba-T4 | 413M | 14.2M | 192 x 192 | 76.1 | [cfg](configs/mobilemamba/mobilemamba_t4.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T4/mobilemamba_t4.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T4/mobilemamba_t4.pth) |
|
| 38 |
+
| MobileMamba-T4† | 413M | 14.2M | 192 x 192 | 78.9 | [cfg](configs/mobilemamba/mobilemamba_t4s.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T4s/mobilemamba_t4s.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T4s/mobilemamba_t4s.pth) |
|
| 39 |
+
| MobileMamba-S6 | 652M | 15.0M | 224 x 224 | 78.0 | [cfg](configs/mobilemamba/mobilemamba_s6.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_S6/mobilemamba_s6.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_S6/mobilemamba_s6.pth) |
|
| 40 |
+
| MobileMamba-S6† | 652M | 15.0M | 224 x 224 | 80.7 | [cfg](configs/mobilemamba/mobilemamba_s6s.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_S6s/mobilemamba_s6s.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_S6s/mobilemamba_s6s.pth) |
|
| 41 |
+
| MobileMamba-B1 | 1080M | 17.1M | 256 x 256 | 79.9 | [cfg](configs/mobilemamba/mobilemamba_b1.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B1/mobilemamba_b1.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B1/mobilemamba_b1.pth) |
|
| 42 |
+
| MobileMamba-B1† | 1080M | 17.1M | 256 x 256 | 82.2 | [cfg](configs/mobilemamba/mobilemamba_b1s.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B1s/mobilemamba_b1s.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B1s/mobilemamba_b1s.pth) |
|
| 43 |
+
| MobileMamba-B2 | 2427M | 17.1M | 384 x 384 | 81.6 | [cfg](configs/mobilemamba/mobilemamba_b2.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B2/mobilemamba_b2.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B2/mobilemamba_b2.pth) |
|
| 44 |
+
| MobileMamba-B2† | 2427M | 17.1M | 384 x 384 | 83.3 | [cfg](configs/mobilemamba/mobilemamba_b2s.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B2s/mobilemamba_b2s.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B2s/mobilemamba_b2s.pth) |
|
| 45 |
+
| MobileMamba-B4 | 4313M | 17.1M | 512 x 512 | 82.5 | [cfg](configs/mobilemamba/mobilemamba_b4.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B4/mobilemamba_b4.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B4/mobilemamba_b4.pth) |
|
| 46 |
+
| MobileMamba-B4† | 4313M | 17.1M | 512 x 512 | 83.6 | [cfg](configs/mobilemamba/mobilemamba_b4s.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B4s/mobilemamba_b4s.txt) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_B4s/mobilemamba_b4s.pth) |
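
The B1, B2, and B4 rows list the same parameter count (17.1M) and differ only in input resolution, which suggests (an inference from the table, not a statement by the authors) that their FLOPs should grow in step with the number of input pixels. The sketch below checks that reading against the table's own numbers; the helper name `scaling_vs_b1` is illustrative.

```python
# (input resolution, FLOPs in millions) copied from the B1, B2 and B4 rows above.
B_SERIES = {"B1": (256, 1080), "B2": (384, 2427), "B4": (512, 4313)}


def scaling_vs_b1(name: str) -> tuple[float, float]:
    """Return (pixel-count ratio, FLOPs ratio) of a variant relative to B1."""
    base_res, base_flops = B_SERIES["B1"]
    res, flops = B_SERIES[name]
    return (res / base_res) ** 2, flops / base_flops


for name in B_SERIES:
    pixels, flops = scaling_vs_b1(name)
    print(f"MobileMamba-{name}: pixels x{pixels:.2f}, FLOPs x{flops:.2f}")
```

In these rows the two ratios stay within about one percent of each other, so the compute cost tracks the pixel count almost linearly.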
|
| 47 |
+
|
| 48 |
+
------
|
| 49 |
+
# Downstream Results
|
| 50 |
+
## Object Detection and Instance Segmentation Results
|
| 51 |
+
### Object Detection and Instance Segmentation Performance Based on [Mask-RCNN](https://openaccess.thecvf.com/content_ICCV_2017/papers/He_Mask_R-CNN_ICCV_2017_paper.pdf) for [COCO2017](https://cocodataset.org):
|
| 52 |
+
| Backbone | AP<sup>b</sup> | AP<sup>b</sup><sub>50</sub> | AP<sup>b</sup><sub>75</sub> | AP<sup>b</sup><sub>S</sub> | AP<sup>b</sup><sub>M</sub> | AP<sup>b</sup><sub>L</sub> | AP<sup>m</sup> | AP<sup>m</sup><sub>50</sub> | AP<sup>m</sup><sub>75</sub> | AP<sup>m</sup><sub>S</sub> | AP<sup>m</sup><sub>M</sub> | AP<sup>m</sup><sub>L</sub> | #Params | FLOPs | Cfg | Log | Model |
|
| 53 |
+
|:--------:|:--------------:|:---------------------------:|:---------------------------:|:--------------------------:|:--------------------------:|:--------------------------:|:--------------:|:---------------------------:|:---------------------------:|:--------------------------:|:--------------------------:|:--------------------------:|:-------:|:-----:|:------------------------------------------:|:------------------------------------------:|:--------------------------------------------:|
|
| 54 |
+
| MobileMamba-B1 | 40.6 | 61.8 | 43.8 | 22.4 | 43.5 | 55.9 | 37.4 | 58.9 | 39.9 | 17.1 | 39.9 | 56.4 | 38.0M | 178G | [cfg](downstream/det/configs/mask_rcnn/mask-rcnn_mobilemamba_b1_fpn_1x_coco.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/maskrcnn.log) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/maskrcnn.pth) |
|
| 55 |
+
|
| 56 |
+
### Object Detection Performance Based on [RetinaNet](https://openaccess.thecvf.com/content_ICCV_2017/papers/Lin_Focal_Loss_for_ICCV_2017_paper.pdf) for [COCO2017](https://cocodataset.org):
|
| 57 |
+
| Backbone | AP | AP<sub>50</sub> | AP<sub>75</sub> | AP<sub>S</sub> | AP<sub>M</sub> | AP<sub>L</sub> | #Params | FLOPs | Cfg | Log | Model |
|
| 58 |
+
|:--------:|:----:|:---------------:|:---------------:|:--------------:|:--------------:|:--------------:|:-------:|:-----:|:-------------------------------------------:|:-------------------------------------------:|:---------------------------------------------:|
|
| 59 |
+
| MobileMamba-B1 | 39.6 | 59.8 | 42.4 | 21.5 | 43.4 | 53.9 | 27.1M | 151G | [cfg](downstream/det/configs/retinanet/retinanet_mobilemamba_b1_fpn_1x_coco.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/retinanet.log) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/retinanet.pth) |
|
| 60 |
+
|
| 61 |
+
### Object Detection Performance Based on [SSDLite](https://openaccess.thecvf.com/content_ICCV_2019/papers/Howard_Searching_for_MobileNetV3_ICCV_2019_paper.pdf) for [COCO2017](https://cocodataset.org):
|
| 62 |
+
| Backbone | AP | AP<sub>50</sub> | AP<sub>75</sub> | AP<sub>S</sub> | AP<sub>M</sub> | AP<sub>L</sub> | #Params | FLOPs | Cfg | Log | Model |
|
| 63 |
+
|:-------------------:|:----:|:---------------:|:---------------:|:--------------:|:--------------:|:--------------:|:-------:|:-----:|:-----------------------------------------------------------------------------:|:------------------------------:|:-----------------------------------------------:|
|
| 64 |
+
| MobileMamba-B1 | 24.0 | 39.5 | 24.0 | 3.1 | 23.4 | 46.9 | 18.0M | 1.7G | [cfg](downstream/det/configs/ssd/ssdlite_mobilemamba_b1_8gpu_2lr_coco.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/ssdlite.log) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/ssdlite.pth) |
|
| 65 |
+
| MobileMamba-B1-r512 | 29.5 | 47.7 | 30.4 | 8.9 | 35.0 | 47.0 | 18.0M | 4.4G | [cfg](downstream/det/configs/ssd/ssdlite_mobilemamba_b1_8gpu_2lr_512_coco.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/ssdlite_512.log) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/det/ssdlite_512.pth) |
|
| 66 |
+
|
| 67 |
+
## Semantic Segmentation Results
|
| 68 |
+
### Semantic Segmentation Based on [Semantic FPN](https://openaccess.thecvf.com/content_CVPR_2019/papers/Kirillov_Panoptic_Feature_Pyramid_Networks_CVPR_2019_paper.pdf) for [ADE20k](http://sceneparsing.csail.mit.edu/):
|
| 69 |
+
| Backbone | aAcc | mIoU | mAcc | #Params | FLOPs | Cfg | Log | Model |
|
| 70 |
+
|:--------:|:----:|:----:|:----:|:-------:|:-----:|:-------------------------------------:|:-------------------------------------:|:---------------------------------------:|
|
| 71 |
+
| MobileMamba-B4 | 79.9 | 42.5 | 53.7 | 19.8M | 5.6G | [cfg](downstream/seg/configs/sem_fpn/fpn_mobilemamba_b4-160k_ade20k-512x512.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/seg/fpn.log) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/seg/fpn.pth) |
|
| 72 |
+
|
| 73 |
+
|
| 74 |
+
### Semantic Segmentation Based on [DeepLabv3](https://arxiv.org/pdf/1706.05587.pdf) for [ADE20k](http://sceneparsing.csail.mit.edu/):
|
| 75 |
+
| Backbone | aAcc | mIoU | mAcc | #Params | FLOPs | Cfg | Log | Model |
|
| 76 |
+
|:--------------:|:----:|:----:|:----:|:-------:|:-----:|:-------------------------------------------:|:-------------------------------------------:|:---------------------------------------------:|
|
| 77 |
+
| MobileMamba-B4 | 76.3 | 36.6 | 47.1 | 23.4M | 4.7G | [cfg](downstream/seg/configs/deeplabv3/deeplabv3_mobilemamba_b4-80k_ade20k-512x512.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/seg/deeplabv3.log) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/seg/deeplabv3.pth) |
|
| 78 |
+
|
| 79 |
+
|
| 80 |
+
### Semantic Segmentation Based on [PSPNet](https://openaccess.thecvf.com/content_cvpr_2017/papers/Zhao_Pyramid_Scene_Parsing_CVPR_2017_paper.pdf) for [ADE20k](http://sceneparsing.csail.mit.edu/):
|
| 81 |
+
| Backbone | aAcc | mIoU | mAcc | #Params | FLOPs | Cfg | Log | Model |
|
| 82 |
+
|:--------:|:----:|:----:|:----:|:-------:|:-----:|:----------------------------------------:|:----------------------------------------:|:------------------------------------------:|
|
| 83 |
+
| MobileMamba-B4 | 76.2 | 36.9 | 47.9 | 20.5M | 4.5G | [cfg](downstream/seg/configs/pspnet/pspnet_mobilemamba_b4-80k_ade20k-512x512.py) | [log](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/seg/pspnet.log) | [model](https://huggingface.co/Lewandofski/MobileMamba/blob/main/downstream/seg/pspnet.pth) |
|
| 84 |
+
|
| 85 |
+
------
|
| 86 |
+
# All Pretrained Weights and Logs
|
| 87 |
+
|
| 88 |
+
The model weights and log files for all classification and downstream tasks are available for download via [Google Drive](https://drive.google.com/file/d/1EDqWI6JKMaLZRSRWt9aM7VXaNvosStGE/view?usp=drive_link) and [Hugging Face](https://huggingface.co/Lewandofski/MobileMamba/tree/main).
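
The Hugging Face links in the result tables point at `/blob/` pages, which open the file viewer rather than the file itself. A minimal sketch for scripted downloads, assuming the Hub's usual `/resolve/` URL form; the helper name `direct_download_url` is illustrative and not part of this repository.

```python
def direct_download_url(blob_url: str) -> str:
    """Turn a Hugging Face '/blob/' page URL into a '/resolve/' file URL."""
    marker = "/blob/"
    if marker not in blob_url:
        raise ValueError(f"not a Hugging Face blob URL: {blob_url}")
    # Only the first occurrence is the revision separator.
    return blob_url.replace(marker, "/resolve/", 1)


url = "https://huggingface.co/Lewandofski/MobileMamba/blob/main/MobileMamba_T2/mobilemamba_t2.pth"
print(direct_download_url(url))
```

The resulting URL can be passed to any downloader and the file saved under `weights/`, the directory the test commands read checkpoints from.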
|
| 89 |
+
|
| 90 |
+
------
|
| 91 |
+
# Classification
|
| 92 |
+
## Environments
|
| 93 |
+
```shell
|
| 94 |
+
pip3 install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
|
| 95 |
+
pip3 install timm==0.6.5 tensorboardX einops torchprofile fvcore==0.1.5.post20221221
|
| 96 |
+
cd model/lib_mamba/kernels/selective_scan && pip install . && cd ../../../..
|
| 97 |
+
# Optional: build NVIDIA Apex with its C++ and CUDA extensions
git clone https://github.com/NVIDIA/apex && cd apex && pip3 install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
|
| 98 |
+
```
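
Because the commands above pin exact versions, a quick check after installation can catch a mismatched environment early. The following is a sketch rather than part of the repository: the `PINNED` table is copied from the pip commands, and `check_pins` is a hypothetical helper built on the standard-library `importlib.metadata`.

```python
from importlib import metadata

# Versions pinned by the pip commands above.
PINNED = {
    "torch": "2.1.2",
    "torchvision": "0.16.2",
    "torchaudio": "2.1.2",
    "timm": "0.6.5",
    "fvcore": "0.1.5.post20221221",
}


def check_pins(pins: dict[str, str]) -> dict[str, str]:
    """Return {package: problem} for every pin that is missing or mismatched."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Wheels from the PyTorch index carry a local tag such as "+cu118".
        if found.split("+")[0] != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


for name, problem in check_pins(PINNED).items():
    print(f"{name}: {problem}")
```

An empty result means every pinned package is installed at the expected version.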
|
| 99 |
+
|
| 100 |
+
## Prepare ImageNet-1K Dataset
|
| 101 |
+
Download and extract the [ImageNet-1K](http://image-net.org/) dataset into the following directory structure:
|
| 102 |
+
|
| 103 |
+
```
|
| 104 |
+
├── imagenet
|
| 105 |
+
│   ├── train
|
| 106 |
+
│   │   ├── n01440764
|
| 107 |
+
│   │   │   ├── n01440764_10026.JPEG
|
| 108 |
+
│   │   │   ├── ...
|
| 109 |
+
│   │   ├── ...
|
| 110 |
+
│   ├── train.txt (optional)
|
| 111 |
+
│   ├── val
|
| 112 |
+
│   │   ├── n01440764
|
| 113 |
+
│   │   │   ├── ILSVRC2012_val_00000293.JPEG
|
| 114 |
+
│   │   │   ├── ...
|
| 115 |
+
│   │   ├── ...
|
| 116 |
+
│   ├── val.txt (optional)
|
| 117 |
+
```
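
A quick sanity check of the extracted dataset is to count the class folders in each split. This is a minimal sketch assuming the layout above, with `train` and `val` each holding one sub-folder per class; `summarize_imagenet` is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def summarize_imagenet(root: str) -> dict[str, int]:
    """Count class folders under <root>/train and <root>/val."""
    counts = {}
    for split in ("train", "val"):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Only directories count as classes; stray files are ignored.
        counts[split] = sum(1 for p in split_dir.iterdir() if p.is_dir())
    return counts
```

For a complete ImageNet-1K copy, `summarize_imagenet("path/to/imagenet")` should report 1000 class folders for both splits.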
|
| 118 |
+
|
| 119 |
+
## Test
|
| 120 |
+
Test with 8 GPUs in one node:
|
| 121 |
+
|
| 122 |
+
<details>
|
| 123 |
+
<summary>
|
| 124 |
+
MobileMamba-T2
|
| 125 |
+
</summary>
|
| 126 |
+
|
| 127 |
+
```
|
| 128 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t2 -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_T2/mobilemamba_t2.pth
|
| 129 |
+
```
|
| 130 |
+
This should give `Top-1: 73.638 (Top-5: 91.422)`
|
| 131 |
+
</details>
|
| 132 |
+
|
| 133 |
+
<details>
|
| 134 |
+
<summary>
|
| 135 |
+
MobileMamba-T2†
|
| 136 |
+
</summary>
|
| 137 |
+
|
| 138 |
+
```
|
| 139 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t2s -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_T2s/mobilemamba_t2s.pth
|
| 140 |
+
```
|
| 141 |
+
This should give `Top-1: 76.934 (Top-5: 93.100)`
|
| 142 |
+
</details>
|
| 143 |
+
|
| 144 |
+
<details>
|
| 145 |
+
<summary>
|
| 146 |
+
MobileMamba-T4
|
| 147 |
+
</summary>
|
| 148 |
+
|
| 149 |
+
```
|
| 150 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t4 -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_T4/mobilemamba_t4.pth
|
| 151 |
+
```
|
| 152 |
+
This should give `Top-1: 76.086 (Top-5: 92.772)`
|
| 153 |
+
</details>
|
| 154 |
+
|
| 155 |
+
<details>
|
| 156 |
+
<summary>
|
| 157 |
+
MobileMamba-T4†
|
| 158 |
+
</summary>
|
| 159 |
+
|
| 160 |
+
```
|
| 161 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t4s -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_T4s/mobilemamba_t4s.pth
|
| 162 |
+
```
|
| 163 |
+
This should give `Top-1: 78.914 (Top-5: 94.160)`
|
| 164 |
+
</details>
|
| 165 |
+
|
| 166 |
+
<details>
|
| 167 |
+
<summary>
|
| 168 |
+
MobileMamba-S6
|
| 169 |
+
</summary>
|
| 170 |
+
|
| 171 |
+
```
|
| 172 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_s6 -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_S6/mobilemamba_s6.pth
|
| 173 |
+
```
|
| 174 |
+
This should give `Top-1: 78.002 (Top-5: 93.992)`
|
| 175 |
+
</details>
|
| 176 |
+
|
| 177 |
+
<details>
|
| 178 |
+
<summary>
|
| 179 |
+
MobileMamba-S6†
|
| 180 |
+
</summary>
|
| 181 |
+
|
| 182 |
+
```
|
| 183 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_s6s -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_S6s/mobilemamba_s6s.pth
|
| 184 |
+
```
|
| 185 |
+
This should give `Top-1: 80.742 (Top-5: 95.182)`
|
| 186 |
+
</details>
|
| 187 |
+
|
| 188 |
+
<details>
|
| 189 |
+
<summary>
|
| 190 |
+
MobileMamba-B1
|
| 191 |
+
</summary>
|
| 192 |
+
|
| 193 |
+
```
|
| 194 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b1 -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_B1/mobilemamba_b1.pth
|
| 195 |
+
```
|
| 196 |
+
This should give `Top-1: 79.948 (Top-5: 94.924)`
|
| 197 |
+
</details>
|
| 198 |
+
|
| 199 |
+
<details>
|
| 200 |
+
<summary>
|
| 201 |
+
MobileMamba-B1†
|
| 202 |
+
</summary>
|
| 203 |
+
|
| 204 |
+
```
|
| 205 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b1s -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_B1s/mobilemamba_b1s.pth
|
| 206 |
+
```
|
| 207 |
+
This should give `Top-1: 82.234 (Top-5: 95.872)`
|
| 208 |
+
</details>
|
| 209 |
+
|
| 210 |
+
<details>
|
| 211 |
+
<summary>
|
| 212 |
+
MobileMamba-B2
|
| 213 |
+
</summary>
|
| 214 |
+
|
| 215 |
+
```
|
| 216 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b2 -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_B2/mobilemamba_b2.pth
|
| 217 |
+
```
|
| 218 |
+
This should give `Top-1: 81.624 (Top-5: 95.890)`
|
| 219 |
+
</details>
|
| 220 |
+
|
| 221 |
+
<details>
|
| 222 |
+
<summary>
|
| 223 |
+
MobileMamba-B2†
|
| 224 |
+
</summary>
|
| 225 |
+
|
| 226 |
+
```
|
| 227 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b2s -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_B2s/mobilemamba_b2s.pth
|
| 228 |
+
```
|
| 229 |
+
This should give `Top-1: 83.260 (Top-5: 96.438)`
|
| 230 |
+
</details>
|
| 231 |
+
|
| 232 |
+
<details>
|
| 233 |
+
<summary>
|
| 234 |
+
MobileMamba-B4
|
| 235 |
+
</summary>
|
| 236 |
+
|
| 237 |
+
```
|
| 238 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b4 -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_B4/mobilemamba_b4.pth
|
| 239 |
+
```
|
| 240 |
+
This should give `Top-1: 82.496 (Top-5: 96.252)`
|
| 241 |
+
</details>
|
| 242 |
+
|
| 243 |
+
<details>
|
| 244 |
+
<summary>
|
| 245 |
+
MobileMamba-B4†
|
| 246 |
+
</summary>
|
| 247 |
+
|
| 248 |
+
```
|
| 249 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b4s -m test model.model_kwargs.checkpoint_path=weights/MobileMamba_B4s/mobilemamba_b4s.pth
|
| 250 |
+
```
|
| 251 |
+
This should give `Top-1: 83.644 (Top-5: 96.606)`
|
| 252 |
+
</details>
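
To compare an evaluation run against the results table automatically, the `Top-1` value can be parsed from the reported line and checked against the one-decimal table entry. The sketch below assumes the output contains a line in the `Top-1: ... (Top-5: ...)` form quoted above; `parse_top1` and `matches_table` are illustrative names.

```python
import re


def parse_top1(line: str) -> float:
    """Extract the Top-1 value from a line like 'Top-1: 76.934 (Top-5: 93.100)'."""
    match = re.search(r"Top-1:\s*([0-9]+(?:\.[0-9]+)?)", line)
    if match is None:
        raise ValueError(f"no Top-1 value in: {line!r}")
    return float(match.group(1))


def matches_table(line: str, table_value: float, tol: float = 0.05) -> bool:
    """True if the evaluated Top-1 agrees with the one-decimal table value."""
    return abs(parse_top1(line) - table_value) <= tol


# Expected output quoted above for the mobilemamba_b4s config, table value 83.6.
print(matches_table("Top-1: 83.644 (Top-5: 96.606)", 83.6))
```

The default tolerance of 0.05 corresponds to rounding to one decimal place, which is how the table values appear to be derived from the evaluation output.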
|
| 253 |
+
|
| 254 |
+
|
| 255 |
+
## Train
|
| 256 |
+
Train with 8 GPUs in one node:
|
| 257 |
+
|
| 258 |
+
<details>
|
| 259 |
+
<summary>
|
| 260 |
+
MobileMamba-T2
|
| 261 |
+
</summary>
|
| 262 |
+
|
| 263 |
+
```
|
| 264 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t2 -m train
|
| 265 |
+
```
|
| 266 |
+
</details>
|
| 267 |
+
|
| 268 |
+
<details>
|
| 269 |
+
<summary>
|
| 270 |
+
MobileMamba-T2†
|
| 271 |
+
</summary>
|
| 272 |
+
|
| 273 |
+
```
|
| 274 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t2s -m train
|
| 275 |
+
```
|
| 276 |
+
</details>
|
| 277 |
+
|
| 278 |
+
<details>
|
| 279 |
+
<summary>
|
| 280 |
+
MobileMamba-T4
|
| 281 |
+
</summary>
|
| 282 |
+
|
| 283 |
+
```
|
| 284 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t4 -m train
|
| 285 |
+
```
|
| 286 |
+
</details>
|
| 287 |
+
|
| 288 |
+
<details>
|
| 289 |
+
<summary>
|
| 290 |
+
MobileMamba-T4†
|
| 291 |
+
</summary>
|
| 292 |
+
|
| 293 |
+
```
|
| 294 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_t4s -m train
|
| 295 |
+
```
|
| 296 |
+
</details>
|
| 297 |
+
|
| 298 |
+
<details>
|
| 299 |
+
<summary>
|
| 300 |
+
MobileMamba-S6
|
| 301 |
+
</summary>
|
| 302 |
+
|
| 303 |
+
```
|
| 304 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_s6 -m train
|
| 305 |
+
```
|
| 306 |
+
</details>
|
| 307 |
+
|
| 308 |
+
<details>
|
| 309 |
+
<summary>
|
| 310 |
+
MobileMamba-S6†
|
| 311 |
+
</summary>
|
| 312 |
+
|
| 313 |
+
```
|
| 314 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_s6s -m train
|
| 315 |
+
```
|
| 316 |
+
</details>
|
| 317 |
+
|
| 318 |
+
<details>
|
| 319 |
+
<summary>
|
| 320 |
+
MobileMamba-B1
|
| 321 |
+
</summary>
|
| 322 |
+
|
| 323 |
+
```
|
| 324 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b1 -m train
|
| 325 |
+
```
|
| 326 |
+
</details>
|
| 327 |
+
|
| 328 |
+
<details>
|
| 329 |
+
<summary>
|
| 330 |
+
MobileMamba-B1†
|
| 331 |
+
</summary>
|
| 332 |
+
|
| 333 |
+
```
|
| 334 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b1s -m train
|
| 335 |
+
```
|
| 336 |
+
</details>
|
| 337 |
+
|
| 338 |
+
<details>
|
| 339 |
+
<summary>
|
| 340 |
+
MobileMamba-B2
|
| 341 |
+
</summary>
|
| 342 |
+
|
| 343 |
+
```
|
| 344 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b2 -m train
|
| 345 |
+
```
|
| 346 |
+
</details>
|
| 347 |
+
|
| 348 |
+
<details>
|
| 349 |
+
<summary>
|
| 350 |
+
MobileMamba-B2†
|
| 351 |
+
</summary>
|
| 352 |
+
|
| 353 |
+
```
|
| 354 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b2s -m train
|
| 355 |
+
```
|
| 356 |
+
</details>
|
| 357 |
+
|
| 358 |
+
<details>
|
| 359 |
+
<summary>
|
| 360 |
+
MobileMamba-B4
|
| 361 |
+
</summary>
|
| 362 |
+
|
| 363 |
+
```
|
| 364 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b4 -m train
|
| 365 |
+
```
|
| 366 |
+
</details>
|
| 367 |
+
|
| 368 |
+
<details>
|
| 369 |
+
<summary>
|
| 370 |
+
MobileMamba-B4†
|
| 371 |
+
</summary>
|
| 372 |
+
|
| 373 |
+
```
|
| 374 |
+
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --use_env run.py -c configs/mobilemamba/mobilemamba_b4s -m train
|
| 375 |
+
```
|
| 376 |
+
</details>
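
The test and train commands above differ only in the config name, the mode, and (for testing) the checkpoint path. The sketch below generates them from a short variant key such as `t2` or `b4s`. `launch_command` is a hypothetical helper, and the `gpus` parameter is an added convenience: whether the configs assume 8 GPUs for their batch size and learning rate is not stated here.

```python
import shlex


def launch_command(variant: str, mode: str = "train", gpus: int = 8) -> str:
    """Build the single-node launch command for a config such as 't2' or 'b4s'."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    args = [
        "python3", "-m", "torch.distributed.launch",
        f"--nproc_per_node={gpus}", "--nnodes=1", "--use_env",
        "run.py", "-c", f"configs/mobilemamba/mobilemamba_{variant}", "-m", mode,
    ]
    if mode == "test":
        # Checkpoint layout follows the test commands above, e.g.
        # weights/MobileMamba_T2s/mobilemamba_t2s.pth for the 't2s' config.
        folder = f"MobileMamba_{variant[:2].upper()}{variant[2:]}"
        args.append(
            f"model.model_kwargs.checkpoint_path=weights/{folder}/mobilemamba_{variant}.pth"
        )
    return shlex.join(args)


for variant in ("t2", "t2s", "t4", "t4s", "s6", "s6s", "b1", "b1s", "b2", "b2s", "b4", "b4s"):
    print(launch_command(variant, mode="train"))
```

Calling `launch_command("b4s", mode="test")` yields the corresponding test command with the checkpoint path appended.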
|
| 377 |
+
|
| 378 |
+
------
|
| 379 |
+
# Downstream Tasks
|
| 380 |
+
## Environments
|
| 381 |
+
```shell
|
| 382 |
+
pip3 install terminaltables pycocotools prettytable xtcocotools
|
| 383 |
+
pip3 install mmpretrain==1.2.0 mmdet==3.3.0 mmsegmentation==1.2.2
|
| 384 |
+
pip3 install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.1/index.html
|
| 385 |
+
cd det/backbones/lib_mamba/kernels/selective_scan && pip install . && cd ../../../..
|
| 386 |
+
```
|
| 387 |
+
## Prepare COCO and ADE20k Dataset
|
| 388 |
+
Download and extract the [COCO2017](https://cocodataset.org) and [ADE20k](http://sceneparsing.csail.mit.edu/) datasets into the following directory structure:
|
| 389 |
+
|
| 390 |
+
```
|
| 391 |
+
downstream
|
| 392 |
+
├── det
|
| 393 |
+
│   ├── data
|
| 394 |
+
│   │   ├── coco
|
| 395 |
+
│   │   │   ├── annotations
|
| 396 |
+
│   │   │   ├── train2017
|
| 397 |
+
│   │   │   ├── val2017
|
| 398 |
+
│   │   │   ├── test2017
|
| 399 |
+
├── seg
|
| 400 |
+
│   ├── data
|
| 401 |
+
│   │   ├── ade
|
| 402 |
+
│   │   │   ├── ADEChallengeData2016
|
| 403 |
+
│   │   │   │   ├── annotations
|
| 404 |
+
│   │   │   │   ├── images
|
| 405 |
+
```
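
Before launching a downstream run, it can help to confirm that every dataset directory from the tree above is in place. A minimal sketch: the relative paths are read from that tree, and `missing_dirs` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Relative paths (below the downstream/ folder) taken from the directory tree above.
EXPECTED_DIRS = [
    "det/data/coco/annotations",
    "det/data/coco/train2017",
    "det/data/coco/val2017",
    "det/data/coco/test2017",
    "seg/data/ade/ADEChallengeData2016/annotations",
    "seg/data/ade/ADEChallengeData2016/images",
]


def missing_dirs(downstream_root: str) -> list[str]:
    """Return the expected dataset directories that do not exist yet."""
    root = Path(downstream_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
```

`missing_dirs("downstream")` returns an empty list once both datasets are extracted as shown.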
|
| 406 |
+
|
| 407 |
+
## Object Detection
|
| 408 |
+
<details>
|
| 409 |
+
<summary>
|
| 410 |
+
Mask-RCNN
|
| 411 |
+
</summary>
|
| 412 |
+
|
| 413 |
+
#### Train:
|
| 414 |
+
|
| 415 |
+
```
|
| 416 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/mask_rcnn/mask-rcnn_mobilemamba_b1_fpn_1x_coco.py 4
|
| 417 |
+
```
|
| 418 |
+
|
| 419 |
+
#### Test:
|
| 420 |
+
|
| 421 |
+
```
|
| 422 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_test.sh configs/mask_rcnn/mask-rcnn_mobilemamba_b1_fpn_1x_coco.py ../../weights/downstream/det/maskrcnn.pth 4
|
| 423 |
+
```
|
| 424 |
+
</details>
|
| 425 |
+
|
| 426 |
+
<details>
|
| 427 |
+
<summary>
|
| 428 |
+
RetinaNet
|
| 429 |
+
</summary>
|
| 430 |
+
|
| 431 |
+
#### Train:
|
| 432 |
+
|
| 433 |
+
```
|
| 434 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/retinanet/retinanet_mobilemamba_b1_fpn_1x_coco.py 4
|
| 435 |
+
```
|
| 436 |
+
|
| 437 |
+
#### Test:
|
| 438 |
+
|
| 439 |
+
```
|
| 440 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_test.sh configs/retinanet/retinanet_mobilemamba_b1_fpn_1x_coco.py ../../weights/downstream/det/retinanet.pth 4
|
| 441 |
+
```
|
| 442 |
+
</details>
|
| 443 |
+
|
| 444 |
+
<details>
|
| 445 |
+
<summary>
|
| 446 |
+
SSDLite
|
| 447 |
+
</summary>
|
| 448 |
+
|
| 449 |
+
#### Train with 320 x 320 resolution:
|
| 450 |
+
|
| 451 |
+
```
|
| 452 |
+
./tools/dist_train.sh configs/ssd/ssdlite_mobilemamba_b1_8gpu_2lr_coco.py 8
|
| 453 |
+
```
|
| 454 |
+
|
| 455 |
+
#### Test with 320 x 320 resolution:
|
| 456 |
+
|
| 457 |
+
```
|
| 458 |
+
./tools/dist_test.sh configs/ssd/ssdlite_mobilemamba_b1_8gpu_2lr_coco.py ../../weights/downstream/det/ssdlite.pth 8
|
| 459 |
+
```
|
| 460 |
+
|
| 461 |
+
#### Train with 512 x 512 resolution:
|
| 462 |
+
```
|
| 463 |
+
./tools/dist_train.sh configs/ssd/ssdlite_mobilemamba_b1_8gpu_2lr_512_coco.py 8
|
| 464 |
+
```
|
| 465 |
+
|
| 466 |
+
#### Test with 512 x 512 resolution:
|
| 467 |
+
|
| 468 |
+
```
|
| 469 |
+
./tools/dist_test.sh configs/ssd/ssdlite_mobilemamba_b1_8gpu_2lr_512_coco.py ../../weights/downstream/det/ssdlite_512.pth 8
|
| 470 |
+
```
|
| 471 |
+
</details>
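
In the detection commands above, the trailing number is the GPU count and has to agree with the ids listed in `CUDA_VISIBLE_DEVICES`. The sketch below derives both from a single device list so they cannot drift apart. `dist_command` is a hypothetical helper. The SSDLite config names contain `8gpu`, which suggests (an inference, not a documented fact) that their settings were tuned for 8 GPUs.

```python
from typing import Optional, Sequence


def dist_command(script: str, config: str, devices: Sequence[int],
                 checkpoint: Optional[str] = None) -> str:
    """Build a dist_train.sh / dist_test.sh call whose GPU count matches the device list."""
    if not devices:
        raise ValueError("at least one GPU id is required")
    visible = ",".join(str(d) for d in devices)
    parts = [f"CUDA_VISIBLE_DEVICES={visible}", f"./tools/{script}", config]
    if checkpoint is not None:
        # dist_test.sh takes the checkpoint path before the GPU count.
        parts.append(checkpoint)
    parts.append(str(len(devices)))
    return " ".join(parts)


print(dist_command(
    "dist_train.sh",
    "configs/mask_rcnn/mask-rcnn_mobilemamba_b1_fpn_1x_coco.py",
    [0, 1, 2, 3],
))
```

Passing `checkpoint="../../weights/downstream/det/maskrcnn.pth"` together with `"dist_test.sh"` reproduces the Mask-RCNN test command.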
|
| 472 |
+
|
| 473 |
+
|
| 474 |
+
## Semantic Segmentation
|
| 475 |
+
<details>
|
| 476 |
+
<summary>
|
| 477 |
+
DeepLabV3
|
| 478 |
+
</summary>
|
| 479 |
+
|
| 480 |
+
#### Train:
|
| 481 |
+
|
| 482 |
+
```
|
| 483 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/deeplabv3/deeplabv3_mobilemamba_b4-80k_ade20k-512x512.py 4
|
| 484 |
+
```
|
| 485 |
+
|
| 486 |
+
#### Test:
|
| 487 |
+
|
| 488 |
+
```
|
| 489 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_test.sh configs/deeplabv3/deeplabv3_mobilemamba_b4-80k_ade20k-512x512.py ../../weights/downstream/seg/deeplabv3.pth 4
|
| 490 |
+
```
|
| 491 |
+
</details>
|
| 492 |
+
|
| 493 |
+
<details>
|
| 494 |
+
<summary>
|
| 495 |
+
Semantic FPN
|
| 496 |
+
</summary>
|
| 497 |
+
|
| 498 |
+
#### Train:
|
| 499 |
+
|
| 500 |
+
```
|
| 501 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/sem_fpn/fpn_mobilemamba_b4-160k_ade20k-512x512.py 4
|
| 502 |
+
```
|
| 503 |
+
|
| 504 |
+
#### Test:
|
| 505 |
+
|
| 506 |
+
```
|
| 507 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_test.sh configs/sem_fpn/fpn_mobilemamba_b4-160k_ade20k-512x512.py ../../weights/downstream/seg/fpn.pth 4
|
| 508 |
+
```
|
| 509 |
+
</details>
|
| 510 |
+
|
| 511 |
+
<details>
|
| 512 |
+
<summary>
|
| 513 |
+
PSPNet
|
| 514 |
+
</summary>
|
| 515 |
+
|
| 516 |
+
#### Train:
|
| 517 |
+
|
| 518 |
+
```
|
| 519 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/pspnet/pspnet_mobilemamba_b4-80k_ade20k-512x512.py 4
|
| 520 |
+
```
|
| 521 |
+
|
| 522 |
+
#### Test:
|
| 523 |
+
|
| 524 |
+
```
|
| 525 |
+
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_test.sh configs/pspnet/pspnet_mobilemamba_b4-80k_ade20k-512x512.py ../../weights/downstream/seg/pspnet.pth 4
|
| 526 |
+
```
|
| 527 |
+
</details>
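
The segmentation config names appear to follow the MMSegmentation naming convention, in which `80k` or `160k` is the iteration schedule and `512x512` the crop size. Under that assumption, the sketch below splits a config file name into labelled parts. `parse_seg_config` and the field names are illustrative.

```python
import re
from pathlib import PurePosixPath

NAME_RE = re.compile(
    r"(?P<head>[a-z0-9]+)_mobilemamba_(?P<backbone>[a-z0-9]+)"
    r"-(?P<iters_k>\d+)k_(?P<dataset>[a-z0-9]+)-(?P<h>\d+)x(?P<w>\d+)\.py"
)


def parse_seg_config(path: str) -> dict:
    """Split a segmentation config file name into its labelled parts."""
    name = PurePosixPath(path).name
    match = NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected config name: {name}")
    return {
        "head": match["head"],
        "backbone": match["backbone"],
        "iterations": int(match["iters_k"]) * 1000,
        "dataset": match["dataset"],
        "crop_size": (int(match["h"]), int(match["w"])),
    }


print(parse_seg_config("configs/sem_fpn/fpn_mobilemamba_b4-160k_ade20k-512x512.py"))
```

Read this way, the Semantic FPN config is scheduled for 160k iterations while the DeepLabV3 and PSPNet configs are scheduled for 80k.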
|
| 528 |
+
|
| 529 |
+
|
| 530 |
+
# Citation
|
| 531 |
+
If our work is helpful for your research, please consider citing:
|
| 532 |
+
```bibtex
|
| 533 |
+
@article{mobilemamba,
|
| 534 |
+
title={MobileMamba: Lightweight Multi-Receptive Visual Mamba Network},
|
| 535 |
+
author={Haoyang He and Jiangning Zhang and Yuxuan Cai and Hongxu Chen and Xiaobin Hu and Zhenye Gan and Yabiao Wang and Chengjie Wang and Yunsheng Wu and Lei Xie},
|
| 536 |
+
journal={arXiv preprint arXiv:2411.15941},
|
| 537 |
+
year={2024}
|
| 538 |
+
}
|
| 539 |
+
```
|
| 540 |
+
|
| 541 |
+
# Acknowledgements
|
| 542 |
+
We thank the following repositories, among others, for supporting our research:
|
| 543 |
+
- [EMO](https://github.com/zhangzjn/EMO)
|
| 544 |
+
- [EfficientViT](https://github.com/microsoft/Cream/tree/main/EfficientViT)
|
| 545 |
+
- [VMamba](https://github.com/MzeroMiko/VMamba)
|
| 546 |
+
- [TIMM](https://github.com/rwightman/pytorch-image-models)
|
| 547 |
+
- [MMDetection](https://github.com/open-mmlab/mmdetection)
|
| 548 |
+
- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation)
|
| 549 |
+
|
| 550 |
+
|