PyTorch
LIU1712 committed · Commit b1274c7 · verified · 1 Parent(s): 25f6753

Upload 5 files

Pretrain_1K_S.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:500747ef072bc15c441c4db4c14d49d1611165707e79a667bd836a685a899772
+ size 307815239
Pretrain_mini_B.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e717a01a944f19cc8689f4e87f2d2fcd11d0bfa06f0c368fe0d1d0ea906926a8
+ size 1343778951
Pretrain_mini_L.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4cc671e9a67cfe10301681698912fd245f1913845780cab4944405333e9bf486
+ size 3955103761
Pretrain_mini_S.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e940d04dd3e1b8c69a4b8550f49e4fa4b66e6530636ad557d6ca7a0ed86ce252
+ size 307815239
README.md ADDED
@@ -0,0 +1,132 @@
+ ---
+ license: apache-2.0
+ tags:
+ - masked-image-modeling
+ - self-supervised-learning
+ - visual-representation-learning
+ - vision-transformer
+ - mae
+ - pytorch
+ ---
+
+ <a id="top"></a>
+ <div align="center">
+ <h1>CurMIM: Curriculum Masked Image Modeling</h1>
+
+ <p>
+ <b>Hao Liu</b><sup>1</sup>&nbsp;
+ <b>Kun Wang</b><sup>1</sup>&nbsp;
+ <b>Yudong Han</b><sup>1</sup>&nbsp;
+ <b>Haocong Wang</b><sup>1</sup>&nbsp;
+ <b>Yupeng Hu</b><sup>1</sup>&nbsp;
+ <b>Chunxiao Wang</b><sup>2</sup>&nbsp;
+ <b>Liqiang Nie</b><sup>3</sup>
+ </p>
+
+ <p>
+ <sup>1</sup>School of Software, Shandong University, Jinan, China<br>
+ <sup>2</sup>Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China<br>
+ <sup>3</sup>School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
+ </p>
+ </div>
+
+ This is the official PyTorch implementation of **CurMIM**, a curriculum-based masked image modeling framework for self-supervised visual representation learning.
+
+ 🔗 **Paper:** [CurMIM: Curriculum Masked Image Modeling](https://ieeexplore.ieee.org/document/10890877)
+ 🔗 **GitHub Repository:** [iLearn-Lab/ICASSP25-CurMIM](https://github.com/iLearn-Lab/ICASSP25-CurMIM)
+
+ ---
+
+ ## Model Information
+
+ ### 1. Model Name
+ **CurMIM** (**Cur**riculum **M**asked **I**mage **M**odeling).
+
+ ### 2. Task Type & Applicable Tasks
+ - **Task Type:** Masked Image Modeling (MIM) / Self-Supervised Visual Representation Learning / Vision Transformer Pretraining
+ - **Applicable Tasks:** Curriculum-based masked image pretraining, visual representation learning, finetuning, and linear probing for image classification.
+
+ ### 3. Project Introduction
+ Masked Image Modeling (MIM) usually adopts a fixed masking strategy during pretraining. **CurMIM** introduces a curriculum-style masking strategy that progressively adjusts masking behavior, enabling the model to learn from easier to harder reconstruction targets and thereby improving representation quality.
+
+ The repository provides a complete workflow for **pretraining**, **finetuning**, and **linear probing**, together with utilities for distributed training and experiment management.
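
To make the idea of an easy-to-hard masking curriculum concrete, here is a minimal sketch. It is not the schedule used in the paper: it assumes that difficulty is controlled only by the mask ratio, which is ramped linearly over epochs. The function names `curriculum_mask_ratio` and `random_patch_mask`, the start ratio of 0.5, and the end ratio of 0.75 are all illustrative assumptions.

```python
import random


def curriculum_mask_ratio(epoch, total_epochs, start=0.5, end=0.75):
    """Linearly ramp the mask ratio from `start` to `end` (hypothetical schedule)."""
    if total_epochs <= 1:
        return end
    # Clamp progress to [0, 1] so out-of-range epochs stay within the schedule.
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * progress


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a 0/1 list where 1 marks a masked patch (uniform random masking)."""
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [1 if i in masked else 0 for i in range(num_patches)]


rng = random.Random(0)
for epoch in (0, 150, 299):
    ratio = curriculum_mask_ratio(epoch, total_epochs=300)
    # 196 patches corresponds to a 224 x 224 image split into 16 x 16 patches.
    mask = random_patch_mask(196, ratio, rng)
    print(f"epoch={epoch} mask_ratio={ratio:.3f} masked_patches={sum(mask)}")
```

Early epochs hide fewer patches, so reconstruction is easier; later epochs approach the fixed 0.75 ratio common in MAE-style pretraining. The actual CurMIM curriculum may adjust other aspects of masking, so consult the paper and repository for the real strategy.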
+
+ ### 4. Training Data Source
+ The model follows the dataset preparation protocol of [MAE](https://github.com/facebookresearch/mae) and is mainly designed for:
+ - **ImageNet**
+ - **miniImageNet**
+
+ ---
+
+ ## Usage & Basic Inference
+
+ This codebase provides scripts for curriculum-based MIM pretraining, finetuning, and linear probing.
+
+ ### Step 1: Prepare the Environment
+ Clone the GitHub repository and install dependencies:
+ ```bash
+ git clone https://github.com/iLearn-Lab/ICASSP25-CurMIM.git
+ cd ICASSP25-CurMIM
+ python -m venv .venv
+ source .venv/bin/activate # Linux / macOS
+ # .venv\Scripts\activate # Windows
+ pip install torch torchvision timm==0.3.2 tensorboard
+ ```
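+
+ ### Step 2: Download Model Weights & Data
+ Download the pretrained checkpoints from this model repository (`Pretrain_1K_S.pth`, `Pretrain_mini_S.pth`, `Pretrain_mini_B.pth`, `Pretrain_mini_L.pth`), then follow [MAE](https://github.com/facebookresearch/mae)'s dataset preparation for [ImageNet](https://www.image-net.org/).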
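
After downloading a checkpoint, it can be worth checking that the file matches the Git LFS pointer recorded in this commit. The sketch below uses only the Python standard library. The helper names `sha256_of_file` and `matches_lfs_pointer` and the local path are assumptions; the expected digest and size are the `oid sha256:` and `size` values listed for `Pretrain_mini_S.pth` above.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_pointer(path, expected_oid, expected_size):
    """Compare a local file with the `oid sha256:` and `size` fields of an LFS pointer."""
    # The size check is cheap, so do it before hashing the whole file.
    if os.path.getsize(path) != expected_size:
        return False
    return sha256_of_file(path) == expected_oid


if __name__ == "__main__":
    checkpoint = "Pretrain_mini_S.pth"  # assumed local download location
    if os.path.exists(checkpoint):
        ok = matches_lfs_pointer(
            checkpoint,
            "e940d04dd3e1b8c69a4b8550f49e4fa4b66e6530636ad557d6ca7a0ed86ce252",
            307815239,
        )
        print("checksum ok" if ok else "checksum mismatch")
    else:
        print(f"{checkpoint} not found, nothing to verify")
```

A mismatch usually points to an interrupted download, or to having fetched the small LFS pointer file instead of the actual weights.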
+
+
+ ### Step 3: Run Pretraining / Finetuning
+ To pretrain the model, run:
+ ```bash
+ python -m torch.distributed.launch --nproc_per_node {GPU_number} ./main_pretrain.py --batch_size 128 \
+ --accum_iter 2 \
+ --model {model_type} \
+ --mask_ratio 0.75 --epochs 300 --warmup_epochs 40 \
+ --blr 4e-4 --weight_decay 0.05 \
+ --data_path ../path --output_dir ./output_dir/
+ ```
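
The `--blr` flag is a base learning rate, not the rate the optimizer actually uses. In the MAE codebase, whose flags this command mirrors, the absolute learning rate is `blr * effective_batch_size / 256`, where the effective batch size is `batch_size * accum_iter * number_of_GPUs`. Assuming CurMIM keeps this rule unchanged (an assumption, so check `main_pretrain.py`), the sketch below works through the arithmetic for the command above. The GPU count of 8 and both function names are illustrative.

```python
def effective_batch_size(batch_size, accum_iter, world_size):
    """Samples contributing to one optimizer step across all GPUs."""
    return batch_size * accum_iter * world_size


def absolute_lr(blr, eff_batch_size, reference_batch=256):
    """Linear scaling rule as used in MAE: lr = blr * eff_batch_size / 256."""
    return blr * eff_batch_size / reference_batch


# Values from the pretraining command; world_size=8 is an assumed GPU count.
eff = effective_batch_size(batch_size=128, accum_iter=2, world_size=8)
lr = absolute_lr(blr=4e-4, eff_batch_size=eff)
print(f"effective batch size: {eff}")
print(f"absolute learning rate: {lr:.2e}")
```

Under this rule, halving the number of GPUs while doubling `--accum_iter` leaves both the effective batch size and the absolute learning rate unchanged.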
+
+ To finetune the model, run:
+ ```bash
+ python -m torch.distributed.launch --nproc_per_node={GPU_number} ./main_finetune.py \
+ --batch_size 128 \
+ --nb_classes {nb_classes} \
+ --model {model_type} \
+ --finetune ./checkpoint.pth \
+ --epochs 100 \
+ --blr 1e-3 --layer_decay 0.65 --output_dir ./finetune \
+ --weight_decay 0.05 --drop_path 0.1 --mixup 0.8 --cutmix 1.0 --reprob 0.25 \
+ --dist_eval --data_path ../data/
+ ```
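
The `--layer_decay 0.65` flag suggests layer-wise learning-rate decay, in which earlier transformer blocks are updated with smaller learning rates than later ones. The sketch below assumes the finetuning script follows the scheme in MAE's finetuning code, where the embedding layers get the smallest multiplier and the classification head gets 1.0. The function name `layer_lr_scales` and the choice of 12 blocks (a ViT-Base-sized encoder) are illustrative assumptions.

```python
def layer_lr_scales(num_blocks, layer_decay):
    """Per-layer learning-rate multipliers under layer-wise decay.

    Index 0 is the patch/position embedding, indices 1..num_blocks are the
    transformer blocks, and the last index is the head (multiplier 1.0).
    """
    num_layers = num_blocks + 1
    return [layer_decay ** (num_layers - i) for i in range(num_layers + 1)]


scales = layer_lr_scales(num_blocks=12, layer_decay=0.65)
for layer_id, scale in enumerate(scales):
    print(f"layer {layer_id:2d}: lr multiplier {scale:.4f}")
```

Each multiplier is applied on top of the absolute learning rate, so a decay of 0.65 keeps the pretrained early layers close to their initial weights while the head adapts quickly.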
+
+ ---
+
+ ## Limitations & Notes
+
+ **Disclaimer:** This repository is intended for **academic research purposes only**.
+ - The model requires access to the original datasets for pretraining and downstream evaluation.
+ - Training performance may vary depending on model size, masking ratio, and distributed training configuration.
+ - Users should prepare the dataset following the MAE protocol before reproduction.
+
+ ---
+
+ ## Citation
+
+ If you find our work useful in your research, please consider citing our paper:
+
+ ```bibtex
+ @inproceedings{liu2025curmim,
+ title={CurMIM: Curriculum Masked Image Modeling},
+ author={Liu, Hao and Wang, Kun and Han, Yudong and Wang, Haocong and Hu, Yupeng and Wang, Chunxiao and Nie, Liqiang},
+ booktitle={2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+ pages={1--5},
+ year={2025},
+ doi={10.1109/ICASSP49660.2025.10890877}
+ }
+ ```
+
+ ---
+ ## Contact
+ **If you have any questions, feel free to contact me at liuh90210@gmail.com**.