nielsr HF Staff committed on
Commit 3a7e5df · verified · 1 Parent(s): eba5ef4

Add pipeline tag, license and model checkpoints


This PR adds `pipeline_tag: image-classification` and `library_name: pytorch` to the metadata so the model shows up in the image-classification category on the Hub.
It also adds the `license` based on the information in the GitHub README.
The model checkpoints are also linked from the model card to make them more easily accessible.

Files changed (1)
  1. README.md +62 -2
README.md CHANGED
@@ -1,9 +1,12 @@
 ---
+datasets:
+- imagenet-1k
 tags:
 - mae
 - crossmae
-datasets:
-- imagenet-1k
+pipeline_tag: image-classification
+library_name: pytorch
+license: cc-by-nc-4.0
 ---
 
 ## CrossMAE: Rethinking Patch Dependence for Masked Autoencoders
@@ -19,3 +22,60 @@ by <a href="https://max-fu.github.io">Letian Fu*</a>, <a href="https://tonylian.
 This repo has the models for [CrossMAE: Rethinking Patch Dependence for Masked Autoencoders](https://arxiv.org/abs/2401.14391).
 
 Please take a look at the [GitHub repo](https://github.com/TonyLianLong/CrossMAE) to see instructions on pretraining, fine-tuning, and evaluation with these models.
+
+<table><tbody>
+<!-- START TABLE -->
+<!-- TABLE HEADER -->
+<th valign="bottom"></th>
+<th valign="bottom">ViT-Small</th>
+<th valign="bottom">ViT-Base</th>
+<th valign="bottom">ViT-Base<sub>448</sub></th>
+<th valign="bottom">ViT-Large</th>
+<th valign="bottom">ViT-Huge</th>
+<!-- TABLE BODY -->
+<tr><td align="left">pretrained checkpoint</td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vits-mr0.75-kmr0.75-dd12/imagenet-mae-cross-vits-pretrain-wfm-mr0.75-kmr0.75-dd12-ep800-ui.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vitb-mr0.75-kmr0.75-dd12/imagenet-mae-cross-vitb-pretrain-wfm-mr0.75-kmr0.75-dd12-ep800-ui.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vitb-mr0.75-kmr0.75-dd12-448-400/imagenet-mae-cross-vitb-pretrain-wfm-mr0.75-kmr0.25-dd12-ep400-ui-res-448.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vitl-mr0.75-kmr0.75-dd12/imagenet-mae-cross-vitl-pretrain-wfm-mr0.75-kmr0.75-dd12-ep800-ui.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vith-mr0.75-kmr0.25-dd12/imagenet-mae-cross-vith-pretrain-wfm-mr0.75-kmr0.25-dd12-ep800-ui.pth?download=true'>download</a></td>
+</tr>
+<tr><td align="left">fine-tuned checkpoint</td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vits-mr0.75-kmr0.75-dd12/imagenet-mae-cross-vits-finetune-wfm-mr0.75-kmr0.75-dd12-ep800-ui.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vitb-mr0.75-kmr0.75-dd12/imagenet-mae-cross-vitb-finetune-wfm-mr0.75-kmr0.75-dd12-ep800-ui.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vitb-mr0.75-kmr0.75-dd12-448-400/imagenet-mae-cross-vitb-finetune-wfm-mr0.75-kmr0.25-dd12-ep400-ui-res-448.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vitl-mr0.75-kmr0.75-dd12/imagenet-mae-cross-vitl-finetune-wfm-mr0.75-kmr0.75-dd12-ep800-ui.pth?download=true'>download</a></td>
+<td align="center"><a href='https://huggingface.co/longlian/CrossMAE/resolve/main/vith-mr0.75-kmr0.25-dd12/imagenet-mae-cross-vith-finetune-wfm-mr0.75-kmr0.25-dd12-ep800-ui.pth?download=true'>download</a></td>
+</tr>
+<tr><td align="left"><b>Reference ImageNet accuracy (ours)</b></td>
+<td align="center"><b>79.318</b></td>
+<td align="center"><b>83.722</b></td>
+<td align="center"><b>84.598</b></td>
+<td align="center"><b>85.432</b></td>
+<td align="center"><b>86.256</b></td>
+</tr>
+<tr><td align="left">MAE ImageNet accuracy (baseline)</td>
+<td align="center"></td>
+<td align="center"></td>
+<td align="center">84.8</td>
+<td align="center"></td>
+<td align="center">85.9</td>
+</tr>
+</tbody></table>
+
+## Citation
+Please give us a star 🌟 on Github to support us!
+
+Please cite our work if you find our work inspiring or use our code in your work:
+```
+@article{
+fu2025rethinking,
+title={Rethinking Patch Dependence for Masked Autoencoders},
+author={Letian Fu and Long Lian and Renhao Wang and Baifeng Shi and XuDong Wang and Adam Yala and Trevor Darrell and Alexei A Efros and Ken Goldberg},
+journal={Transactions on Machine Learning Research},
+issn={2835-8856},
+year={2025},
+url={https://openreview.net/forum?id=JT2KMuo2BV},
+note={}
+}
+```
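
As a convenience for readers of this diff, here is a minimal sketch of fetching one of the fine-tuned checkpoints linked in the table above. The helper function name and the `"model"`-key convention are assumptions of mine, not part of the CrossMAE repo; the canonical loading code lives in the GitHub repo linked in the model card. The URL pattern is copied from the table's `resolve/main` links.

```python
# Hypothetical helper: build the direct-download URL for a checkpoint file
# hosted in the longlian/CrossMAE repo on the Hugging Face Hub.
def crossmae_checkpoint_url(subdir: str, filename: str) -> str:
    """Return the `resolve/main` download URL for a checkpoint file."""
    base = "https://huggingface.co/longlian/CrossMAE/resolve/main"
    return f"{base}/{subdir}/{filename}?download=true"

# Example: the fine-tuned ViT-Base checkpoint from the table.
url = crossmae_checkpoint_url(
    "vitb-mr0.75-kmr0.75-dd12",
    "imagenet-mae-cross-vitb-finetune-wfm-mr0.75-kmr0.75-dd12-ep800-ui.pth",
)
print(url)

# Loading the weights (needs network access and torch installed), sketched:
#   import torch, urllib.request
#   urllib.request.urlretrieve(url, "vitb_finetune.pth")
#   ckpt = torch.load("vitb_finetune.pth", map_location="cpu")
#   state_dict = ckpt.get("model", ckpt)  # assumption: MAE-style checkpoints
#                                         # often nest weights under "model"
```

The download itself is left as comments so the sketch runs offline; whether the state dict is nested under a `"model"` key should be checked against the actual file.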