MobileCLIP2
MobileCLIP2: Mobile-friendly image-text models with SOTA zero-shot capabilities trained on DFNDR-2B
- Paper • 2508.20691 • Published • 7
apple/MobileCLIP2-S0
Updated • 160 • 47
apple/MobileCLIP2-S2
Updated • 83 • 16
apple/MobileCLIP2-B
Updated • 65 • 3
apple/MobileCLIP2-S3
Updated • 56 • 5
apple/MobileCLIP2-S4
Updated • 76 • 14
apple/MobileCLIP2-L-14
Updated • 46 • 4
Note: Timm ViT-L/14 architecture trained on DFNDR-2B (the MobileCLIP2 dataset).
apple/MobileCLIP-S3
Updated • 70 • 5
Note: New architecture introduced in the MobileCLIP2 paper but pretrained on DataCompDR (the MobileCLIP v1 dataset).
apple/MobileCLIP-S4
Updated • 88 • 9
Note: New architecture introduced in the MobileCLIP2 paper but pretrained on DataCompDR (the MobileCLIP v1 dataset).
apple/MobileCLIP-L-14
Updated • 35 • 1
Note: Timm ViT-L/14 architecture pretrained on DataCompDR (the MobileCLIP v1 dataset).
timm/MobileCLIP2-S0-OpenCLIP
Zero-Shot Image Classification • Updated • 4.61k • 1
Note: 👇 Timm checkpoints
timm/MobileCLIP2-S2-OpenCLIP
Zero-Shot Image Classification • Updated • 7.96k • 5
timm/MobileCLIP2-B-OpenCLIP
Zero-Shot Image Classification • Updated • 2.69k • 1
timm/MobileCLIP2-S3-OpenCLIP
Zero-Shot Image Classification • Updated • 3.75k • 3
timm/MobileCLIP2-S4-OpenCLIP
Zero-Shot Image Classification • Updated • 702 • 2
timm/MobileCLIP2-L-14-OpenCLIP
Zero-Shot Image Classification • Updated • 182 • 2
apple/mobileclip2_coca_dfn2b_s13b_mscoco38k_s12m_context77
Updated • 35 • 1
Note: 👇 MobileCLIP2 CoCa models for synthetic caption generation, used to train MobileCLIP2 models
apple/mobileclip2_coca_dfn2b_s13b_gbc1m-short_context77
Updated • 28 • 1
apple/mobileclip2_coca_dfn2b_s13b_docci_s12m_context77
Updated • 31 • 1
apple/mobileclip2_coca_dfn2b_s13b_dci-short_s12m_context77
Updated • 32 • 1
apple/mobileclip2_coca_dfn2b_s13b_dci-extended_s12m_context77
Updated • 36 • 1
apple/mobileclip2_coca_dfn2b_s13b_dci-complete_s12m_context77
Updated • 31 • 1
apple/mobileclip2_coca_dfn2b_s13b_recap-coco-30k_s12m_context77
Updated • 29 • 1
apple/mobileclip2_coca_dfn2b_s13b_docci_s12m_context256
Updated • 28 • 1
Note: 👇 MobileCLIP2 CoCa models (context length = 256). These have a higher chance of generating repeated output.
apple/mobileclip2_coca_dfn2b_s13b_dci-complete_s12m_context256
Updated • 34 • 1
apple/mobileclip2_coca_dfn2b_s13b_dci-extended_s12m_context256
Updated • 45 • 1
apple/mobileclip2_coca_dfn2b_s13b_context77
Updated • 41 • 1
Note: MobileCLIP2 CoCa base model. It can be used for fine-tuning new CoCa models on high-quality datasets.
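The CoCa checkpoints above come in two context lengths, and the repository names appear to carry that value as a trailing `contextNN` suffix. The following is a minimal sketch, assuming that suffix convention holds, of how the listed repository IDs could be grouped by context length before choosing a captioner. The helper names `context_length` and `group_by_context` are hypothetical and not part of any MobileCLIP2 or Hugging Face API.

```python
import re
from collections import defaultdict

# CoCa repository IDs copied from the listing above.
COCA_REPOS = [
    "apple/mobileclip2_coca_dfn2b_s13b_mscoco38k_s12m_context77",
    "apple/mobileclip2_coca_dfn2b_s13b_gbc1m-short_context77",
    "apple/mobileclip2_coca_dfn2b_s13b_docci_s12m_context77",
    "apple/mobileclip2_coca_dfn2b_s13b_dci-short_s12m_context77",
    "apple/mobileclip2_coca_dfn2b_s13b_dci-extended_s12m_context77",
    "apple/mobileclip2_coca_dfn2b_s13b_dci-complete_s12m_context77",
    "apple/mobileclip2_coca_dfn2b_s13b_recap-coco-30k_s12m_context77",
    "apple/mobileclip2_coca_dfn2b_s13b_docci_s12m_context256",
    "apple/mobileclip2_coca_dfn2b_s13b_dci-complete_s12m_context256",
    "apple/mobileclip2_coca_dfn2b_s13b_dci-extended_s12m_context256",
    "apple/mobileclip2_coca_dfn2b_s13b_context77",
]


def context_length(repo_id):
    """Return the integer after a trailing '_context' suffix, or None.

    Assumption: the suffix encodes the text context length, as the
    notes in the listing suggest. This is not an official parser.
    """
    match = re.search(r"_context(\d+)$", repo_id)
    return int(match.group(1)) if match else None


def group_by_context(repo_ids):
    """Group repository IDs by parsed context length, skipping non-matches."""
    groups = defaultdict(list)
    for repo_id in repo_ids:
        length = context_length(repo_id)
        if length is not None:
            groups[length].append(repo_id)
    return dict(groups)


if __name__ == "__main__":
    for length, repos in sorted(group_by_context(COCA_REPOS).items()):
        print(f"context length {length}: {len(repos)} checkpoints")
```

Given the note that the context-256 models have a higher chance of generating repeated output, a grouping like this makes it easy to default to the context-77 checkpoints and opt into the longer ones deliberately.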
apple/DFNDR-12M
Viewer • Updated • 12.8M • 1.73k • 5
Note: 👇 DFNDR: MobileCLIP2 pretraining datasets
apple/DFNDR-12M-bf16
Viewer • Updated • 12.8M • 237 • 2
apple/DFNDR-2B
Updated • 361 • 3