Fix AIMv2 checkpoint/config compatibility with custom HF code
This PR makes the uploaded checkpoint consistent with the custom `modeling_aimv2.py` implementation shipped in the repository.
- remap tensor names in `model.safetensors` to match `AIMv2Model.state_dict()`
- update `config.json` so `architectures` and `model_type` align with the shipped config/model classes
Without these changes, loading via `AutoModel.from_pretrained(..., trust_remote_code=True)` initializes most model weights from scratch instead of loading the published checkpoint.
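The key remapping described above can be sketched as follows. This is a minimal illustration, not the script used in this PR: the prefix rule and tensor names are hypothetical placeholders, and plain strings stand in for the tensors a real checkpoint would hold.

```python
def remap_keys(state, prefix_rules):
    """Return a new dict with each key renamed by the first matching prefix rule."""
    remapped = {}
    for name, tensor in state.items():
        new_name = name
        for old_prefix, new_prefix in prefix_rules:
            if name.startswith(old_prefix):
                new_name = new_prefix + name[len(old_prefix):]
                break
        if new_name in remapped:
            raise ValueError(f"two tensors map to the same name: {new_name}")
        remapped[new_name] = tensor
    return remapped


def key_mismatch(checkpoint_keys, model_keys):
    """Return (missing, unexpected): keys the model wants but the checkpoint
    lacks, and keys the checkpoint has but the model does not define."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return sorted(model - ckpt), sorted(ckpt - model)


# Hypothetical names, for illustration only.
old_checkpoint = {
    "old_prefix.layers.0.weight": "tensor-a",
    "old_prefix.layers.0.bias": "tensor-b",
}
expected_keys = ["new_prefix.layers.0.weight", "new_prefix.layers.0.bias"]
rules = [("old_prefix.", "new_prefix.")]

# Before remapping, every expected key is missing, so a loader would fall
# back to freshly initialized weights for those parameters.
missing_before, unexpected_before = key_mismatch(old_checkpoint, expected_keys)

fixed = remap_keys(old_checkpoint, rules)
missing_after, unexpected_after = key_mismatch(fixed, expected_keys)
```

With real files, the two dicts would likely come from `safetensors.torch.load_file("model.safetensors")` and `model.state_dict()`, and the result would be written back with `safetensors.torch.save_file`. Passing `output_loading_info=True` to `from_pretrained` is another way to inspect missing and unexpected keys after loading.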
- config.json +2 -2
- model.safetensors +2 -2
config.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "architectures": [
-    "
+    "AIMv2Model"
   ],
   "attention_dropout": 0.0,
   "auto_map": {
@@ -15,7 +15,7 @@
   "intermediate_size": 2816,
   "is_native": false,
   "mlp_bias": false,
-  "model_type": "
+  "model_type": "aimv2",
   "num_attention_heads": 8,
   "num_channels": 3,
   "num_hidden_layers": 24,
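A quick sanity check on the two changed fields might look like the sketch below. The helper name `check_config` is hypothetical, and the sample JSON is a trimmed stand-in for the real `config.json`, containing only fields visible in the diff.

```python
import json


def check_config(config_text, expected_arch, expected_model_type):
    """Return a list of human-readable problems; an empty list means the fields match."""
    config = json.loads(config_text)
    problems = []
    if expected_arch not in config.get("architectures", []):
        problems.append(f"architectures does not list {expected_arch!r}")
    if config.get("model_type") != expected_model_type:
        problems.append(f"model_type is not {expected_model_type!r}")
    return problems


# Trimmed stand-in for the updated config.json.
sample = '{"architectures": ["AIMv2Model"], "model_type": "aimv2", "num_hidden_layers": 24}'
problems = check_config(sample, "AIMv2Model", "aimv2")

# A config with neither field set correctly reports both problems.
bad_problems = check_config('{"architectures": [], "model_type": "other"}', "AIMv2Model", "aimv2")
```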
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:2d467ee5b52efcfed8bc18fe10e63879d049a03ed9adbd2529a4657f32355ff2
+size 1236809360
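Because `model.safetensors` is stored through Git LFS, the diff only shows the pointer file. A downloaded copy can be checked against that pointer roughly as sketched below. The helper names are hypothetical, and the demo hashes a small in-memory payload rather than the real 1.2 GB file, which would be better hashed in chunks.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Check that the bytes have the size and digest recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# The pointer added by this PR, copied from the diff above.
pr_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:2d467ee5b52efcfed8bc18fe10e63879d049a03ed9adbd2529a4657f32355ff2\n"
    "size 1236809360\n"
)

# Small stand-in payload to exercise the check without the real file.
payload = b"example weights"
demo_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
ok = matches_pointer(payload, demo_pointer)
```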