---
license: apache-2.0
tags:
- image-classification
- vision-transformer
- pytorch
- oxford-pets
library_name: torch
datasets:
- cvdl/oxford-pets
language: []
model-index:
- name: ViTPets
  results:
  - task:
      type: image-classification
    dataset:
      name: Oxford Pets
      type: cvdl/oxford-pets
    metrics:
    - type: accuracy
      value: 9
---

# ViTPets - Vision Transformer trained from scratch on Oxford Pets 🐶🐱

This model is a Vision Transformer (ViT) trained from scratch on the [Oxford Pets dataset](https://huggingface.co/datasets/cvdl/oxford-pets). It classifies images of cats and dogs into 37 different breeds.
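
The dataset can be pulled straight from the Hub; a quick look, assuming the 🤗 `datasets` library:

```python
from datasets import load_dataset

# Loads the Oxford-IIIT Pet images with their 37 breed labels.
ds = load_dataset("cvdl/oxford-pets")
print(ds)  # split names, sizes, and feature schema
```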

## Model Summary

- **Architecture**: Custom Vision Transformer (ViT)
- **Input resolution**: 128x128
- **Patch size**: 16x16 (see the patch-embedding sketch after this list)
- **Embedding dimension**: 240
- **Number of Transformer blocks**: 12
- **Number of attention heads**: 4
- **MLP ratio**: 2.0
- **Dropout**: 10% on attention and MLP
- **Framework**: PyTorch
- **Dataset**: Oxford Pets (via 🤗 `cvdl/oxford-pets`)
- **Loss**: CrossEntropyLoss
- **Optimizer**: SGD with LR = 0.00257
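
With a 128x128 input and 16x16 patches, the encoder sees an 8x8 grid of 64 patch tokens, each projected into the 240-dimensional embedding space. The sketch below shows the usual Conv2d-based patch embedding consistent with those numbers; it is illustrative, and the names need not match those in `model.py`:

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into non-overlapping 16x16 patches and project each to embed_dim."""

    def __init__(self, img_size=(128, 128), patch_size=16, in_channels=3, embed_dim=240):
        super().__init__()
        # Conv with kernel == stride == patch_size: each patch is flattened
        # and linearly projected in a single operation.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        self.n_patches = (img_size[0] // patch_size) * (img_size[1] // patch_size)  # 8 * 8 = 64

    def forward(self, x):
        x = self.proj(x)                     # (B, 240, 8, 8)
        return x.flatten(2).transpose(1, 2)  # (B, 64, 240): one token per patch

tokens = PatchEmbed()(torch.randn(1, 3, 128, 128))
print(tokens.shape)  # torch.Size([1, 64, 240])
```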

## Training Setup

- **Device**: Multi-GPU (4 GPUs)
- **Batch size**: 256 (64 per GPU × 4 GPUs)
- **Early stopping**: patience 50, delta 1e-6 (a sketch of this logic follows the list)
- **Logging**: TensorBoard
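
The early-stopping rule above reads as: track the best validation loss seen so far and stop after 50 epochs without an improvement of at least 1e-6. A minimal sketch of that logic (an illustrative reconstruction, not the repository's actual training script):

```python
class EarlyStopping:
    """Stop when validation loss hasn't improved by at least `delta` for `patience` epochs."""

    def __init__(self, patience=50, delta=1e-6):
        self.patience = patience
        self.delta = delta
        self.best_loss = float("inf")
        self.counter = 0

    def step(self, val_loss):
        if val_loss < self.best_loss - self.delta:
            self.best_loss = val_loss  # meaningful improvement: reset the counter
            self.counter = 0
        else:
            self.counter += 1          # no improvement beyond delta this epoch
        return self.counter >= self.patience  # True -> stop training

# Usage inside a training loop:
# stopper = EarlyStopping()
# for epoch in range(max_epochs):
#     val_loss = validate(model, val_loader)
#     if stopper.step(val_loss):
#         break
```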

## How to Use

```python
import torch

from model import ViT  # the ViT definition shipped with this repository

# Instantiate the architecture with the same hyperparameters used in training.
model = ViT(
    img_size=(128, 128),
    patch_size=16,
    in_channels=3,
    embed_dim=240,
    n_classes=37,
    n_blocks=12,
    n_heads=4,
    mlp_ratio=2.0,
    qkv_bias=True,
    block_drop_p=0.1,
    attn_drop_p=0.1,
)

# Load the trained weights and switch to inference mode.
# map_location lets the checkpoint load on CPU-only machines.
model.load_state_dict(torch.load("ViTPets.pth", map_location="cpu"))
model.eval()
```
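
Once the weights are loaded, inference is standard PyTorch. The preprocessing below is an assumption (resize to the model's 128x128 input and ImageNet-style normalization); verify it against the transforms actually used in training:

```python
import torch
from PIL import Image
from torchvision import transforms

# Assumed preprocessing: resize to the 128x128 input resolution and
# normalize with ImageNet statistics (check the training pipeline).
preprocess = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("my_pet.jpg").convert("RGB")  # any RGB pet photo
batch = preprocess(image).unsqueeze(0)           # (1, 3, 128, 128)

with torch.no_grad():
    logits = model(batch)  # `model` is the ViT loaded above; shape (1, 37)
    probs = logits.softmax(dim=-1)
    pred = probs.argmax(dim=-1).item()

print(f"Predicted breed index: {pred} (p={probs[0, pred]:.3f})")
```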