---
license: apache-2.0
library_name: timm
tags:
- image-classification
- plant-disease
- dinov2
- rocm
- mi300x
- amd
base_model: facebook/dinov2-large
datasets:
- mendeley/crop-pest-and-disease-detection
---

# DINOv2-Large: CCMT Crop & Disease (MI300X fine-tune)

Fine-tuned **DINOv2-Large** (304M params) on the **CCMT crop-pest-and-disease** dataset (22 classes across cashew, cassava, maize, tomato). Trained on a single **AMD Instinct MI300X** using PyTorch + ROCm, as a submission to the lablab.ai AMD hackathon, **Track 2: Fine-Tuning on AMD GPUs**.

## 🌱 Try the live demo

This model is deployed as an interactive Gradio Space: upload a leaf photo to get an instant diagnosis with treatment guidance.

**👉 [https://huggingface.co/spaces/iamcode6/merolav-space](https://huggingface.co/spaces/iamcode6/merolav-space)**

## Results

| Metric                | This model (DINOv2-L / MI300X) | Baseline (EfficientNetB0 / P100) |
|-----------------------|-------------------------------:|---------------------------------:|
| Test accuracy         | 0.9706 (TTA)                   | 0.9316 (TTA)                     |
| Macro F1              | 0.9713                         | 0.9348                           |
| Standard acc (no TTA) | 0.9705                         | n/a                              |

TTA rounds: 10.

## Training

- **Backbone:** DINOv2-L ViT-L/14 (self-supervised, LVD-142M pretrain)
- **Precision:** bf16 (native on MI300X)
- **Schedule:** 2-phase: linear probe → full fine-tune with layer-wise LR decay
- **Optimizer:** AdamW, cosine schedule, grad-clip 1.0
- **Augmentation:** RandAugment + Mixup/CutMix + RandomErasing

See `config.yaml` for the full hyperparameter set.

## Usage

```python
import timm
import torch

# Build the architecture only; the fine-tuned weights come from best.pt.
model = timm.create_model(
    "vit_large_patch14_dinov2.lvd142m",
    pretrained=False,
    num_classes=22,
    img_size=224,
)

# weights_only=False unpickles arbitrary Python objects (the checkpoint also
# stores the training config), so only load checkpoints you trust.
ckpt = torch.load("best.pt", map_location="cpu", weights_only=False)
model.load_state_dict(ckpt["state_dict"])
model.eval()
```

The class index map is embedded in the checkpoint under `cfg`; see the training repo for `splits.json`, which defines the `class_to_idx` mapping.
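
To turn the model's 22 logits into readable labels, the `class_to_idx` mapping has to be inverted. The following is a minimal sketch, assuming `class_to_idx` is a plain dict from class name to integer index; the exact layout of `splits.json` and `cfg` is not documented here, so the helper names (`invert_class_map`, `decode_topk`) and the three sample class names are hypothetical placeholders, not the real label set.

```python
import math


def invert_class_map(class_to_idx):
    """Return an idx -> class-name dict, rejecting duplicate indices."""
    idx_to_class = {idx: name for name, idx in class_to_idx.items()}
    if len(idx_to_class) != len(class_to_idx):
        raise ValueError("class_to_idx maps several classes to the same index")
    return idx_to_class


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_topk(logits, idx_to_class, k=3):
    """Return the k most probable (class_name, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(idx_to_class[i], probs[i]) for i in ranked[:k]]


# Hypothetical three-class excerpt; the real mapping has 22 entries.
example_class_to_idx = {
    "cassava_healthy": 0,
    "maize_leaf_blight": 1,
    "tomato_healthy": 2,
}
# Stand-in for one row of model output converted to a Python list.
example_logits = [0.2, 2.5, -1.0]

idx_to_class = invert_class_map(example_class_to_idx)
top = decode_topk(example_logits, idx_to_class, k=2)
for name, prob in top:
    print(f"{name}: {prob:.3f}")
```

With the real model, the logits would come from a forward pass under `torch.no_grad()`, for example `model(x)[0].tolist()` for a single preprocessed image batch `x`.
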
## Artifacts

- `best.pt`: model weights + training config
- `config.yaml`: hyperparameters used for this run
- `classification_report.txt`: per-class precision / recall / F1
- `confusion_matrix.csv`: 22×22 confusion matrix
- `metrics.json`: standard + TTA scores

## Source

Training code: