---
license: mit
datasets:
- jatinmehra/CSIRO-Cleaned-Data
metrics:
- Weighted R2 Scores
base_model:
- facebook/dinov3-vith16plus-pretrain-lvd1689m
pipeline_tag: image-feature-extraction
library_name: timm
language:
- en
tags:
- agriculture
- ComputerVision
---
# CSIRO Pasture Biomass Prediction (4th Place Solution)

This model card describes the main and auxiliary models from the 4th-place solution to the Kaggle CSIRO Pasture Biomass Prediction competition. The solution combines a ViT-Huge DINOv3 backbone with multi-modal (image + tabular) fusion and an auxiliary pretraining stage.
## Model Description

### Main model (Stage 1)
- Backbone: `vit_huge_plus_patch16_dinov3.lvd1689m` (DINOv3)
- Input: RGB pasture image + tabular features (`Pre_GSHH_NDVI`, `Height_Ave_cm`)
- Output: biomass regression targets
- Loss: weighted Smooth L1 with weights `[0.1, 0.1, 0.1, 0.2, 0.5]`
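The loss above can be sketched as follows. This is a minimal sketch, not the solution's exact code: only the weight vector is given in the card, so the per-target weighting and the sum-then-batch-mean reduction are assumptions.

```python
import torch
import torch.nn.functional as F

# Target weights from the solution: the final target dominates the loss.
TARGET_WEIGHTS = torch.tensor([0.1, 0.1, 0.1, 0.2, 0.5])

def weighted_smooth_l1(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Per-element Smooth L1, weighted per target, averaged over the batch.

    The weighted-sum-then-mean reduction is an assumption.
    """
    per_elem = F.smooth_l1_loss(pred, target, reduction="none")  # (B, 5)
    return (per_elem * TARGET_WEIGHTS).sum(dim=1).mean()

# Example: constant error of 0.5 on every target.
# Smooth L1 (beta=1) gives 0.5 * 0.5**2 = 0.125 per element;
# the weights sum to 1.0, so the loss is 0.125.
pred = torch.zeros(2, 5)
target = torch.full((2, 5), 0.5)
loss = weighted_smooth_l1(pred, target)
```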
### Aux model (Stage 2)
- Input: RGB pasture image only
- Output: predicted tabular features (NDVI, Height)
- Purpose: feature enrichment for the main model
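One plausible wiring for this enrichment, when tabular measurements are unavailable at inference, is to let the aux head predict them from image features and feed the predictions into the main head. This is a hypothetical sketch; the feature dimension and layer shapes are made up, and the real architecture lives in the linked repositories.

```python
import torch

# Hypothetical dimensions: backbone_dim stands in for the DINOv3 feature size.
backbone_dim, n_tabular, n_targets = 1280, 2, 5

aux_head = torch.nn.Linear(backbone_dim, n_tabular)               # image -> (NDVI, Height)
main_head = torch.nn.Linear(backbone_dim + n_tabular, n_targets)  # fused -> biomass targets

feats = torch.randn(4, backbone_dim)      # placeholder image features
tab_pred = aux_head(feats)                # aux model fills in the tabular features
biomass = main_head(torch.cat([feats, tab_pred], dim=1))
```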
## Model Weights
- Main model: https://huggingface.co/jatinmehra/CSIRO-DinoV3-HugePlus-LB76
- Aux model: https://huggingface.co/jatinmehra/CSIRO-AUX_MODEL
## Intended Use
- Research and benchmarking for pasture biomass regression from aerial/ground images.
- Demonstration of multi-modal fusion and auxiliary prediction benefits.
## Out-of-Scope Use
- Medical or safety-critical applications.
- Real-time decision systems without domain validation.
## Training Data
- Kaggle CSIRO Pasture Biomass Prediction dataset.
- Images were manually cropped to remove cardboard background.
## Preprocessing
- Manual cropping to remove cardboard borders.
- Tabular features and regression targets normalized via `StandardScaler`.
- Image size: 800×800.
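The normalization step can be sketched with scikit-learn. The values below are illustrative, not real measurements; the actual columns are `Pre_GSHH_NDVI` and `Height_Ave_cm`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative tabular rows: (Pre_GSHH_NDVI, Height_Ave_cm)
tabular = np.array([[0.62, 12.0],
                    [0.71, 18.0],
                    [0.55,  9.0]])

scaler = StandardScaler().fit(tabular)       # fit on training folds only
tabular_scaled = scaler.transform(tabular)   # zero mean, unit variance per column
```

At inference the same fitted scaler transforms new rows, and targets are inverse-transformed after prediction.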
## Training Configuration

### Stage 1 (Main Model)
- Batch size: 10
- Optimizer: AdamW
- LR: 5e-5
- Scheduler: CosineAnnealingWarmRestarts (T_0=10, T_mult=2, eta_min=1e-6)
- 5-fold CV (Seed 42)
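The optimizer/scheduler pairing above can be reproduced as follows; `model` is a placeholder for the actual fusion network.

```python
import torch

model = torch.nn.Linear(8, 5)  # placeholder for the DINOv3 fusion model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=10, T_mult=2, eta_min=1e-6
)

# Stepped once per epoch: the LR follows a cosine decay toward eta_min and
# restarts at the base LR after 10, then 20, then 40 epochs.
for epoch in range(15):
    scheduler.step()
```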
### Stage 2 (Aux Model)
- Batch size: 8
- Optimizer: AdamW
- LR: 5e-5
- Scheduler: ReduceLROnPlateau (factor=0.5, patience=4)
- 5-fold CV (Seed 44)
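With these settings, `ReduceLROnPlateau` halves the LR once the monitored metric stops improving for more than 4 epochs. A sketch follows; monitoring validation loss with `mode="min"` is an assumption.

```python
import torch

aux_model = torch.nn.Linear(8, 2)  # placeholder for the aux network
optimizer = torch.optim.AdamW(aux_model.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=4
)

# Feed the validation metric each epoch; with a stagnant metric the LR is
# halved on the fifth consecutive non-improving epoch.
for epoch in range(6):
    val_loss = 1.0  # stagnant metric for illustration
    scheduler.step(val_loss)
```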
## Results
| Stage | Public LB | Private LB |
|---|---|---|
| Baseline | 0.74 | 0.64 |
| + Data Cleaning | 0.75 | 0.65 |
| + Auxiliary Training | 0.76 | 0.66 |
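The scores above are weighted R² values. A hedged reconstruction of the metric follows, assuming per-target R² combined with the same weights as the loss; the competition's exact weighting may differ.

```python
import numpy as np
from sklearn.metrics import r2_score

# Assumption: the metric weights match the loss weights.
WEIGHTS = np.array([0.1, 0.1, 0.1, 0.2, 0.5])

def weighted_r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Weighted sum of per-target R² scores."""
    per_target = np.array([r2_score(y_true[:, i], y_pred[:, i])
                           for i in range(y_true.shape[1])])
    return float((per_target * WEIGHTS).sum())

y_true = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                   [2.0, 1.0, 4.0, 3.0, 6.0],
                   [3.0, 3.0, 2.0, 5.0, 4.0]])
score = weighted_r2(y_true, y_true + 0.1)  # constant offset -> score just below 1
```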
## Limitations
- Trained on competition data with specific capture setup; generalization to other pasture imagery may vary.
- Large ViT-Huge backbone is compute-intensive.
## Ethical Considerations
- Dataset-specific biases may affect generalization.
- Manual preprocessing may not be reproducible without the same data access.
## Citation
If you use these weights, please cite the Kaggle competition solution.
## License
MIT