---
license: mit
pipeline_tag: image-feature-extraction
---
# FoMo4Wheat: Toward reliable crop vision foundation models with globally curated data
Paper: [https://huggingface.co/papers/2509.06907](https://huggingface.co/papers/2509.06907)
Project Page: https://fomo4wheat.phenix-lab.com/
Code: https://github.com/PheniX-Lab/FoMo4Wheat
## Abstract
Vision-driven field monitoring is central to digital agriculture, yet models built on general-domain pretrained backbones often fail to generalize across tasks, owing to the interaction of fine, variable canopy structures with fluctuating field conditions. We present FoMo4Wheat, one of the first crop-domain vision foundation models pretrained with self-supervision on ImAg4Wheat, the largest and most diverse wheat image dataset to date (2.5 million high-resolution images collected over a decade at 30 global sites, spanning >2,000 genotypes and >500 environmental conditions). This wheat-specific pretraining yields representations that are robust for wheat and transferable to other crops and weeds. Across ten in-field vision tasks at canopy and organ levels, FoMo4Wheat models consistently outperform state-of-the-art models pretrained on general-domain datasets. These results demonstrate the value of crop-specific foundation models for reliable in-field perception and chart a path toward a universal crop foundation model with cross-species and cross-task capabilities. FoMo4Wheat models and the ImAg4Wheat dataset are publicly available online: https://huggingface.co/PheniX-Lab/FoMo4Wheat and https://huggingface.co/datasets/PheniX-Lab/ImAg4Wheat. The demonstration website is: https://fomo4wheat.phenix-lab.com/.
## Demo
The demonstration website for inferring embeddings is located at [Demo](https://fomo4wheat.phenix-lab.com/).
https://github.com/user-attachments/assets/2f2f21b4-4638-41c6-8bdf-37d8ad458eb6
🎥 **Visualization of unlabeled wheat features.**
## Method
<img width="1267" height="1459" alt="Fig 1" src="https://github.com/user-attachments/assets/1d095d9b-2de4-4080-b68c-7da83f12edc1" />
<b>Fig 1.</b> Overview of ImAg4Wheat dataset and FoMo4Wheat model.
## Installation
The training and evaluation code is developed with PyTorch 2.5.1 and requires a Linux environment with several third-party dependencies. To set up all required dependencies for training and evaluation, follow the instructions below:
```
conda env create -f conda.yaml
conda activate FoMo4Wheat
```
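After activating the environment, a quick sanity check can confirm that it matches the requirements stated above (Linux, PyTorch 2.5.1). The following is a minimal sketch, not part of the repository: the helper names `version_matches` and `check_environment` are hypothetical, and the CUDA check is an assumption based on the GPU training setup described later in this README.

```python
import platform


def version_matches(installed: str, expected: str) -> bool:
    """Compare release numbers, ignoring local build tags such as '+cu121'."""
    return installed.split("+", 1)[0] == expected


def check_environment(expected_torch: str = "2.5.1") -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if platform.system() != "Linux":
        problems.append(f"expected Linux, found {platform.system()}")
    try:
        import torch  # imported lazily so the check itself runs without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed in this environment")
        return problems
    if not version_matches(torch.__version__, expected_torch):
        problems.append(
            f"expected PyTorch {expected_torch}, found {torch.__version__}"
        )
    if not torch.cuda.is_available():
        problems.append("no CUDA device is visible to PyTorch")
    return problems
```

Calling `check_environment()` inside the activated `FoMo4Wheat` environment returns an empty list when nothing looks off, and a list of messages otherwise.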
## Data Preparation
ImAg4Wheat comprises 2.5 million images of over 2,000 wheat genotypes cultivated under 500 distinct environmental conditions across 30 sites in 10 countries over a decade, covering the full crop growth cycle. The dataset is hosted on Hugging Face: [ImAg4Wheat](https://huggingface.co/datasets/PheniX-Lab/ImAg4Wheat).
(Note: The complete dataset will be made publicly available after the peer-review process of the associated paper is completed.)
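Whatever portion of the dataset is currently published can be fetched with the `huggingface_hub` client. The sketch below is illustrative only: the helper names `fetch_dataset` and `count_images` are hypothetical, the list of image suffixes is an assumption, and the on-disk layout expected by the training code is not specified in this README, so the downloaded folder may need rearranging before training.

```python
from pathlib import Path

# Assumed image formats; adjust to what the dataset actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def fetch_dataset(local_dir: str) -> str:
    """Download the files currently published in the dataset repo into local_dir."""
    # Imported lazily; requires `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="PheniX-Lab/ImAg4Wheat",
        repo_type="dataset",
        local_dir=local_dir,
    )


def count_images(root: str) -> int:
    """Count image files under root, recursively, by file suffix."""
    return sum(
        1
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

Running `count_images(fetch_dataset("<PATH/TO/DATASET>"))` gives a rough check of how many images were actually downloaded.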
## Pretrained models
| model | # of params | download |
| :---------------------:| -----------: |:--------------:|
| ViT-B/14 | 86 M | [FoMo4Wheat_base.pth](https://huggingface.co/PheniX-Lab/FoMo4Wheat/blob/main/weight/FoMo4Wheat_base.pth) |
| ViT-L/14 | 300 M | [FoMo4Wheat_large.pth](https://huggingface.co/PheniX-Lab/FoMo4Wheat/blob/main/weight/FoMo4Wheat_large.pth) |
| ViT-G/14 | 1,100 M | [FoMo4Wheat_giant.pth](https://huggingface.co/PheniX-Lab/FoMo4Wheat/blob/main/weight/FoMo4Wheat_giant.pth) |
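The checkpoints in the table can also be fetched programmatically. The sketch below assumes the `huggingface_hub` package is installed and that each `.pth` file can be read by `torch.load` with `weights_only=True`; the structure of the checkpoint (for example, whether it is a bare state dict) is not documented here, so inspect the returned object before use. The names `checkpoint_filename` and `load_checkpoint` are hypothetical helpers, not part of the repository.

```python
REPO_ID = "PheniX-Lab/FoMo4Wheat"

# File paths taken from the download links in the table above.
CHECKPOINTS = {
    "base": "weight/FoMo4Wheat_base.pth",
    "large": "weight/FoMo4Wheat_large.pth",
    "giant": "weight/FoMo4Wheat_giant.pth",
}


def checkpoint_filename(size: str) -> str:
    """Map a model size to its file path inside the model repo."""
    try:
        return CHECKPOINTS[size]
    except KeyError:
        raise ValueError(
            f"unknown size {size!r}; choose from {sorted(CHECKPOINTS)}"
        ) from None


def load_checkpoint(size: str = "base"):
    """Download one checkpoint (cached locally) and load it onto the CPU."""
    # Imported lazily so the filename helper works without these packages.
    from huggingface_hub import hf_hub_download
    import torch

    path = hf_hub_download(repo_id=REPO_ID, filename=checkpoint_filename(size))
    # weights_only=True restricts unpickling to tensors and plain containers.
    # Assumption: the file needs nothing beyond that; if loading fails, check
    # the repository's own loading code instead of relaxing this flag blindly.
    return torch.load(path, map_location="cpu", weights_only=True)
```

To build the actual model, the loaded weights still have to be paired with the matching ViT definition from the code repository.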
## Training
**Training FoMo4Wheat on ImAg4Wheat**
Run FoMo4Wheat training on 6 A800-80GB nodes (48 GPUs) in a SLURM cluster environment with submitit:
```
MKL_NUM_THREADS=8 OMP_NUM_THREADS=8 python FoMo4Wheat/run/train/ \
--nodes 6 \
--config-file FoMo4Wheat/configs/train/vitg_14_224.yaml \
--output-dir <PATH/TO/OUTPUT/DIR> \
train.dataset_path=TestDataset:split=TRAIN:root=<PATH/TO/DATASET>:extra=<PATH/TO/DATASET>
```
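When launching from a script, the same command can be assembled in Python. This is a minimal sketch with hypothetical helper names (`dataset_override`, `build_train_command`, `thread_env`). The training script path is left as a parameter because the command above gives only the `FoMo4Wheat/run/train/` directory; supply the actual entry point from the repository.

```python
import os
import shlex


def dataset_override(root: str, extra: str, split: str = "TRAIN") -> str:
    """Build the train.dataset_path override in the format used above."""
    return f"train.dataset_path=TestDataset:split={split}:root={root}:extra={extra}"


def build_train_command(script, config, output_dir, root, extra, nodes=6):
    """Return the training command as an argument list for subprocess.run."""
    return [
        "python", script,
        "--nodes", str(nodes),
        "--config-file", config,
        "--output-dir", output_dir,
        dataset_override(root, extra),
    ]


def thread_env(threads: int = 8) -> dict:
    """Copy the current environment and add the thread limits from the README."""
    env = dict(os.environ)
    env["MKL_NUM_THREADS"] = str(threads)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def preview(command) -> str:
    """Render the argument list as a single shell-quoted line for logging."""
    return shlex.join(command)
```

The resulting list can be passed to `subprocess.run(command, env=thread_env(), check=True)`. With the setup described above, 48 GPUs across 6 nodes corresponds to 8 GPUs per node.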
## License
FoMo4Wheat code and model weights are released under the MIT License. See LICENSE for additional details.
## Citation
If you use our project in your research or wish to refer to the results of the project, please use the following BibTeX entry.
```bibtex
@article{2025FoMo4Wheat,
title={FoMo4Wheat: Toward reliable crop vision foundation models with globally curated data},
author={Bing Han and Chen Zhu and Dong Han and Rui Yu and Songliang Cao and Jianhui Wu and Scott Chapman and Zijian Wang and Bangyou Zheng and Wei Guo and Marie Weiss and Benoit de Solan and Andreas Hund and Lukas Roth and Kirchgessner Norbert and Andrea Visioni and Yufeng Ge and Wenjuan Li and Alexis Comar and Dong Jiang and Dejun Han and Fred Baret and Yanfeng Ding and Hao Lu and Shouyang Liu},
journal={arXiv:2509.06907},
year={2025},
note={Contact: Shouyang Liu (shouyang.liu@njau.edu.cn), Hao Lu (hlu@hust.edu.cn), Yanfeng Ding (dingyf@njau.edu.cn)}
}
```