PheniX-Lab committed (verified)
Commit 738cc8c · Parent: b6ced23

Upload 3 files

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +27 -0
  3. conda.yaml +26 -0
  4. video.mp4 +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ video.mp4 filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,27 @@
+ # FoMo4Wheat
+ The official implementation of the paper **Crop-specific Vision Foundation Model enabling Generalized Field Monitoring**.
+ # Abstract
+ Vision-driven in-field crop monitoring is essential for advancing digital agriculture, whether supporting commercial decisions on-farm or augmenting research experiments in breeding and agronomy. Existing crop vision models struggle to generalize across fine-scale, highly variable canopy structures and fluctuating outdoor environments. In this work, we present FoMo4Wheat, one of the first crop-oriented vision foundation models, and demonstrate that it delivers strong performance across a wide range of agricultural vision tasks. Centered on wheat, the most globally significant food crop, we curated ImAg4Wheat, the largest and most diverse wheat image dataset to date. It comprises 2.5 million high-resolution images collected over a decade from breeding and experimental fields, spanning more than 2,000 genotypes and 500 distinct environmental conditions across 30 global sites. A suite of FoMo4Wheat models was pre-trained on this dataset using self-supervised learning. Benchmark results across ten crop-related downstream tasks show that FoMo4Wheat consistently outperforms state-of-the-art models trained on general-domain datasets. Beyond strong cross-task generalization within wheat, FoMo4Wheat is highly robust in limited-data regimes and on previously unseen crop data. Notably, it contributes significantly to vision tasks on rice and multiple crop/weed images, highlighting its cross-crop adaptability. In delivering one of the first open-source foundation models for wheat, our results demonstrate the value of crop-specific foundation models, which will support the development of versatile, high-performing vision systems for crop breeding and precision agriculture.
+ # Demo
+ [Demo video](./video.mp4)
+ # Method
+ <img width="1256" height="1460" alt="image" src="https://github.com/user-attachments/assets/89b475ab-d8c3-4997-a4ec-bd4062b2f986" />
+ <b>Fig 1.</b> Overview of the ImAg4Wheat dataset and the FoMo4Wheat model.
+
+ # Installation
+ The training and evaluation code is developed with PyTorch 2.5.1 and requires a Linux environment with several third-party dependencies. To set up the environment, follow the instructions below:
+ ```
+ conda env create -f conda.yaml
+ conda activate FoMo4Wheat
+ ```
+ # Data Preparation
+
+ # Training
+ ```
+ MKL_NUM_THREADS=8 OMP_NUM_THREADS=8 python FoMo4Wheat/run/train/ \
+     --nodes 6 \
+     --config-file FoMo4Wheat/configs/train/vitg_14_224.yaml \
+     --output-dir <PATH/TO/OUTPUT/DIR> \
+     train.dataset_path=TestDataset:split=TRAIN:root=<PATH/TO/DATASET>:extra=<PATH/TO/DATASET>
+ ```
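
The `train.dataset_path` value is a colon-separated `Name:key=value:...` string, in the style used by DINOv2-derived training code. A minimal sketch of how such a spec decomposes, using a hypothetical `parse_dataset_spec` helper (not part of the released code):

```python
def parse_dataset_spec(spec: str) -> tuple[str, dict[str, str]]:
    """Split 'Name:key=val:key=val' into the dataset name and its kwargs.

    Hypothetical illustration only; assumes no ':' inside values.
    """
    name, *pairs = spec.split(":")
    kwargs = dict(pair.split("=", 1) for pair in pairs)
    return name, kwargs

# Example mirroring the training command above:
name, kwargs = parse_dataset_spec("TestDataset:split=TRAIN:root=/data/wheat")
print(name, kwargs)  # TestDataset {'split': 'TRAIN', 'root': '/data/wheat'}
```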
+ # License
conda.yaml ADDED
@@ -0,0 +1,26 @@
+ name: FoMo4Wheat
+ channels:
+   - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
+   - defaults
+   - pytorch
+   - nvidia
+   - xformers
+   - conda-forge
+ show_channel_urls: true
+ default_channels:
+   - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
+ dependencies:
+   - python=3.11
+   - pytorch::pytorch=2.5.1
+   - pytorch::pytorch-cuda=12.1
+   - pytorch::torchvision
+   - omegaconf
+   - torchmetrics
+   - fvcore
+   - iopath
+   - xformers::xformers
+   - pip
+   - pip:
+     - git+https://github.com/facebookincubator/submitit
+     - --extra-index-url https://pypi.nvidia.com
+     - cuml-cu11
video.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:70256dbe9d8e8cd806c9a56c3142809e3e60bddabd0627c22b86f687f433e3bc
+ size 6830329
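
Because `video.mp4` is tracked by Git LFS, the blob committed here is only this small plain-text pointer; the actual ~6.8 MB video is fetched from the LFS store on checkout. A sketch of reading such a pointer (the `oid` and `size` values are the ones committed above):

```python
# Git LFS pointer files are plain text: one "key value" pair per line.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:70256dbe9d8e8cd806c9a56c3142809e3e60bddabd0627c22b86f687f433e3bc
size 6830329
"""

meta = dict(line.split(" ", 1) for line in pointer_text.splitlines())
print(meta["size"])  # 6830329
```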