---
language: en
license: mit
tags:
  - computer-vision
  - image-segmentation
  - plant-disease
  - agricultural-ai
  - foundation-model
  - sam
  - yolo
  - coffee
  - rust-disease
datasets:
  - coffee-leaf-rust-severity
metrics:
  - iou
  - dice
  - precision
  - recall
  - lin-concordance-correlation-coefficient
---

Foundation Model–Assisted Coffee Leaf Rust Severity Estimation

This repository accompanies the manuscript:

Foundation model–assisted segmentation enables robust field-based severity estimation of coffee leaf rust

This project presents a fully reproducible computer vision pipeline for quantitative estimation of coffee leaf rust (Hemileia vastatrix) severity under heterogeneous field conditions. The framework integrates object detection, lesion segmentation, pixel-based severity quantification, and concordance analysis grounded in phytopathometry principles.

The study compares classical image processing, supervised deep learning, and foundation segmentation models for lesion detection, and evaluates agreement with gold-standard pixel-level annotations using Lin’s Concordance Correlation Coefficient (LCCC).


Data Card Author(s)
Mary Paz Romero Benavides, Universidade Federal de Viçosa: Owner / Manager
Emerson M. Del Ponte, Universidade Federal de Viçosa: Contributor
Waldênia de Melo Moura, EPAMIG: Contributor


🌱 Project Overview

The methodological workflow consists of:

1. Leaf Detection – YOLOv8 trained using model-assisted annotations

2. Leaf Extraction – Detection-guided segmentation

3. Lesion Segmentation – Comparison of five approaches:

   - ImageJ thresholding
   - pliman (R package)
   - DeepLabV3+
   - Fine-tuned SAM2 (SAM_CLR)
   - Zero-shot SAM3

4. Severity Estimation – Pixel-based calculation:

   S (%) = (Diseased Area / Leaf Area) × 100

5. Agreement Analysis – Lin’s Concordance Correlation Coefficient between predicted and reference severity
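
The pixel-based severity formula above can be illustrated with a minimal Python sketch. It is not the repository's implementation: the function name `severity_percent` and the presence of a separate leaf mask are assumptions made for illustration, and masks are plain nested lists here so the example runs without any image library. Clipping lesion pixels to the leaf region is also an assumption of this sketch.

```python
def severity_percent(lesion_mask, leaf_mask):
    """Return severity (%) = diseased pixels / leaf pixels * 100.

    Both masks are 2-D sequences (rows of ints) with the same shape.
    Any non-zero value counts as foreground, so 0/255 and 0/1 masks both work.
    """
    if len(lesion_mask) != len(leaf_mask):
        raise ValueError("masks must have the same number of rows")

    leaf_px = 0
    diseased_px = 0
    for lesion_row, leaf_row in zip(lesion_mask, leaf_mask):
        if len(lesion_row) != len(leaf_row):
            raise ValueError("masks must have the same number of columns")
        for lesion, leaf in zip(lesion_row, leaf_row):
            if leaf:
                leaf_px += 1
                # Assumption: lesion pixels are counted only inside the leaf.
                if lesion:
                    diseased_px += 1

    if leaf_px == 0:
        raise ValueError("leaf mask contains no foreground pixels")
    return 100.0 * diseased_px / leaf_px


# Toy 4x4 example: 8 leaf pixels, 2 lesion pixels inside the leaf,
# and 1 stray lesion pixel outside the leaf that is ignored.
leaf = [
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
]
lesion = [
    [0, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 255, 0, 255],
    [0, 0, 0, 0],
]
print(severity_percent(lesion, leaf))  # 25.0
```

In practice the masks would be loaded from PNG files with an image library and converted to arrays; the counting logic stays the same.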

📊 Dataset Summary

The full dataset comprises:

1,285 field-acquired coffee leaf images

606 curated pixel-level rust lesion masks

100 independent evaluation masks

Roboflow dataset links:

CLR_SAM_dataset:
https://universe.roboflow.com/clr-zky50/sam_clr/dataset/1

DL506:
https://universe.roboflow.com/clr-zky50/dl506/dataset/1

GoldenStandard:
https://universe.roboflow.com/clr-zky50/imgtest-fvn9j/dataset/1

📂 Repository Structure

📁 01_models

Contains documentation describing the trained models used in this study.

⚠️ Due to GitHub file size limitations, model weights are hosted on Hugging Face.

Models include:

YOLOv8 leaf detector

Fine-tuned SAM2 (SAM_CLR)

DeepLabV3+

Configuration used for zero-shot SAM3 inference

📁 02_binary_images

Contains validation binary masks (PNG format) corresponding to segmentation outputs from each evaluated method.

These masks were used to compute:

Intersection over Union (IoU)

Dice coefficient

Pixel accuracy

Precision

Recall

Disease severity (%)

Lin’s Concordance Correlation Coefficient (LCCC)

Binary mask format:

0 → background

255 → rust lesion

This folder enables independent verification of segmentation performance and severity calculations.
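
As a rough guide to how the overlap metrics listed above follow from a pair of binary masks, here is a short Python sketch using the standard confusion-matrix definitions. The function names are hypothetical and the masks are nested lists of 0/255 values, matching the mask format described above; this is an illustration, not the code used in the study.

```python
import math


def confusion_counts(pred, ref):
    """Count TP, FP, FN, TN pixels; any non-zero value is treated as lesion."""
    tp = fp = fn = tn = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1
            elif p:
                fp += 1
            elif r:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, ref):
    """Return IoU, Dice, pixel accuracy, precision and recall as a dict."""
    tp, fp, fn, tn = confusion_counts(pred, ref)

    def ratio(num, den):
        # A metric is undefined (NaN) when its denominator is zero,
        # e.g. IoU when neither mask contains any lesion pixel.
        return num / den if den else math.nan

    return {
        "iou": ratio(tp, tp + fp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "pixel_accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Toy example: TP = 1, FP = 1, FN = 1, TN = 5.
pred = [[255, 255, 0, 0], [0, 0, 0, 0]]
ref = [[255, 0, 255, 0], [0, 0, 0, 0]]
for name, value in segmentation_metrics(pred, ref).items():
    print(f"{name}: {value:.3f}")
```

Returning NaN for empty denominators is a design choice of this sketch; how lesion-free images are handled affects dataset-level averages and should be matched to the convention used in the manuscript.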

📁 03_analysis

Contains R scripts used to:

Compute severity metrics

Perform agreement and concordance analysis

Generate all figures included in the manuscript

Main R dependencies:

tidyverse

epiR

lme4

ggplot2

This folder reproduces the statistical analysis pipeline described in the paper.
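
The analysis scripts in this folder are written in R. For readers who want to see what the concordance statistic measures, the Python sketch below computes Lin's coefficient as 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²), using 1/n (population) moments as in Lin's original definition. The function name `lin_ccc` is hypothetical, and results may differ slightly from R implementations that apply other variance conventions or report confidence intervals.

```python
from statistics import fmean


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient between two paired samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two paired observations are required")

    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both samples are one identical constant")
    return 2 * cov_xy / denom


# Made-up severity values (%), for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]  # same values with a constant +1 bias

print(lin_ccc(reference, reference))  # perfect agreement
print(lin_ccc(reference, shifted))    # lower, although Pearson's r is still 1
```

The second call shows why the coefficient suits severity agreement: a constant over- or underestimation lowers it even when predicted and reference values are perfectly correlated.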

🔬 Reproducibility

This repository provides:

Validation segmentation outputs

Statistical analysis scripts

Model documentation

External links to trained weights

Together, these components allow the segmentation metrics and severity agreement results reported in the manuscript to be reproduced independently.

🤖 Model Hosting

All trained model weights are hosted on Hugging Face:

👉 https://huggingface.co/MaryPazRB/Paper_CLR_CV

This ensures accessibility without exceeding GitHub file size limitations.
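
One possible way to fetch the hosted files is the `snapshot_download` function from the `huggingface_hub` Python package, sketched below. It assumes the package is installed (`pip install huggingface_hub`); the helper name `download_weights` and the `weights` target directory are hypothetical. Because individual file names are not listed here, the sketch downloads the whole repository.

```python
REPO_ID = "MaryPazRB/Paper_CLR_CV"


def download_weights(local_dir="weights"):
    """Download every file in the model repository and return the local path."""
    # Imported lazily so this module can be loaded without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    path = download_weights()
    print(f"Model files downloaded to: {path}")
```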

📜 License

Code: MIT License

Binary masks and annotations: CC BY 4.0


For questions or collaboration inquiries, please open an issue or contact the corresponding author.