palubad
/

SAR-based-VIs-models

Joblib

palubad committed on Jan 30, 2025

Commit

4606005

verified ·

1 Parent(s): c62f6bb

Update README.md

Files changed (1)

README.md +152 -3

README.md CHANGED

@@ -1,3 +1,152 @@
----
-license: mit
----

+---
+license: mit
+---
+# Models to estimate SAR-based Vegetation indices and biophysical variables
+This study presents a machine learning-based approach to estimate optical vegetation indices and biophysical variables (hereafter referred to as VIs) using synthetic aperture radar (SAR) and ancillary data for forest monitoring.
+The best-performing models were Random Forest Regressor (RFR) for LAI and FAPAR and XGBoost (XGB) for EVI and NDVI. These models were trained on temporally and spatially aligned time series (TS) datasets containing Sentinel-1 SAR data, Sentinel-2 multispectral data, DEM-based features and meteorological variables. The approach provides an accurate and timely alternative to optical-based VIs.
+These models are part of the paper: Paluba, D., Le Saux, B., Sarti, F., Štych, P. (2025): Estimating vegetation indices and biophysical parameters for Central European temperate forests with Sentinel-1 SAR data and machine learning. Published in Big Earth Data.
+## Model Details
+### Model Description
+The study explores the feasibility of using SAR-based features in combination with additional datasets (e.g., DEM-based features and meteorological data) to estimate optical VIs, specifically LAI, FAPAR, EVI and NDVI. Traditional optical remote sensing methods are often hindered by cloud cover, making it difficult to obtain continuous and reliable vegetation monitoring data. This research addresses the challenge by using SAR data, which is largely unaffected by cloud cover.
+Using ML, particularly RFR and XGB, the study demonstrates that SAR-based VIs can replicate the patterns of optical-based VIs, while also offering advantages such as higher temporal resolution and year-round monitoring. The inclusion of ancillary data improves model accuracy, particularly in differentiating forest types and seasonal variations. The transferability tests indicate that the methodology generalizes well across Central European forests and shows potential for large-scale monitoring applications.
+- **Developed by:** Daniel Paluba, Bertrand Le Saux
+- **Funded by [optional]:** [More Information Needed]
+- **Model type:** Random Forest Regressor (RFR) and XGBoost (XGB) regression models
+- **License:** CC BY 4.0
+### Model Sources
+**Repositories:**
+- [GitHub SAR-based-VIs](https://github.com/palubad/SAR-based-VIs) for the data and the data generation code.
+- [MMTS-GEE](https://github.com/palubad/MMTS-GEE) to generate multi-modal and time series datasets with spatially and temporally aligned data.
+**Paper:**
+Paluba et al. 2025: Estimating vegetation indices and biophysical parameters for Central European temperate forests with Sentinel-1 SAR data and machine learning. Published in Big Earth Data.
+**Demo:**
+  Will be provided soon.
+#### Scenarios Where the Model May Not Work Well
+- Forest areas significantly different from those in the training data (e.g., tropical rainforests, drylands).
+- Extreme weather conditions (e.g., snow, heavy rain) affecting SAR signal interpretation.
+- Recently disturbed areas with high structural variability, leading to noisier results.
+#### Known limitations
+- Reliance on forest type classification: Errors in input forest type maps can propagate into the VI estimations.
+- The model's effectiveness in disturbed forests is lower than in healthy forests, which may affect early disturbance detection.
+- Seasonal variations introduce noise, particularly in winter, affecting model accuracy.
+#### Recommendations to overcome the limitations (future work)
+- Ensure diverse training data covering different forest types and disturbance scenarios to improve generalizability.
+- Complement SAR-based estimations with optical data when available to enhance accuracy.
+- Improve noise reduction techniques and incorporate SAR data from additional frequency bands (L- and P-band) in future studies for better vegetation characterization.
+## How to Get Started with the Model
+To implement this model:
+- Prepare input datasets using the MMTS-GEE tool: Collect Sentinel-1 SAR data, DEM-based features, and meteorological variables.
+- Preprocess data: Use the MMTS-GEE tool for temporal and spatial alignment.
+- Train the model: Implement RFR for LAI/FAPAR and XGB for EVI/NDVI using optimized hyperparameters.
+- Evaluate performance: Compare model outputs with Sentinel-2-based VIs to validate accuracy.
+- Deploy for inference: Apply trained models to monitor vegetation indices in new regions or for near real-time applications.
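The temporal alignment mentioned in the steps above can be pictured with a small sketch. This is not the MMTS-GEE implementation; it only assumes that each Sentinel-1 acquisition is paired with the closest Sentinel-2 acquisition within a tolerance window. The function name `pair_nearest`, the `max_gap_days` parameter and the example dates are all hypothetical.

```python
from datetime import date


def pair_nearest(s1_dates, s2_dates, max_gap_days=3):
    """Pair each Sentinel-1 date with the closest Sentinel-2 date.

    A pair is kept only if the two acquisitions are at most
    `max_gap_days` apart (hypothetical tolerance, not from the paper).
    """
    pairs = []
    for d1 in s1_dates:
        # Closest optical acquisition in absolute time difference.
        nearest = min(s2_dates, key=lambda d2: abs((d2 - d1).days), default=None)
        if nearest is not None and abs((nearest - d1).days) <= max_gap_days:
            pairs.append((d1, nearest))
    return pairs


# Illustrative acquisition dates (made up for this sketch).
s1 = [date(2021, 6, 1), date(2021, 6, 7), date(2021, 6, 13)]
s2 = [date(2021, 6, 2), date(2021, 6, 12), date(2021, 6, 20)]

pairs = pair_nearest(s1, s2, max_gap_days=2)
for sar_date, optical_date in pairs:
    print(sar_date, "<->", optical_date)
```

In this toy run the 7 June SAR acquisition has no optical partner within two days, so it is dropped from the paired dataset.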
+## Training Details
+### Training Data
+The model was trained on:
+- Sentinel-1 SAR time series (VH and VV polarizations).
+- Sentinel-2 optical vegetation indices (LAI, FAPAR, EVI, NDVI) as ground truth.
+- Digital Elevation Model (DEM)-based features (elevation, slope, local incidence angle (LIA)).
+- Meteorological variables (temperature, precipitation).
+- Forest type maps (broad-leaved vs. coniferous).
+- Geographic scope: Czechia for training, validated on Central European forests.
+### Training Procedure
+- **Feature selection:** Permutation feature importance analysis was used to identify key predictors.
+- **Data splitting:** Training and validation sets were created with a balanced representation of healthy and disturbed forests.
+- **Hyperparameter optimization:**
+  - RFR: maximum depth, number of trees, and minimum samples per split were tuned.
+  - XGB: learning rate, tree depth, and number of boosting rounds were tuned.
+- **Model training:** scikit-learn and XGBoost libraries, with MAE as the loss function.
+- **Computational requirements:**
+  - XGB: faster training with built-in early stopping (~30-70x faster than RFR).
+  - RFR: slower, but slightly better performance for LAI/FAPAR.
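As an illustration of the permutation feature importance step, the sketch below shuffles one feature column at a time and records how much the MAE increases. It is a plain-Python approximation of the general technique, not the authors' code; the toy model, the feature layout and the names `permutation_importance` and `toy_predict` are assumptions made for this example.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MAE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mae(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other features intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = mae(y, [predict(row) for row in X_perm])
            increases.append(permuted_error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical model that uses only the first feature."""
    return 2.0 * row[0]


data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [2.0 * row[0] for row in X]

importances = permutation_importance(toy_predict, X, y)
print(importances)
```

Shuffling the first feature degrades the toy model, while shuffling the second leaves the error unchanged, so the second feature would be a candidate for removal.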
+## Evaluation
+- **Mean Absolute Error (MAE):** Primary accuracy metric.
+- **R² score:** Measures agreement with Sentinel-2 VIs.
+- **Transferability test:** Models were applied to other Central European forests.
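For reference, the two metrics can be written out in a few lines. This sketch uses the standard definitions of MAE and the coefficient of determination; the `reference` and `estimated` values are made-up numbers for illustration, not results from the paper.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and estimate."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values: Sentinel-2 reference VI vs. SAR-based estimate.
reference = [0.2, 0.4, 0.6, 0.8]
estimated = [0.25, 0.35, 0.65, 0.75]

mae_value = mean_absolute_error(reference, estimated)
r2_value = r2_score(reference, estimated)
print(f"MAE = {mae_value:.3f}, R2 = {r2_value:.3f}")  # MAE = 0.050, R2 = 0.950
```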
+### Results
+- **Best models:** RFR performed best for LAI (MAE ~0.06) and FAPAR; XGB performed best for EVI and NDVI.
+- SAR-based VIs successfully replicated optical VIs, with clear seasonal and forest-type differentiation.
+- Higher MAEs were observed in NDVI estimation (~0.48), attributed to forest type inaccuracies and change detection errors.
+- SAR-based VIs detected forest changes up to 4 days earlier than Sentinel-2 VIs, significantly improving change detection capabilities.
+- Adding DEM and meteorological features improved R² by 3-4%.
+### Compute infrastructure
+12th Gen Intel(R) Core(TM) i7-12700 CPU at 2.10 GHz with 20 logical cores (threads) and 64 GB of RAM.
+## Citation
+**BibTeX:**
+[More Information Needed]
+**APA:**
+Paluba, D., Le Saux, B., Sarti, F., & Štych, P. (2025). Estimating vegetation indices and biophysical parameters for Central European temperate forests with Sentinel-1 SAR data and machine learning. Big Earth Data.