---
license: apache-2.0
tags:
- earth-observation
- remote-sensing
- forest-monitoring
- climate
- sentinel-2
- vision-transformer
- multi-modal
- temporal
- soil-prediction
- carbon
datasets:
- sentinel-2
- global-forest-change
- openlandmap
language:
- en
pipeline_tag: image-segmentation
library_name: pytorch
---

# Naturecode Earth

**A Multi-Modal Temporal Vision Transformer for Forest Monitoring and Earth Observation**

Naturecode Earth is a foundation model for forest monitoring that combines satellite imagery analysis with soil property prediction. Built on a hierarchical vision transformer architecture, it processes multi-temporal Sentinel-2 imagery to map forest cover, estimate biomass, detect change, and predict soil properties for climate and conservation applications.

## Model Description

Naturecode Earth is a **10.9M parameter** model (nano variant) that processes:
- **Sentinel-2 imagery**: 6 spectral bands (B2, B3, B4, B8, B11, B12) across 4 quarterly timestamps
- **Location encoding**: Latitude/longitude embeddings for geographic context
- **Temporal encoding**: Day-of-year embeddings for seasonal awareness

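The `[year, day_of_year]` pairs for the four quarterly timestamps can be built with the standard library. The sketch below assumes each quarterly composite is labeled with the midpoint of its quarter (the 15th of February, May, August, and November). This card does not say which date represents a composite, so treat that choice and the helper name `quarterly_timestamps` as illustrative.

```python
from datetime import date

# Assumption: each quarterly composite is represented by a mid-quarter date.
MID_QUARTER_MONTHS = (2, 5, 8, 11)


def quarterly_timestamps(year: int) -> list[list[int]]:
    """Return four [year, day_of_year] pairs, one per quarter."""
    pairs = []
    for month in MID_QUARTER_MONTHS:
        day_of_year = date(year, month, 15).timetuple().tm_yday
        pairs.append([year, day_of_year])
    return pairs


timestamps = quarterly_timestamps(2024)
print(timestamps)  # [[2024, 46], [2024, 136], [2024, 228], [2024, 320]]
```

Wrapping the result in `torch.tensor(timestamps).unsqueeze(0)` yields a tensor of shape `(1, 4, 2)`; whether the model expects integer or float values there is not stated in this card.
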
### Architecture

```
Input: (B, T, C, H, W) = (batch, 4 timestamps, 6 bands, 64, 64)
       + timestamps: (B, T, 2) [year, day_of_year]
       + locations: (B, 2) [lat, lon]

Encoder:
  - Patch embedding (8x8 patches)
  - Factorized temporal + spatial attention
  - 4 transformer layers

Decoder Heads:
  - Segmentation: Forest cover classification (3 classes)
  - Biomass: Above-ground biomass estimation (Mg/ha)
  - Change Detection: Deforestation/degradation mapping
  - Soil Properties: SOC, clay, sand, bulk density, water content
```

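To make "factorized temporal + spatial attention" concrete: with 8x8 patches on a 64x64 tile there are (64/8)^2 = 64 patches per timestamp, so each sample has 4 x 64 = 256 tokens. A factorized block attends over the 4 timestamps of each patch and then over the 64 patches of each timestamp, rather than over all 256 tokens jointly. The PyTorch sketch below illustrates that general pattern only. The class name `FactorizedBlock`, the embedding width of 128, the head count, and the pre-norm residual layout are assumptions, not the released implementation.

```python
import torch
from torch import nn


class FactorizedBlock(nn.Module):
    """Temporal attention per patch, then spatial attention per timestamp."""

    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.norm_t = nn.LayerNorm(dim)
        self.attn_t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_s = nn.LayerNorm(dim)
        self.attn_s = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, n, d = x.shape  # (batch, timestamps, patches, dim)

        # Temporal step: each patch attends across its T timestamps.
        xt = x.permute(0, 2, 1, 3).reshape(b * n, t, d)
        h = self.norm_t(xt)
        xt = xt + self.attn_t(h, h, h, need_weights=False)[0]
        x = xt.reshape(b, n, t, d).permute(0, 2, 1, 3)

        # Spatial step: each timestamp attends across its N patches.
        xs = x.reshape(b * t, n, d)
        h = self.norm_s(xs)
        xs = xs + self.attn_s(h, h, h, need_weights=False)[0]
        return xs.reshape(b, t, n, d)


patches_per_side = 64 // 8            # 8 patches along each tile edge
num_patches = patches_per_side ** 2   # 64 patches per timestamp
tokens = torch.randn(2, 4, num_patches, 128)  # hypothetical embedded patches
out = FactorizedBlock()(tokens)
print(out.shape)  # torch.Size([2, 4, 64, 128])
```

The motivation for factorizing is cost: joint attention over 256 tokens scores 256^2 = 65,536 token pairs per head, while the factorized version scores 64 x 4^2 + 4 x 64^2 = 17,408.
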
## Intended Uses

### Primary Use Cases
- **Forest Cover Mapping**: Classify land into low/medium/high forest cover
- **Deforestation Monitoring**: Detect forest loss and degradation
- **Carbon Stock Estimation**: Estimate above-ground biomass
- **Soil Carbon Prediction**: Predict soil organic carbon content

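For carbon stock estimation, a biomass prediction is usually converted to carbon outside the model. The sketch below assumes the IPCC default carbon fraction of 0.47 for woody biomass and the 44/12 molecular-weight ratio to express carbon as CO2. The function name `tile_carbon` and the example biomass density of 250 Mg/ha are made up for illustration and are not model outputs.

```python
CARBON_FRACTION = 0.47        # assumed IPCC default for woody biomass
CO2_PER_CARBON = 44.0 / 12.0  # molecular weight of CO2 over atomic weight of C
TILE_AREA_HA = (64 * 10) ** 2 / 10_000  # 640 m x 640 m tile = 40.96 ha


def tile_carbon(agb_mg_per_ha: float, area_ha: float = TILE_AREA_HA) -> dict:
    """Convert mean above-ground biomass density to tile-level totals."""
    biomass_mg = agb_mg_per_ha * area_ha
    carbon_mg = biomass_mg * CARBON_FRACTION
    return {
        "biomass_Mg": biomass_mg,
        "carbon_Mg": carbon_mg,
        "co2e_Mg": carbon_mg * CO2_PER_CARBON,
    }


result = tile_carbon(250.0)  # hypothetical biomass density in Mg/ha
print({key: round(value, 1) for key, value in result.items()})
```

This covers above-ground biomass only; below-ground and soil carbon pools would need their own terms.
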
### Supported Regions
Trained on data from major tropical forest regions:
- Amazon Basin (West, Central, East)
- Congo Basin
- Borneo

## Training Data

| Dataset | Description | Resolution |
|---------|-------------|------------|
| Sentinel-2 SR Harmonized | Surface reflectance imagery | 10m |
| Global Forest Change (Hansen) | Tree cover, loss, gain | 30m |
| OpenLandMap | Soil properties (SOC, clay, sand, bulk density) | 250m |

**Training Configuration:**
- 300 tiles (64x64 pixels at 10m = 640m x 640m)
- 4 quarterly composites per tile (2024)
- 100 epochs on NVIDIA A100

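The scale of the training set follows directly from the configuration above. The short calculation below uses only the numbers stated in this card; the variable names are illustrative.

```python
TILES = 300
COMPOSITES_PER_TILE = 4
BANDS = 6
TILE_PX = 64
PIXEL_M = 10  # grid spacing of the tiles in metres

tile_side_m = TILE_PX * PIXEL_M                        # 640 m
tile_area_km2 = tile_side_m ** 2 / 1_000_000           # 0.4096 km^2
total_area_km2 = TILES * tile_side_m ** 2 / 1_000_000  # 122.88 km^2
composites = TILES * COMPOSITES_PER_TILE               # 1200 composites
values_per_sample = COMPOSITES_PER_TILE * BANDS * TILE_PX * TILE_PX

print(tile_side_m, tile_area_km2, total_area_km2, composites, values_per_sample)
# 640 0.4096 122.88 1200 98304
```

Roughly 123 km^2 of imagery is a small sample relative to the regions listed, which is worth keeping in mind when applying the model to new areas.
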
## How to Use

### Installation

```bash
pip install torch rasterio huggingface_hub
```

### Inference

```python
import torch
from huggingface_hub import hf_hub_download

# ForestFM and ForestFMConfig must be importable from your own copy of the
# `forestfm` code; the pip command above does not install it.
from forestfm.config import ForestFMConfig
from forestfm.model import ForestFM

# Download the checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="naturecode/naturecode-earth",
    filename="model.pt",
)

# Load the checkpoint, which holds the config and the model weights
checkpoint = torch.load(model_path, map_location="cpu")
config = checkpoint["config"]

# Rebuild the model and restore its weights
model_config = ForestFMConfig(**config)
model = ForestFM(model_config)
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Placeholder inputs with the expected shapes; replace them with real data.
# The values, dtypes, and scaling used here are illustrative assumptions.
# images: (1, 4, 6, 64, 64) - 4 timestamps, 6 bands, 64x64
images = torch.randn(1, 4, 6, 64, 64)
# timestamps: (1, 4, 2) - year and day_of_year
timestamps = torch.tensor([[[2024, 46], [2024, 136], [2024, 228], [2024, 320]]])
# locations: (1, 2) - lat, lon
locations = torch.tensor([[-3.0, -60.0]])

# Inference
with torch.no_grad():
    outputs = model(images, timestamps, locations)

# Forest segmentation: class probabilities per pixel
seg_probs = outputs["segmentation"].softmax(dim=1)

# Biomass estimation
biomass = outputs["biomass"]["mean"]

# Soil properties
soc = outputs["soil"]["soc"]
clay = outputs["soil"]["clay"]
```

## Model Outputs

The model returns a nested dictionary. Dotted names in the table denote nested keys, so `biomass.mean` is read as `outputs["biomass"]["mean"]`.

| Output | Shape | Description |
|--------|-------|-------------|
| `segmentation` | (B, 3, H, W) | Forest cover logits (low/medium/high) |
| `biomass.mean` | (B,) | Above-ground biomass (Mg/ha) |
| `biomass.aleatoric_std` | (B,) | Uncertainty estimate |
| `change.change_logits` | (B, 2, H, W) | Change detection logits |
| `soil.soc` | (B,) | Soil organic carbon (g/kg) |
| `soil.clay` | (B,) | Clay content (%) |
| `soil.sand` | (B,) | Sand content (%) |
| `soil.bulk_density` | (B,) | Bulk density (kg/m³) |
| `soil.water_content` | (B,) | Water content at 33 kPa (%) |

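A typical post-processing step turns the logits listed in the table into per-pixel maps. The sketch below uses random tensors with the documented shapes in place of real model outputs. It assumes that the class index order is low/medium/high and that index 1 of the change logits means "changed"; neither ordering is stated in this card, so verify both before relying on the result.

```python
import torch

# Stand-in for `outputs = model(images, timestamps, locations)`:
# random tensors with the shapes documented in the outputs table.
B, H, W = 1, 64, 64
outputs = {
    "segmentation": torch.randn(B, 3, H, W),
    "change": {"change_logits": torch.randn(B, 2, H, W)},
}

# Per-pixel forest cover class: argmax over the class dimension.
seg_probs = outputs["segmentation"].softmax(dim=1)  # (B, 3, H, W)
cover_class = seg_probs.argmax(dim=1)               # (B, H, W), values 0..2
confidence = seg_probs.max(dim=1).values            # (B, H, W)

# Per-pixel change mask (assumes index 1 is the "changed" class).
change_prob = outputs["change"]["change_logits"].softmax(dim=1)[:, 1]
change_mask = change_prob > 0.5                     # (B, H, W), bool

# Share of the tile assigned to each cover class.
counts = torch.bincount(cover_class.flatten(), minlength=3)
class_share = counts / cover_class.numel()
print(cover_class.shape, change_mask.dtype, class_share)
```

The 0.5 threshold on the change probability is an arbitrary starting point; in practice it would be tuned against reference data.
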
## Performance

### Segmentation Accuracy
| Training run | Overall accuracy |
|--------------|------------------|
| 5 epochs (nano) | 73.7% |
| Full training (100 epochs, A100) | 94.4% |

### Soil Prediction (MAE)
| Property | MAE |
|----------|-----|
| SOC | 0.074 (normalized) |

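A normalized MAE only becomes interpretable once it is mapped back to physical units. The sketch below shows the conversion for the two most common schemes, z-score and min-max normalization. This card does not say which one was used, and the `std` and range values are hypothetical placeholders, not the training statistics.

```python
def mae_from_zscore(mae_norm: float, std: float) -> float:
    """MAE in physical units when targets were scaled as (y - mean) / std."""
    return mae_norm * std


def mae_from_minmax(mae_norm: float, y_min: float, y_max: float) -> float:
    """MAE in physical units when targets were scaled as (y - min) / (max - min)."""
    return mae_norm * (y_max - y_min)


mae_norm = 0.074  # SOC MAE reported in this card, in normalized units

# Hypothetical statistics in g/kg, for illustration only.
zscore_mae = mae_from_zscore(mae_norm, std=20.0)
minmax_mae = mae_from_minmax(mae_norm, y_min=0.0, y_max=120.0)
print(round(zscore_mae, 2), round(minmax_mae, 2))
```

Both conversions are a plain multiplication because the offset term of an affine normalization cancels in the difference between prediction and target.
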
## Limitations

- **Resolution**: Optimized for 64x64 pixel tiles (640m x 640m at 10m resolution)
- **Temporal**: Best performance with 4 quarterly composites
- **Geographic**: Trained primarily on tropical forests; may need fine-tuning for temperate/boreal forests
- **Cloud Cover**: Relies on cloud-free composites from Google Earth Engine

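Because the model is optimized for 64x64 tiles, larger scenes have to be cut into tiles before inference. The sketch below generates top-left offsets for 64x64 windows and shifts the last row and column inward so that edge windows stay full-sized. The function name `tile_offsets` and this edge-handling strategy are illustrative choices, not part of the released code.

```python
def tile_offsets(height: int, width: int, tile: int = 64) -> list[tuple[int, int]]:
    """Return (row, col) top-left offsets of full-size tiles covering a raster."""
    if height < tile or width < tile:
        raise ValueError("raster is smaller than one tile")

    def starts(size: int) -> list[int]:
        positions = list(range(0, size - tile + 1, tile))
        if positions[-1] + tile < size:
            # Shift a final window inward so it ends at the raster edge.
            positions.append(size - tile)
        return positions

    return [(r, c) for r in starts(height) for c in starts(width)]


print(tile_offsets(128, 192))  # 2 x 3 grid, no overlap needed
print(tile_offsets(100, 64))   # second tile overlaps the first by 28 rows
```

With rasterio, each offset can be turned into a `rasterio.windows.Window(col, row, 64, 64)` and passed to `dataset.read(window=...)`. Predictions in overlapping strips then need a merge rule, such as averaging or keeping the first tile's values.
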
## Environmental Impact

This model was trained with environmental sustainability in mind:
- Uses spot/preemptible GPU instances to reduce costs and energy
- Efficient nano architecture (10.9M params) suitable for edge deployment
- Supports forest conservation and climate monitoring applications

## Citation

```bibtex
@software{naturecode_earth_2024,
  title={Naturecode Earth: Multi-Modal Temporal Vision Transformer for Forest Monitoring},
  author={Naturecode},
  year={2024},
  url={https://huggingface.co/naturecode/naturecode-earth}
}
```

## License

Apache 2.0

## Acknowledgments

- Sentinel-2 data from ESA/Copernicus
- Global Forest Change from University of Maryland
- Soil data from OpenLandMap/ISRIC
- Google Earth Engine for data processing