KPLabs committed
Commit 3e10a2c · verified · 1 Parent(s): 3daddbd

Update README.md

Files changed (1)
  1. README.md +48 -17
@@ -1,36 +1,67 @@
  ---
  base_model:
- - ibm-esa-geospatial/TerraMind-1.0-base
+ - ibm-esa-geospatial/TerraMind-1.0-base
  pipeline_tag: image-classification
  tags:
- - methane,
- - detection,
- - geospatial,
+ - methane
+ - detection
+ - geospatial
+ - terramind
  ---
+
  # FAST-EO Use Case 2 - Methane Detection

- This directory contains all data and code necessary to recreate experiments conducted for fine-tuning Terramind-Base to detect methane in satellite images. It includes five distinct experiments along with their corresponding datasets. The attached Methane_benchmark_patches_summary_v3.xlsx file provides descriptions for every patch extracted from the Methane Benchmark Dataset (MBD) and defines the fold splits to ensure non-overlapping data. This Excel file is used by the runner scripts to partition the data, typically reserving one fold for testing.
+ This repository contains data and code to reproduce experiments for fine-tuning **TerraMind-Base** to detect methane-related signatures in multispectral imagery. It includes multiple experiment variants and their corresponding datasets.
+
+ The file `Methane_benchmark_patches_summary_v3.xlsx` provides per-patch descriptions and defines the **fold splits** used to ensure non-overlapping partitions. Runner scripts use this Excel file to build train/val/test splits, typically reserving one fold for testing.
+
+ All scripts provide usage instructions via `--help` (or `-h`).
+
+ ## Requirements
+
+ - Install the TerraMind/Terratorch stack used by the project before running experiments.
+ - Ensure your environment has the required dependencies for data loading and training.
+
+ ## Experiments
+
+ ### Experiment 1: Fine-tuning on Methane Benchmark Dataset (MBD)
+
+ Fine-tune TerraMind-Base on the Methane Benchmark Dataset. The normalized dataset is provided in:
+
+ - `MBD_nan_S2_zscore/`
+
+ Training code is located in:
+
+ - `classification/` (includes dataset and dataloader classes)
+
+ ### Experiment 2: Fine-tuning on MBD with text captions

- Each script includes usage instructions which can be accessed by applying the --help (or -h) flag.
+ This experiment modifies the TerraMind-Base model to concatenate text-caption embeddings with visual embeddings.

- ## Important: Ensure the Terramind package is installed before running any experiments.
+ - Caption embeddings are computed with `all-MiniLM-L6-v2`.
+ - Code and resources are in `classification_with_text/`.
+ - Original captions: `classification_with_text/MBD_text/`
+ - Precomputed embeddings: `combined_caption_embeddings.csv`

- ## Experiment 1: Fine tuning on Methane Benchmark Dataset
+ ### Experiment 3: Sentinel-2 with simulated atmospheric conditions

- The first experiment is fine tuning the model on Methane Benchmark Dataset. The dataset has been attatched in the directory `MBD_nan_S2_zscore`, and has been already normalized. The code for running the training is located in the `classification` directory, along with neccessary `dataset` and `dataloader` classes.
+ Evaluate generalization on Sentinel-2 data with simulated atmospheric conditions in both:
+ - Top-of-Atmosphere (TOA)
+ - Bottom-of-Atmosphere (BOA)

- ## Experiment 2: Fine tuning on MBD with text captions
+ The model can be trained on this simulated data or used only for inference to test cross-domain robustness.

- This experiment contains a modified verion of the Terramind-Based model, which concatinates the textual embeddings of the text captions for every image, with the visual embeddings of the base model. The text embeddings are calculated using the `all-MiniLM-L6-v2` model. All the code, along with embeddings calculation, and data, is available in the `classification_with_text` directory. The original captions are located in `classification_with_text/MBD_text`, and the embeddings are located inside the `combined_caption_embeddings.csv` file.
+ ### Experiment 4: Intuition-1 with simulated atmospheric conditions

- ## Experiment 3: Fine tuning and inference on Sentinel 2 with simulated atmospheric conditions
+ Analogous to Experiment 3, but using Intuition-1 imagery with simulated atmospheric conditions (TOA and BOA). This experiment tests robustness under domain shift.

- This experiment checks how the Terramind-Base behaves on the Sentinel-2 data with simulated atmospheric conditions. The simulated data is both in the Top-of-Atmosphere and Bottom-of-Atmsphere variants. The model can be both trained on this data, or only run on it, to test how good it is at generalization when trained on different data.
+ ### Experiment 5: Urban dataset without methane (false-positive stress test)

- ## Experiment 4: Fine tuning and inference on Intuition 1 with simulated atmosphric conditions
+ A control dataset containing only urban imagery without methane is used to check whether models learn methane-specific cues rather than urban signatures. The goal is to quantify false positives.

- This experiment checks how the Terramind-Base behaves on the Intuition-1 data with simulated atmospheric conditions. The simulated data is both in the Top-of-Atmosphere and Bottom-of-Atmsphere variants. The model can be both trained on this data, or only run on it, to test how good it is at generalization when trained on different data.
+ Scripts for loading and running inference on this dataset are provided in the repository.

- ## Experiment 5: Testing the detector on urban dataset without methane
+ ## Notes

- The urban dataset has been prepared to check whether the models really learned to detect methane from multispectral data or just look for urban signatures in the images. All of the images in this dataset do not contain methane, the goal is to run the models and see how many false positives are returned. Python script for loading and running the models were attatched.
+ - Use `--help` on each runner/training script to see available options.
+ - Keep fold definitions consistent with `Methane_benchmark_patches_summary_v3.xlsx` to ensure comparable results across experiments.
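
The updated README says the runner scripts use the fold definitions in `Methane_benchmark_patches_summary_v3.xlsx` to build train/val/test splits, typically reserving one fold for testing. A minimal sketch of that idea follows. It is not the repository's implementation: the function name `split_by_fold`, the `(patch_id, fold)` record layout, and the sample values are all hypothetical, since the spreadsheet's actual column names are not shown here. In practice the rows could be loaded with something like `pandas.read_excel`, which is likewise an assumption about how the scripts work.

```python
from collections import defaultdict


def split_by_fold(patches, test_fold, val_fold=None):
    """Partition (patch_id, fold) records into train/val/test by fold label.

    Every patch lands in exactly one split, so the partitions cannot overlap
    as long as each patch has a single fold assignment.
    """
    splits = defaultdict(list)
    for patch_id, fold in patches:
        if fold == test_fold:
            splits["test"].append(patch_id)
        elif val_fold is not None and fold == val_fold:
            splits["val"].append(patch_id)
        else:
            splits["train"].append(patch_id)
    # Return a plain dict with all three keys present, even if a split is empty.
    return {name: list(splits[name]) for name in ("train", "val", "test")}


# Hypothetical records standing in for rows of the fold-definition spreadsheet.
patches = [("p0", 1), ("p1", 2), ("p2", 3), ("p3", 1), ("p4", 2), ("p5", 3)]

# Reserve fold 3 for testing and fold 2 for validation; the rest is training data.
splits = split_by_fold(patches, test_fold=3, val_fold=2)
print(splits)
```

Deriving the split from a fixed fold label rather than a random shuffle keeps the partitions identical across runs, which is what makes results comparable between experiments.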