
## Overview

This project implements a satellite imagery classification pipeline using a DenseNet169 deep convolutional neural network architecture. The model is designed to classify tiles of Sentinel-2 satellite images into multiple land cover classes relevant for deforestation monitoring and land use assessment across India.

The pipeline covers data collection via Google Earth Engine (GEE), data preprocessing, model training with TensorFlow, evaluation, and saving final artifacts.

## Model Architecture

Backbone: DenseNet169 pre-trained on ImageNet, used as a feature extractor.

Custom head: GlobalAveragePooling2D, batch normalization, dropout layers, two fully connected dense layers with ReLU activations and L2 regularization, followed by a softmax output layer for multi-class classification.

Training follows a two-stage approach:

1. Freeze the DenseNet base and train only the custom classification head.
2. Fine-tune selected deeper layers of the DenseNet base by unfreezing from a configurable layer index.

Loss function: Categorical cross-entropy.

Optimizer: Adam with learning rate schedules.

Metrics tracked: Accuracy and ROC AUC.
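The architecture and two-stage schedule described above can be sketched roughly as follows. This is a minimal sketch, not the pipeline's actual code: the layer widths, dropout rates, and learning rates are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

def build_classifier(num_classes=6, fine_tune_at=None, weights="imagenet"):
    """DenseNet169 backbone with a custom classification head.

    Stage 1: call with fine_tune_at=None (base frozen, head only).
    Stage 2: rebuild with fine_tune_at set to a layer index to unfreeze
    the deeper part of the base.
    """
    base = tf.keras.applications.DenseNet169(
        include_top=False, weights=weights, input_shape=(224, 224, 3))
    base.trainable = False  # stage 1: train only the head

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.BatchNormalization(),
        layers.Dropout(0.4),                      # rate is illustrative
        layers.Dense(256, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4)),
        layers.Dropout(0.3),
        layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4)),
        layers.Dense(num_classes, activation="softmax"),
    ])

    if fine_tune_at is not None:                  # stage 2: partial unfreeze
        base.trainable = True
        for layer in base.layers[:fine_tune_at]:
            layer.trainable = False

    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            1e-4 if fine_tune_at is not None else 1e-3),
        loss="categorical_crossentropy",
        metrics=["accuracy", tf.keras.metrics.AUC(name="roc_auc")])
    return model
```

A lower learning rate in the fine-tuning stage is the usual precaution against destroying the pre-trained features.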

## Data and Training

Input image size: 224 × 224 RGB tiles.

Number of classes: Configurable based on the dataset (default six classes including heavily_deforested_area, healthy_forest, farmland, etc.).

Data loading: TensorFlow datasets are created from CSV files listing image paths and labels; data augmentation (random flips, brightness, and contrast changes) is applied during training.
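A loading-and-augmentation step of this kind might look like the sketch below, assuming the image paths and one-hot labels have already been read from the CSV; the exact augmentation ranges are assumptions.

```python
import tensorflow as tf

IMG_SIZE = 224

def augment(image, label):
    # Random flips plus brightness/contrast jitter, applied only in training.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    return tf.clip_by_value(image, 0.0, 1.0), label

def make_dataset(paths, one_hot_labels, batch_size=16, training=True):
    def load(path, label):
        img = tf.io.decode_image(tf.io.read_file(path),
                                 channels=3, expand_animations=False)
        img = tf.image.resize(img, (IMG_SIZE, IMG_SIZE)) / 255.0
        return img, label

    ds = tf.data.Dataset.from_tensor_slices((paths, one_hot_labels))
    ds = ds.map(load, num_parallel_calls=tf.data.AUTOTUNE)
    if training:
        ds = ds.shuffle(1024).map(augment, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```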

Batch size and number of epochs are configurable via CLI arguments.

Early stopping, learning-rate reduction on plateau, and model-checkpointing callbacks are implemented for robust training.
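These three callbacks map onto standard Keras callbacks; the patience values and checkpoint path below are illustrative, not the pipeline's exact settings.

```python
import tensorflow as tf

callbacks = [
    # Stop when validation loss stops improving and keep the best weights.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True),
    # Halve the learning rate after a plateau in validation loss.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                         patience=2, min_lr=1e-6),
    # Save only the best checkpoint seen so far.
    tf.keras.callbacks.ModelCheckpoint("checkpoints/best.keras",
                                       monitor="val_loss",
                                       save_best_only=True),
]
```

The list is passed to `model.fit(..., callbacks=callbacks)`.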

## Functionality Summary

Data collection workers use the Google Earth Engine API to download image tiles filtered by cloud coverage and quality metrics.

Tiles are geographically split into train, validation, and test sets to reduce data leakage by region.
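One common way to implement such a region-level split is to hash a region identifier rather than individual tiles; the sketch below is an assumption about the mechanism, not the pipeline's actual code.

```python
import hashlib

def assign_split(region_id, train_frac=0.7, val_frac=0.15):
    # Hash the *region*, not the tile, so every tile from a region
    # falls into the same split and neighbouring tiles cannot leak
    # between train and test.
    digest = hashlib.md5(region_id.encode()).hexdigest()
    bucket = int(digest, 16) % 1000 / 1000.0
    if bucket < train_frac:
        return "train"
    if bucket < train_frac + val_frac:
        return "val"
    return "test"
```

Because the assignment is deterministic, re-running data collection never moves a region between splits.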

The VanRakshakClassifier class encapsulates model building, training (with fine-tuning), evaluation (classification report, confusion matrix), and model saving with metadata.
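The evaluation part of `VanRakshakClassifier` can be sketched with scikit-learn; the `evaluate` helper below is a simplified stand-in for the class's actual method.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

def evaluate(model, test_ds, class_names):
    # Accumulate predictions and one-hot labels batch by batch.
    y_true, y_pred = [], []
    for images, labels in test_ds:
        probs = model.predict(images, verbose=0)
        y_pred.extend(np.argmax(probs, axis=1))
        y_true.extend(np.argmax(np.asarray(labels), axis=1))
    report = classification_report(y_true, y_pred,
                                   target_names=class_names,
                                   output_dict=True, zero_division=0)
    return report, confusion_matrix(y_true, y_pred)
```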

Plotting helpers generate figures for training history and confusion matrix visualization.

The command-line interface supports parameters for service-account credentials, sample counts, batch size, debug mode, a dry run for local testing without GEE calls, timeout, and worker thread count.
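Such an interface can be defined with `argparse`. Only `--sa-key`, `--samples`, `--batch-size`, and `--debug` appear in the Usage example; the remaining flag names and all defaults below are assumptions.

```python
import argparse

def parse_args(argv=None):
    # --sa-key, --samples, --batch-size and --debug come from the Usage
    # example; the other flag names and defaults are illustrative guesses.
    p = argparse.ArgumentParser(description="VanRakshak pipeline")
    p.add_argument("--sa-key", help="path to GEE service-account key JSON")
    p.add_argument("--samples", type=int, default=2000,
                   help="number of sample tiles to collect")
    p.add_argument("--batch-size", type=int, default=16)
    p.add_argument("--debug", action="store_true")
    p.add_argument("--dry-run", action="store_true",
                   help="local testing without GEE calls")
    p.add_argument("--timeout", type=int, default=60,
                   help="per-download timeout in seconds")
    p.add_argument("--workers", type=int, default=4,
                   help="download worker threads")
    return p.parse_args(argv)
```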

## Outputs and Artifacts

Trained model saved as a `.keras` file, including weights.

Metadata JSON containing model parameters and class names.

Training history JSON logging epoch-wise metrics.

Evaluation results JSON with test metrics, classification report per class, and confusion matrix.

Visualizations of training curves and confusion matrix saved as PNGs.
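Writing the model plus the JSON artifacts might look like the sketch below; the file names are illustrative, not the pipeline's exact ones.

```python
import json
from pathlib import Path

def save_artifacts(model, metadata, history, evaluation, out_dir="artifacts"):
    # File names below are assumptions for illustration.
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    model.save(out / "vanrakshak_densenet169.keras")   # weights included
    (out / "metadata.json").write_text(json.dumps(metadata, indent=2))
    (out / "training_history.json").write_text(json.dumps(history, indent=2))
    (out / "evaluation.json").write_text(json.dumps(evaluation, indent=2))
```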

## Usage

Run the pipeline with optional parameters to customize data collection and training. For example:

```bash
python vanrakshak_fixed_pipeline.py --sa-key path/to/key.json --samples 2000 --batch-size 16 --debug
```

The pipeline performs the following steps sequentially:

1. Initialize Google Earth Engine.
2. Create train/val/test geographic splits of sample regions.
3. Download and filter Sentinel-2 tiles by cloud cover and quality.
4. Prepare TensorFlow datasets with augmentation.
5. Create, train, and fine-tune the DenseNet169 classifier.
6. Evaluate on held-out test data and save the model and results.

## Requirements

- Python 3
- TensorFlow (including Keras)
- scikit-learn
- matplotlib
- numpy, requests, Pillow
- google-auth, earthengine-api

## References

- Huang, G., Liu, Z., van der Maaten, L., and Weinberger, K. Q. "Densely Connected Convolutional Networks," CVPR 2017.
- Sentinel-2 satellite data accessed via Google Earth Engine.

This README provides an overview for using, modifying, or extending the VanRakshak DenseNet169 image classification pipeline for satellite-based land cover classification.
