---
language:
- en
tags:
- image-classification
- aerial-imagery
- robotics
- computer-vision
datasets:
- aid_dataset
metrics:
- accuracy
- f1
model-index:
- name: Aerial Image Classification CNN
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: AID (Aerial Image Dataset)
      type: aid
    metrics:
    - type: accuracy
      value: 0.9280
      name: Test Accuracy
    - type: f1
      value: 0.93
      name: Macro F1
---

# Model Card for Aerial Image Classification (CNN & Classic ML)

## Model Details

### Model Description

This repository contains two types of models for classifying aerial images from the **AID dataset**:

1. **Convolutional Neural Network (CNN):** A lightweight ResNet-based model.
2. **Classic Machine Learning:** A Bag of Visual Words (BoVW, also known as Bag of Features) pipeline using SIFT descriptors and Softmax Regression.

These models were developed as part of a machine learning assignment to compare deep learning against classical computer vision methods.

- **Model types:**
  - CNN (PyTorch)
  - BoVW + classical ML classifier (Scikit-learn/Joblib)
- **Language(s):** English
- **Resources:**
  - CNN: ~1M parameters, ~16MB.
  - Classic ML: ~100MB (includes vocabulary).

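To make the Bag of Visual Words idea concrete, here is a minimal pure-Python sketch of the encoding step: each local descriptor is assigned to its nearest visual word, and the image is represented by the normalized histogram of word counts. The tiny 2-D descriptors, the three-word vocabulary, and the function name `bovw_histogram` are illustrative assumptions; the actual pipeline works on SIFT descriptors and a learned vocabulary whose size is not stated in this card.

```python
import math


def bovw_histogram(descriptors, vocabulary):
    """Encode one image as a normalized histogram of visual-word counts.

    descriptors: list of feature vectors extracted from the image.
    vocabulary: list of cluster centers ("visual words").
    """
    counts = [0] * len(vocabulary)
    for desc in descriptors:
        # Assign the descriptor to the closest visual word (Euclidean distance).
        nearest = min(
            range(len(vocabulary)),
            key=lambda k: math.dist(desc, vocabulary[k]),
        )
        counts[nearest] += 1
    total = sum(counts)
    # L1-normalize so images with different numbers of keypoints are comparable.
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)


# Toy data (hypothetical): 2-D descriptors instead of real SIFT vectors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bovw_histogram(descriptors, vocabulary)
print(histogram)
```

The resulting fixed-length vector is the kind of input a classifier such as Softmax Regression can consume, regardless of how many keypoints each image produced.
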
### Model Architecture

The CNN consists of an initial convolution layer followed by **three residual blocks**.

- **Residual Blocks:** Skip connections enable deeper feature extraction without the accuracy degradation seen when stacking plain layers.
- **Layers:** Convolution, Batch Normalization, Max-Pooling.
- **Input:** $600 \times 600$ pixel images (resized as needed by the pipeline).

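The layout described above (an initial convolution followed by three residual blocks built from convolution, batch normalization, and max-pooling) can be sketched in PyTorch as follows. This is not the repository's `PiattiCNN` definition: the class names `ResidualBlock` and `SketchCNN`, the channel widths, the 1x1 skip projection, and the global-average-pooling head are all assumptions made for illustration.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Two 3x3 conv + batch-norm layers with a skip connection, then max-pooling."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # 1x1 projection so the skip path matches the output channel count.
        self.skip = (
            nn.Identity()
            if in_channels == out_channels
            else nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        )
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = torch.relu(out + self.skip(x))  # residual addition
        return self.pool(out)  # halve the spatial resolution


class SketchCNN(nn.Module):
    """Hypothetical stand-in: initial conv layer + three residual blocks + classifier."""

    def __init__(self, num_classes: int = 30):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.blocks = nn.Sequential(
            ResidualBlock(16, 32),
            ResidualBlock(32, 64),
            ResidualBlock(64, 128),
        )
        # Assumed head: global average pooling followed by one linear layer.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.blocks(self.stem(x)))


model = SketchCNN(num_classes=30).eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 64, 64))  # small input keeps the demo fast
print(logits.shape)
```

Because the head uses adaptive pooling, the same sketch accepts any input resolution large enough to survive the four pooling stages, including $600 \times 600$ images.
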
## Uses

### Direct Use

The model is intended for classifying high-resolution aerial scenes into one of 30 categories. It is suitable for:

- Autonomous UAV navigation and mapping.
- Environmental monitoring.
- Land use classification.

### Downstream Use

This model can be fine-tuned on other aerial or satellite imagery datasets.

## Training Data

The model was trained on the **Aerial Image Dataset (AID)**.

- **Source:** Google Earth imagery.
- **Size:** 10,000 images.
- **Classes:** 30 (e.g., Airport, Beach, Forest, Industrial).
- **Split:** 90% Training / 10% Test.

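The card does not say whether the 90/10 split was stratified or which random seed was used. As a sketch of one reasonable way to produce such a split, the pure-Python helper below holds out 10% of each class so that class proportions are preserved in both subsets. The function name `stratified_split`, the seed, and the toy labels are assumptions for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.1, seed=42):
    """Return (train_indices, test_indices), holding out test_fraction of each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels (hypothetical): 3 classes with 20 images each.
labels = ["Airport"] * 20 + ["Beach"] * 20 + ["Forest"] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Applied to the full 10,000-image dataset, a 10% hold-out yields a test set of roughly 1,000 images.
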
## Performance

On the held-out test set, the CNN outperformed every classical machine learning method (SVM, Random Forest, etc.) evaluated on the same dataset, beating the strongest baseline, the RBF-kernel SVM, by 21.6 percentage points.

| Metric | Value |
| :--- | :--- |
| **Test Accuracy** | **92.80%** |
| **Macro Average** | 0.93 |
| **Weighted Average** | 0.93 |

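The table does not state which score the macro and weighted averages refer to. The front matter reports the 0.93 macro figure as F1, so the sketch below assumes F1 for both and shows how the two averages differ: the macro average treats every class equally, while the weighted average scales each class by its number of test samples. The helper names and the toy predictions are hypothetical and are not the model's real outputs.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} computed one-vs-rest for every label in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def summarize(y_true, y_pred):
    """Return (accuracy, macro F1, weighted F1)."""
    f1 = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro = sum(f1.values()) / len(f1)  # every class counts equally
    weighted = sum(f1[c] * support[c] for c in f1) / len(y_true)  # scaled by class size
    return accuracy, macro, weighted


# Toy predictions (hypothetical), not the model's real outputs.
y_true = ["Beach", "Beach", "Beach", "Beach", "Forest", "Forest"]
y_pred = ["Beach", "Beach", "Beach", "Forest", "Forest", "Forest"]
accuracy, macro_f1, weighted_f1 = summarize(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f} weighted_f1={weighted_f1:.3f}")
```

When the macro and weighted averages agree, as they do in the table, performance is not dominated by a few large classes.
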
### Comparison with Classical Methods

| Model | Test Accuracy |
| :--- | :--- |
| **CNN (This Model)** | **0.9280** |
| SVM (RBF Kernel) | 0.7120 |
| Softmax Regression | 0.6580 |
| Random Forest | 0.5680 |
| Naïve Bayes | 0.5280 |

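To put these accuracies in concrete terms, the short calculation below converts each one into a count of correctly classified test images and a gap to the CNN in percentage points. It assumes the test set is exactly 10% of AID's 10,000 images (1,000 images); the card does not state the exact test set size.

```python
# Accuracies copied from the comparison table.
accuracies = {
    "CNN": 0.9280,
    "SVM (RBF Kernel)": 0.7120,
    "Softmax Regression": 0.6580,
    "Random Forest": 0.5680,
    "Naïve Bayes": 0.5280,
}

# Assumption: the test set is exactly 10% of AID's 10,000 images.
TEST_SET_SIZE = 10_000 // 10

cnn_accuracy = accuracies["CNN"]
rows = []
for name, acc in accuracies.items():
    correct = round(acc * TEST_SET_SIZE)
    gap_points = round((cnn_accuracy - acc) * 100, 2)  # percentage points behind the CNN
    rows.append((name, correct, gap_points))
    print(f"{name:<20} correct={correct:>4}/{TEST_SET_SIZE}  gap={gap_points:5.2f} pp")
```

Under that assumption, the CNN classifies 928 of 1,000 test images correctly, 216 more than the strongest classical baseline.
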
## Limitations

- **Data Bias:** The model is trained on Google Earth imagery (AID), so it may not generalize well to aerial images captured with significantly different sensors, resolutions, or lighting conditions.
- **Scope:** Limited to the 30 classes defined in the AID dataset.

## How to Get Started

The provided `demo.ipynb` notebook contains a complete example. The snippets below show how to load each model.

### 1. Load Classic ML Model

```python
from huggingface_hub import hf_hub_download
import joblib

# Download model
model_path = hf_hub_download(
    repo_id="JavideuS/aid-image-classification",
    filename="classicML/models/bovw_softmax.pkl"
)

# Load pipeline
bundle = joblib.load(model_path)
pipeline = bundle['pipeline']
label_encoder = bundle['label_encoder']

# Predict
# pipeline.predict(["path/to/image.jpg"])
```

### 2. Load CNN Model

```python
import torch
from NeuralNets.model import PiattiCNN  # Ensure you have the model definition
from huggingface_hub import hf_hub_download

# Download checkpoints
checkpoints_path = hf_hub_download(
    repo_id="JavideuS/aid-image-classification",
    filename="neuralNet/models/PiattiVL_v0.69.pth"
)

# Load model
checkpoints = torch.load(checkpoints_path, map_location='cpu')
model = PiattiCNN(num_classes=checkpoints['num_classes'])
model.load_state_dict(checkpoints['model_state_dict'])
model.eval()

# Inference
# ...
```

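The inference step is left open in the snippet above. As one possible post-processing step, assuming the model returns one raw score (logit) per class, the pure-Python sketch below converts logits into softmax probabilities and returns the most likely classes. The function name `top_k`, the four class names, and the logit values are hypothetical; the real class order comes from the training pipeline.

```python
import math


def top_k(logits, class_names, k=3):
    """Convert raw logits to softmax probabilities and return the k most likely classes."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical logits for four of the 30 AID classes.
class_names = ["Airport", "Beach", "Forest", "Industrial"]
logits = [0.5, 3.0, 1.0, -1.0]
predictions = top_k(logits, class_names, k=2)
for name, prob in predictions:
    print(f"{name}: {prob:.3f}")
```

With a PyTorch model, the list of logits for a single image could be obtained from the output tensor with `.squeeze(0).tolist()`.
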
## Citation

If you use this model or the AID dataset, please cite the original dataset paper:

```bibtex
@article{aid_dataset,
  title={AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification},
  author={Xia, Gui-Song and Hu, Jingwen and Hu, Fan and Shi, Baoguang and Bai, Xiang and Zhong, Yanfei and Zhang, Liangpei and Lu, Xiaoqiang},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  volume={55},
  number={7},
  pages={3965--3981},
  year={2017}
}
```