---
language:
- en
tags:
- image-classification
- aerial-imagery
- robotics
- computer-vision
datasets:
- aid_dataset
metrics:
- accuracy
- f1
model-index:
- name: Aerial Image Classification CNN
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: AID (Aerial Image Dataset)
      type: aid
    metrics:
      - type: accuracy
        value: 0.9280
        name: Test Accuracy
      - type: f1
        value: 0.93
        name: Macro F1
---

# Model Card for Aerial Image Classification (CNN & Classic ML)

## Model Details

### Model Description
This repository contains two types of models for classifying aerial images from the **AID dataset**:
1.  **Convolutional Neural Network (CNN):** A lightweight ResNet-based model.
2.  **Classic Machine Learning:** A Bag of Visual Words (BoVW, also known as Bag of Features) pipeline using SIFT descriptors and Softmax Regression.

These models were developed as part of a machine learning assignment to evaluate deep learning approaches against classical computer vision methods.

- **Model types:** 
    - CNN (PyTorch)
    - BoVW features + classical classifier (scikit-learn, serialized with Joblib)
- **Language(s):** English
- **Resources:** 
    - CNN: ~1M parameters, ~16MB.
    - Classic ML: ~100MB (includes vocabulary).
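
To make the Bag of Visual Words idea concrete, the sketch below shows only the quantization step: each local descriptor is assigned to its nearest visual word, and the image becomes a normalized histogram of word counts. It is a toy illustration, not the repository's pipeline. The 2-D points stand in for 128-dimensional SIFT descriptors, and the three-word vocabulary is hypothetical (in practice a vocabulary is usually learned by clustering training descriptors).

```python
import math

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to one descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bovw_histogram(descriptors, vocabulary):
    """Count nearest-word assignments, then L1-normalize the counts."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        # An image with no detected keypoints gets an all-zero histogram.
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]

# Hypothetical toy data: 2-D points instead of 128-D SIFT descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (4.8, 5.1)]

histogram = bovw_histogram(descriptors, vocabulary)
print(histogram)
```

The resulting fixed-length histogram is what a classifier such as Softmax Regression would consume, regardless of how many keypoints the image produced.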

### Model Architecture
The CNN consists of an initial convolution layer followed by **three residual blocks**.
- **Residual Blocks:** Skip connections allow deeper feature extraction without the accuracy degradation seen in plain deep stacks.
- **Layers:** Convolution, Batch Normalization, Max-Pooling.
- **Input:** $600 \times 600$ pixel images (resized as needed by the pipeline).
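
As a rough illustration of how a 600 × 600 input shrinks through a stack like this, the sketch below applies the standard output-size formula for convolution and pooling layers. The kernel sizes, strides, padding, and the placement of the pooling layers are assumptions made for the example; the card does not specify them. It also shows why 3×3 convolutions with padding 1 are a common choice inside residual blocks: they preserve the spatial size, so the block's input can be added to its output.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_sizes(input_size, layers):
    """Apply each (name, kernel, stride, padding) layer and record the size."""
    sizes = [("input", input_size)]
    for name, kernel, stride, padding in layers:
        input_size = conv_output_size(input_size, kernel, stride, padding)
        sizes.append((name, input_size))
    return sizes

# Hypothetical layer settings, NOT taken from the repository.
LAYERS = [
    ("stem conv 3x3", 3, 1, 1),
    ("max-pool 2x2", 2, 2, 0),
    ("residual block 1 (3x3 convs)", 3, 1, 1),
    ("max-pool 2x2", 2, 2, 0),
    ("residual block 2 (3x3 convs)", 3, 1, 1),
    ("max-pool 2x2", 2, 2, 0),
    ("residual block 3 (3x3 convs)", 3, 1, 1),
    ("max-pool 2x2", 2, 2, 0),
]

sizes = trace_sizes(600, LAYERS)
for name, size in sizes:
    print(f"{name:32s} {size} x {size}")
```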

## Uses

### Direct Use
The model is intended for classifying high-resolution aerial scenes into one of 30 categories. It is suitable for:
- Autonomous UAV navigation and mapping.
- Environmental monitoring.
- Land use classification.

### Downstream Use
This model can be fine-tuned on other aerial or satellite imagery datasets.

## Training Data

The model was trained on the **Aerial Image Dataset (AID)**.
- **Source:** Google Earth imagery.
- **Size:** 10,000 images.
- **Classes:** 30 (e.g., Airport, Beach, Forest, Industrial, etc.).
- **Split:** 90% Training / 10% Test.
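
With 10,000 images, a 90/10 split gives 9,000 training and 1,000 test images. The card does not say whether the split was stratified by class; the sketch below assumes it was and shows one way to produce such a split with the standard library. The function name, seed, and toy labels are illustrative only.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.10, seed=0):
    """Return (train_indices, test_indices), sampling each class separately."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

# Toy labels: 3 classes with 100 images each (the real dataset has 30 classes).
labels = [name for name in ("Airport", "Beach", "Forest") for _ in range(100)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Sampling per class keeps every category represented in the test set in proportion to its size, which matters when per-class metrics are averaged.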

## Performance

The CNN reached 92.80% test accuracy, 21.6 percentage points above the best classical Machine Learning method evaluated on the same dataset (SVM with an RBF kernel, 71.20%).

| Metric | Value |
| :--- | :--- |
| **Test Accuracy** | **92.80%** |
| **Macro Average F1** | 0.93 |
| **Weighted Average** | 0.93 |
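
The macro and weighted averages differ only in how per-class scores are combined: the macro average treats every class equally, while the weighted average weights each class by its number of true samples. The sketch below assumes both averages are taken over per-class F1 (the front matter labels the macro figure as F1) and uses made-up labels, not the model's predictions.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} computed from two label sequences."""
    results = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        results[label] = (f1, tp + fn)  # support = number of true samples
    return results

def macro_and_weighted_f1(y_true, y_pred):
    """Combine per-class F1 scores with equal and support-based weights."""
    scores = per_class_f1(y_true, y_pred)
    macro = sum(f1 for f1, _ in scores.values()) / len(scores)
    total_support = sum(support for _, support in scores.values())
    weighted = sum(f1 * support for f1, support in scores.values()) / total_support
    return macro, weighted

# Illustrative labels only.
y_true = ["Beach"] * 4 + ["Forest"] * 2
y_pred = ["Beach", "Beach", "Beach", "Forest", "Forest", "Forest"]

macro_f1, weighted_f1 = macro_and_weighted_f1(y_true, y_pred)
print(f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

When classes are roughly balanced, the two averages end up close to each other.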

### Comparison with Classical Methods
| Model | Test Accuracy |
| :--- | :--- |
| **CNN (This Model)** | **0.9280** |
| SVM (RBF Kernel) | 0.7120 |
| Softmax Regression | 0.6580 |
| Random Forest | 0.5680 |
| Naïve Bayes | 0.5280 |

## Limitations

- **Data Bias:** The model is trained on Google Earth imagery (AID), so it may not generalize perfectly to aerial images with significantly different sensors, resolutions, or lighting conditions.
- **Scope:** Limited to the 30 classes defined in the AID dataset.

## How to Get Started

The provided `demo.ipynb` notebook contains a complete example. The snippets below show how to load each model.

### 1. Load Classic ML Model
```python
from huggingface_hub import hf_hub_download
import joblib

# Download model
model_path = hf_hub_download(
    repo_id="JavideuS/aid-image-classification",
    filename="classicML/models/bovw_softmax.pkl"
)

# Load pipeline
bundle = joblib.load(model_path)
pipeline = bundle['pipeline']
label_encoder = bundle['label_encoder']

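# Predict (left commented out: supply a real image path first)
# predictions = pipeline.predict(["path/to/image.jpg"])
# Assumes label_encoder is a scikit-learn LabelEncoder:
# print(label_encoder.inverse_transform(predictions))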
```

### 2. Load CNN Model
```python
import torch
from huggingface_hub import hf_hub_download
from NeuralNets.model import PiattiCNN  # the model definition must be importable

# Download the checkpoint
checkpoints_path = hf_hub_download(
    repo_id="JavideuS/aid-image-classification",
    filename="neuralNet/models/PiattiVL_v0.69.pth"
)

# Load model
checkpoints = torch.load(checkpoints_path, map_location='cpu')
model = PiattiCNN(num_classes=checkpoints['num_classes'])
model.load_state_dict(checkpoints['model_state_dict'])
model.eval()

# Inference
# ...
```
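
Once the model produces an output for an image, the raw scores still need to be turned into a readable prediction. The sketch below assumes the forward pass returns one logit per class, which the card does not state, and uses hand-written logits and a four-name class list purely for illustration. It applies a softmax and reports the most probable classes using only the standard library.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

def top_k(logits, class_names, k=3):
    """Return the k most probable (class_name, probability) pairs."""
    probabilities = softmax(logits)
    ranked = sorted(zip(class_names, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Illustrative values: 4 of the 30 AID class names and made-up logits.
class_names = ["Airport", "Beach", "Forest", "Industrial"]
logits = [0.5, 2.0, -1.0, 0.1]

best = top_k(logits, class_names, k=2)
for name, probability in best:
    print(f"{name}: {probability:.3f}")
```

With a real model, the logits would come from the network's output for a single image converted to a plain list, and `class_names` would need to follow the label order used during training.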

## Citation

If you use this model or the AID dataset, please cite the original dataset paper:

```bibtex
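@article{aid_dataset,
  title={AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification},
  author={Xia, Gui-Song and Hu, Jingwen and Hu, Fan and Shi, Baoguang and Bai, Xiang and Zhong, Yanfei and Zhang, Liangpei and Lu, Xiaoqiang},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  volume={55},
  number={7},
  pages={3965--3981},
  year={2017}
}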
```