---
language:
- en
tags:
- image-classification
- aerial-imagery
- robotics
- computer-vision
datasets:
- aid_dataset
metrics:
- accuracy
- f1
model-index:
- name: Aerial Image Classification CNN
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: AID (Aerial Image Dataset)
      type: aid
    metrics:
    - type: accuracy
      value: 0.9280
      name: Test Accuracy
    - type: f1
      value: 0.93
      name: Macro F1
---

# Model Card for Aerial Image Classification (CNN & Classic ML)

## Model Details

### Model Description

This repository contains two types of models for classifying aerial images from the **AID dataset**:

1. **Convolutional Neural Network (CNN):** A lightweight ResNet-based model.
2. **Classic Machine Learning:** A Bag of Features (BoF, also known as Bag of Visual Words, BoVW) pipeline using SIFT descriptors and Softmax Regression.

These models were developed as part of a machine learning assignment to evaluate deep learning approaches against classical computer vision methods.

- **Model types:**
  - CNN (PyTorch)
  - BoVW + ML algorithm (Scikit-learn/Joblib)
- **Language(s):** English
- **Resources:**
  - CNN: ~1M parameters, ~16MB.
  - Classic ML: ~100MB (includes vocabulary).

### Model Architecture

The CNN consists of an initial convolution layer followed by **three residual blocks**.

- **Residual Blocks:** Enable deeper feature extraction without degradation.
- **Layers:** Convolution, Batch Normalization, Max-Pooling.
- **Input:** $600 \times 600$ pixel images (resized as needed by the pipeline).

## Uses

### Direct Use

The model is intended for classifying high-resolution aerial scenes into one of 30 categories. It is suitable for:

- Autonomous UAV navigation and mapping.
- Environmental monitoring.
- Land use classification.

### Downstream Use

This model can be fine-tuned on other aerial or satellite imagery datasets.

## Training Data

The model was trained on the **Aerial Image Dataset (AID)**.

- **Source:** Google Earth imagery.
- **Size:** 10,000 images.
- **Classes:** 30 (e.g., Airport, Beach, Forest, Industrial).
- **Split:** 90% Training / 10% Test.

## Performance

The CNN outperformed every classical Machine Learning method (SVM, Random Forest, etc.) evaluated on the same dataset, beating the best of them (SVM with an RBF kernel) by 21.6 percentage points of test accuracy.

| Metric | Value |
| :--- | :--- |
| **Test Accuracy** | **92.80%** |
| **Macro Average (F1)** | 0.93 |
| **Weighted Average** | 0.93 |

### Comparison with Classical Methods

| Model | Test Accuracy |
| :--- | :--- |
| **CNN (This Model)** | **0.9280** |
| SVM (RBF Kernel) | 0.7120 |
| Softmax Regression | 0.6580 |
| Random Forest | 0.5680 |
| Naïve Bayes | 0.5280 |

## Limitations

- **Data Bias:** The model is trained on Google Earth imagery (AID), so it may not generalize well to aerial images captured with significantly different sensors, resolutions, or lighting conditions.
- **Scope:** Limited to the 30 classes defined in the AID dataset.

## How to Get Started

You can use the provided `demo.ipynb` notebook for a complete example. The snippets below load each of the two models.

### 1. Load Classic ML Model

```python
from huggingface_hub import hf_hub_download
import joblib

# Download model
model_path = hf_hub_download(
    repo_id="JavideuS/aid-image-classification",
    filename="classicML/models/bovw_softmax.pkl"
)

# Load pipeline
bundle = joblib.load(model_path)
pipeline = bundle['pipeline']
label_encoder = bundle['label_encoder']

# Predict
# pipeline.predict(["path/to/image.jpg"])
```

### 2. Load CNN Model

```python
import torch
from huggingface_hub import hf_hub_download

# Ensure you have the model definition
from NeuralNets.model import PiattiCNN

# Download checkpoints
checkpoints_path = hf_hub_download(
    repo_id="JavideuS/aid-image-classification",
    filename="neuralNet/models/PiattiVL_v0.69.pth"
)

# Load model
checkpoints = torch.load(checkpoints_path, map_location='cpu')
model = PiattiCNN(num_classes=checkpoints['num_classes'])
model.load_state_dict(checkpoints['model_state_dict'])
model.eval()

# Inference
# ...
```

## Citation

If you use this model or the AID dataset, please cite the original dataset paper:

```bibtex
@article{aid_dataset,
  title={AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification},
  author={Xia, Gui-Song and others},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2017}
}
```
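
## Appendix: Macro vs. Weighted Averages

The Performance section reports both a macro and a weighted average. The sketch below shows how these two averages differ, assuming both refer to per-class F1 scores (the card only confirms this for the macro value). The function name `f1_report` and the six toy labels are hypothetical and are not taken from the repository or its evaluation code.

```python
from collections import Counter


def f1_report(y_true, y_pred):
    """Return (accuracy, macro_f1, weighted_f1) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    per_class_f1 = {}
    for label in labels:
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / support[label] if support[label] else 0.0
        denom = precision + recall
        per_class_f1[label] = 2 * precision * recall / denom if denom else 0.0

    accuracy = sum(correct.values()) / len(y_true)
    # Macro: every class counts equally, regardless of how many samples it has.
    macro_f1 = sum(per_class_f1.values()) / len(labels)
    # Weighted: each class contributes in proportion to its number of true samples.
    weighted_f1 = sum(per_class_f1[c] * support[c] for c in labels) / len(y_true)
    return accuracy, macro_f1, weighted_f1


# Toy labels for illustration only (not AID results).
toy_true = ["Airport", "Airport", "Beach", "Beach", "Beach", "Forest"]
toy_pred = ["Airport", "Beach", "Beach", "Beach", "Beach", "Forest"]
acc, macro, weighted = f1_report(toy_true, toy_pred)
print(f"accuracy={acc:.4f} macro_f1={macro:.4f} weighted_f1={weighted:.4f}")
```

When every class has the same number of test samples the two averages are identical; they diverge as the class sizes become more uneven, as in the toy labels above.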