momererkoc/cloud_classifier

@@ -27,4 +27,189 @@ model-index:
         source:
           name: Kaggle
           url: https://www.kaggle.com/code/momerer/ensemble-learning-cloud-classifier-model-youthai/
----

+---
+# Ensemble Learning Cloud Classifier
+![YouthAI Initiative](https://youthaiinitiative.com/wp-content/uploads/2023/09/Adsiz-2100-x-2970-piksel-scaled-e1759947420412-1024x541.png)
+> **Note:** This project was developed as a capstone assignment for the **Youth AI Initiative**. It demonstrates the application of advanced Deep Learning techniques (Transfer Learning and Stacking Ensembles) to solve meteorological classification problems.
+## Overview
+This project implements an **Ensemble Learning** model that classifies cloud images into 7 meteorological categories. Using **Transfer Learning**, it combines three ImageNet-pretrained Convolutional Neural Networks (ResNet50, VGG16, and InceptionV3), each with its own classification head. Their class-probability outputs are then fed into a Meta-Learner (a small neural network) that makes the final prediction.
+In the test-set evaluation reported in the Results section, this "Stacked Generalization" approach outperformed the individual base models. The aim is to handle the visual complexity and ambiguity often found in cloud formations better than any single network can.
+## Objectives
+-   To classify cloud types from images with high accuracy.
+-   To mitigate the issue of limited training data using **Data Augmentation** and **Transfer Learning**.
+-   To address class imbalance using **Weighted Loss Functions**.
+-   To demonstrate the effectiveness of stacking several individually weaker learners into a stronger ensemble via a meta-learner.
+## Dataset
+The dataset consists of **960 images** divided into 7 classes. The data was split into Training (70%), Validation (15%), and Testing (15%) sets.
+**Classes:**
+1.  `cirriform clouds`
+2.  `clear sky`
+3.  `cumulonimbus clouds`
+4.  `cumulus clouds`
+5.  `high cumuliform clouds`
+6.  `stratiform clouds`
+7.  `stratocumulus clouds`
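Applied to 960 images, the 70/15/15 split works out to 672 training, 144 validation, and 144 test images, which agrees with the total support of 144 in the classification report. The sketch below shows one simple way to produce such a split. The shuffling strategy, the seed, and the helper name `split_indices` are assumptions for illustration; the project does not state how the split was actually made.

```python
import random


def split_indices(n_items, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle item indices and cut them into train/val/test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_items * train_frac)
    n_val = round(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder becomes the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(960)
print(len(train_idx), len(val_idx), len(test_idx))  # 672 144 144
```

With a dataset this small and imbalanced, splitting per class (stratification) would keep the rare classes represented in every subset; that refinement is left out here for brevity.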
+## Model Architecture
+The solution uses a **Stacking Ensemble** architecture:
+### Level 0: Base Learners
+Three pre-trained models (weights from ImageNet) were used as feature extractors. The top layers were removed and replaced with a custom classification head:
+1.  **ResNet50** (Input: 224x224)
+2.  **VGG16** (Input: 224x224)
+3.  **InceptionV3** (Input: 299x299)
+**Custom Head Structure:**
+-   `GlobalAveragePooling2D`
+-   `Dense(256, activation='relu')` with L2 Regularization (0.01)
+-   `Dropout(0.6)` (To prevent overfitting)
+-   `Dense(7, activation='softmax')`
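The head described above could be attached to a backbone roughly as follows. This is a minimal sketch, not the project's actual code: the helper name `build_base_learner`, the choice to freeze the backbone, and the `weights` argument are assumptions, and model-specific input preprocessing is omitted.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers


def build_base_learner(backbone_name="ResNet50", input_size=224,
                       num_classes=7, weights="imagenet"):
    """Backbone without its top layers, plus the custom classification head."""
    backbone_cls = getattr(tf.keras.applications, backbone_name)
    backbone = backbone_cls(include_top=False, weights=weights,
                            input_shape=(input_size, input_size, 3))
    backbone.trainable = False  # assumption: backbone used as a frozen feature extractor

    inputs = tf.keras.Input(shape=(input_size, input_size, 3))
    x = backbone(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(256, activation="relu",
                     kernel_regularizer=regularizers.l2(0.01))(x)
    x = layers.Dropout(0.6)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs, name=f"{backbone_name}_head")


# weights=None skips the ImageNet download; use "imagenet" for transfer learning.
resnet_learner = build_base_learner("ResNet50", 224, weights=None)
print(resnet_learner.output_shape)  # (None, 7)
```

The same helper would be called with `"VGG16"` at 224 and `"InceptionV3"` at 299 to cover the three base learners listed above.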
+### Level 1: Meta-Learner
+The predictions (probability vectors) from the three base models are concatenated to form a meta-input vector (size 21). This is fed into a dense neural network:
+-   **Input:** Concatenated Predictions
+-   **Hidden Layer:** Dense(16, relu) + Dropout(0.4)
+-   **Output:** Final Classification
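A minimal Keras sketch of a meta-learner with this shape is shown below. The 7-unit softmax output layer and the helper name `build_meta_learner` are assumptions (the description above only says "Final Classification"), and the random input merely stands in for real concatenated base-model predictions.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_meta_learner(num_models=3, num_classes=7):
    """Small dense network over the concatenated base-model probabilities."""
    inputs = tf.keras.Input(shape=(num_models * num_classes,))  # 3 * 7 = 21
    x = layers.Dense(16, activation="relu")(inputs)
    x = layers.Dropout(0.4)(x)
    # Assumed output layer: one softmax probability per cloud class.
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs, name="meta_learner")


meta_learner = build_meta_learner()

# Random stand-in for the meta-input of 5 images (not real predictions).
rng = np.random.default_rng(0)
meta_input = rng.random((5, 21)).astype("float32")
final_probs = np.asarray(meta_learner(meta_input, training=False))
print(final_probs.shape)  # (5, 7)
```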
+## Technical Implementation Details
+### Data Preprocessing
+To handle the small dataset size and prevent overfitting, aggressive **Data Augmentation** was applied during training:
+-   Rotation range: 40°
+-   Width/Height shift: 0.25
+-   Shear/Zoom: 0.25 / 0.3
+-   Horizontal & Vertical Flips
+-   Brightness adjustment: [0.7, 1.3]
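The settings above map directly onto the arguments of Keras's legacy `ImageDataGenerator`. Whether the project used this class is an assumption; the sketch only shows how the listed values would be expressed with it. Model-specific input preprocessing is not included.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation applied to training images only.
train_datagen = ImageDataGenerator(
    rotation_range=40,            # random rotations up to 40 degrees
    width_shift_range=0.25,       # horizontal shift, fraction of width
    height_shift_range=0.25,      # vertical shift, fraction of height
    shear_range=0.25,
    zoom_range=0.3,
    horizontal_flip=True,
    vertical_flip=True,
    brightness_range=[0.7, 1.3],
)

# Validation and test images are left unaugmented.
eval_datagen = ImageDataGenerator()
```

Keeping augmentation off for validation and test data means the reported metrics reflect unmodified images.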
+### Class Balancing
+Class weights were computed using `sklearn.utils.class_weight` so that the loss penalizes misclassifications of rare classes more heavily (e.g., _Cumulonimbus_, which had a weight of ~5.33).
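scikit-learn's `"balanced"` mode weights each class by `n_samples / (n_classes * class_count)`. Assuming that mode was used (the text does not say) and a training set of 672 images (70% of 960), a weight of about 5.33 would correspond to roughly 18 cumulonimbus training images. The sketch below reimplements the formula in plain Python; the helper name and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in counts.items()}


# Toy example: the rare class receives the larger weight.
toy_labels = ["common"] * 6 + ["rare"] * 2
toy_weights = balanced_class_weights(toy_labels)
print(toy_weights["rare"])  # 2.0

# Back-of-the-envelope check of the reported weight (assumed counts).
print(round(672 / (7 * 18), 2))  # 5.33
```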
+### Hyperparameters
+-   **Optimizer:** Adam (Learning Rate: 0.0001 for base, 0.001 for meta)
+-   **Loss Function:** Categorical Crossentropy
+-   **Batch Size:** 64
+-   **Epochs:** 75 (with Early Stopping and ReduceLROnPlateau)
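These settings could be wired into Keras along the following lines. The optimizer, loss, and callback classes are the ones named above; the `patience` and `factor` values, the monitored metric, and the helper names are placeholders, since the project does not list them.

```python
import tensorflow as tf


def compile_for_training(model, learning_rate):
    """Compile with Adam and categorical cross-entropy."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def make_callbacks():
    """Early stopping plus learning-rate reduction (placeholder settings)."""
    return [
        tf.keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=10, restore_best_weights=True),
        tf.keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss", factor=0.5, patience=4),
    ]


# Tiny stand-in model, only to show the calls; 1e-4 is the base-model rate.
demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(21,)),
    tf.keras.layers.Dense(7, activation="softmax"),
])
compile_for_training(demo_model, learning_rate=1e-4)
callbacks = make_callbacks()
```

A training call would then pass `epochs=75`, the callbacks, and the class weights to `model.fit`, with the batch size of 64 set on the data iterator.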
+## Results
+The Ensemble Meta-Model outperformed the individual base models on the test set.
+-   **Final Accuracy:** 86%
+-   **F1-Score (Weighted):** 0.85
+### Classification Report
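+Detailed performance metrics by class. Note that none of the 4 cumulonimbus test images were classified correctly (precision and recall of 0.00) despite the class weighting, which is why the macro average sits well below the weighted average: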
+```
+                        precision    recall  f1-score   support
+      cirriform clouds       0.87      0.95      0.91        21
+             clear sky       1.00      1.00      1.00        18
+   cumulonimbus clouds       0.00      0.00      0.00         4
+        cumulus clouds       0.81      0.94      0.87        32
+high cumuliform clouds       0.89      0.86      0.87        36
+     stratiform clouds       1.00      0.85      0.92        13
+  stratocumulus clouds       0.70      0.70      0.70        20
+              accuracy                           0.86       144
+             macro avg       0.75      0.76      0.75       144
+          weighted avg       0.84      0.86      0.85       144
+```
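The two summary rows can be re-derived from the per-class rows, which makes the gap between them easier to interpret. The snippet below recomputes the macro and weighted F1 from the rounded values in the report, so small rounding differences from the original computation are possible.

```python
# (precision, recall, f1, support) copied from the report above.
report = {
    "cirriform clouds":       (0.87, 0.95, 0.91, 21),
    "clear sky":              (1.00, 1.00, 1.00, 18),
    "cumulonimbus clouds":    (0.00, 0.00, 0.00, 4),
    "cumulus clouds":         (0.81, 0.94, 0.87, 32),
    "high cumuliform clouds": (0.89, 0.86, 0.87, 36),
    "stratiform clouds":      (1.00, 0.85, 0.92, 13),
    "stratocumulus clouds":   (0.70, 0.70, 0.70, 20),
}

total_support = sum(row[3] for row in report.values())

# Macro average: every class counts equally, so the 0.00 row weighs heavily.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

print(total_support)                                # 144
print(round(macro_f1, 2), round(weighted_f1, 2))    # 0.75 0.85
```

Because cumulonimbus has only 4 test images, its zero score barely moves the weighted F1 but lowers the macro F1 by roughly a tenth.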
+### Performance Visualizations
+#### Training vs Validation Accuracy
+![Train/Val Acc](https://www.kaggleusercontent.com/kf/280733528/eyJhbGciOiJkaXIiLCJlbmMiOiJBMTI4Q0JDLUhTMjU2In0..w10CR0iyNAplIYsVSY1hoQ.d1w4RF3N7X0cuybdfDu_F4MyMcaUKRqyNb6i6NXM_eh6KbU8Xfp1wcJEgBs9QkU5UDsyCU2Bm7dj3ap3rRB8eLHGcrdpza-gakim7P_Szcj9V2tiU8sWEW5niEltG4S9BPiDBqVBtKzunbYBhSua6j5-OibvthpoEggxQAszOdHgR2MBFb_8r0WXTgrn-g9bPQtRbUVOyS1dj_xdXdvq1U6TnIYtiavLBySamAv6fVipPcfMfe3MHmeg4RJRyceaPyM9o7d_6QAC4Ta0EBxcu0qXYgiBI7ve_0bJNskxB1oVxVkKoOqwaFEige9xS1ybl3jgjy8Tog7jGz7JiaDysYOMIpaJwgo3vWn_PHjtLkRac_d5l1zNXErl02eeA7TakIG8tffXHVzKLH6vIQcLkjLswCl6xq5tYWzaAV_XIMZgkIYdwyDuy_j5BCIzYbdOCMhWxpeY26WB_NJkGOEZ4gYuyywgwYij8mqU3yP6nWSgES7k2TUr_YRTSlcQG-pwHtjG4az-rBaVYYl8vrLGJIXcQahHKq5_tQIrGJOD8SWWBKPcKo7nlcEa1xA5FPbR8vZscd7all_-oINspprqbLcCjy151T8GHrJkLlpZpr1ZLFKmtgXKZPanGc9UTN6zVBu1RlgkJNcDgHTOvRAUbfodr8x71xKsvVbX0ndmOTM.8eAEKrO9BxZmtjpkUPwo0A/__results___files/__results___13_0.png)
+#### Confusion Matrix
+![Conf Matrix](https://www.kaggleusercontent.com/kf/280733528/eyJhbGciOiJkaXIiLCJlbmMiOiJBMTI4Q0JDLUhTMjU2In0..w10CR0iyNAplIYsVSY1hoQ.d1w4RF3N7X0cuybdfDu_F4MyMcaUKRqyNb6i6NXM_eh6KbU8Xfp1wcJEgBs9QkU5UDsyCU2Bm7dj3ap3rRB8eLHGcrdpza-gakim7P_Szcj9V2tiU8sWEW5niEltG4S9BPiDBqVBtKzunbYBhSua6j5-OibvthpoEggxQAszOdHgR2MBFb_8r0WXTgrn-g9bPQtRbUVOyS1dj_xdXdvq1U6TnIYtiavLBySamAv6fVipPcfMfe3MHmeg4RJRyceaPyM9o7d_6QAC4Ta0EBxcu0qXYgiBI7ve_0bJNskxB1oVxVkKoOqwaFEige9xS1ybl3jgjy8Tog7jGz7JiaDysYOMIpaJwgo3vWn_PHjtLkRac_d5l1zNXErl02eeA7TakIG8tffXHVzKLH6vIQcLkjLswCl6xq5tYWzaAV_XIMZgkIYdwyDuy_j5BCIzYbdOCMhWxpeY26WB_NJkGOEZ4gYuyywgwYij8mqU3yP6nWSgES7k2TUr_YRTSlcQG-pwHtjG4az-rBaVYYl8vrLGJIXcQahHKq5_tQIrGJOD8SWWBKPcKo7nlcEa1xA5FPbR8vZscd7all_-oINspprqbLcCjy151T8GHrJkLlpZpr1ZLFKmtgXKZPanGc9UTN6zVBu1RlgkJNcDgHTOvRAUbfodr8x71xKsvVbX0ndmOTM.8eAEKrO9BxZmtjpkUPwo0A/__results___files/__results___13_2.png)
+## Installation & Usage
+### Prerequisites
+```
+pip install tensorflow numpy pandas matplotlib seaborn scikit-learn pillow requests
+```
+### Training
+The training pipeline is automated:
+1.  Load and split data.
+2.  Calculate class weights.
+3.  Train ResNet50, VGG16, and InceptionV3 individually.
+4.  Generate validation predictions from all three models.
+5.  Train the Meta-Learner on these predictions.
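Steps 4 and 5 amount to turning each image into a 21-value vector of base-model probabilities and then fitting (or querying) the meta-learner on those vectors. The sketch below shows that data flow with duck-typed models. The function names and the `_ConstantModel` stand-in are hypothetical; in practice the objects would be the trained Keras models.

```python
import numpy as np


def stack_base_predictions(base_models, inputs_per_model):
    """Concatenate each base model's (n, 7) probabilities into (n, 21)."""
    probs = [model.predict(x) for model, x in zip(base_models, inputs_per_model)]
    return np.concatenate(probs, axis=1)


def predict_ensemble(base_models, inputs_per_model, meta_model):
    """Run the full stack and return the predicted class index per image."""
    meta_input = stack_base_predictions(base_models, inputs_per_model)
    return np.argmax(meta_model.predict(meta_input), axis=1)


class _ConstantModel:
    """Hypothetical stand-in with a Keras-like predict(); always picks one class."""

    def __init__(self, n_classes, chosen):
        self.n_classes, self.chosen = n_classes, chosen

    def predict(self, x):
        probs = np.zeros((len(x), self.n_classes))
        probs[:, self.chosen] = 1.0
        return probs


bases = [_ConstantModel(7, 2) for _ in range(3)]
meta = _ConstantModel(7, 2)
# Each base model receives the same 4 images at its own input resolution.
images = [np.zeros((4, 224, 224, 3)),
          np.zeros((4, 224, 224, 3)),
          np.zeros((4, 299, 299, 3))]

meta_features = stack_base_predictions(bases, images)
print(meta_features.shape)                    # (4, 21)
print(predict_ensemble(bases, images, meta))  # [2 2 2 2]
```

Training the meta-learner on validation-set predictions, rather than on training-set predictions, avoids feeding it base-model outputs for images those models have already fitted.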
+## Credits
+-   **Author:** Muhammed Ömer ERKOÇ
+-   **Organization:** Youth AI Initiative
+-   **Dataset Source:** [SkyVision Cloud Dataset](https://www.kaggle.com/datasets/zeesolver/cloiud-dataset)
+_This project is part of the educational curriculum at the Youth AI Initiative, fostering the next generation of AI specialists._