---
pipeline_tag: image-classification
library_name: keras
---

# Model Card: ResNet-50 Fine-Tuned for FER-2013 Facial Expression Recognition
## Model Description

This model is a **ResNet-50** deep convolutional neural network fine-tuned on the **FER-2013 (Facial Expression Recognition 2013)** dataset, which consists of low-resolution (48x48) grayscale images of faces categorized into seven core emotional states.

This project focused on maximizing the performance of the pre-trained ResNet-50 architecture on this particularly challenging, noisy, and imbalanced dataset.
### Architecture

* **Base Model:** ResNet-50 (pre-trained on ImageNet).
* **Head:** Custom dense layers (224 units) with a high 0.5 dropout rate.
* **Transfer Learning Strategy:** **Deep Freezing**. The model base was frozen up to the `conv5` block, meaning only the final convolutional block (`conv5`) and the custom head were fine-tuned. This prevents the early layers, which are optimized for high-resolution images, from being corrupted by the 48x48 input.
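The architecture above can be sketched in Keras as follows. This is a minimal illustration, not the exact training script: the `conv5` name prefix follows Keras's built-in ResNet-50 layer naming, the loss choice is an assumption, and `weights=None` is used here only to keep the sketch offline (`weights="imagenet"` would be used in practice, per the card).

```python
from tensorflow import keras
from tensorflow.keras import layers

# Base network; FER-2013 frames are 48x48 grayscale, replicated to
# 3 channels to match ResNet-50's expected input (assumption).
base = keras.applications.ResNet50(
    include_top=False, weights=None, input_shape=(48, 48, 3)
)

# Deep Freezing: everything before the final `conv5` block stays frozen.
for layer in base.layers:
    layer.trainable = layer.name.startswith("conv5")

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(224, activation="relu"),    # custom head: 224 units
    layers.Dropout(0.5),                     # high dropout for regularization
    layers.Dense(7, activation="softmax"),   # seven FER-2013 emotion classes
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=5e-6),  # very low fine-tuning LR
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```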
### Optimization & Regularization

| Technique | Description |
| :--- | :--- |
| **Class Weighting** | Applied inverse frequency weights to mitigate the severe class imbalance (e.g., Disgust is rare, Happy is abundant). |
| **Data Augmentation** | Used random flips, translations, rotations, and zooms to artificially expand the small dataset and combat overfitting. |
| **High Dropout** | Increased dropout to 0.5 to aggressively regularize the model and prevent the divergence seen in earlier training runs. |
| **Optimizer** | Adam with a very low fine-tuning learning rate of 5e-6. |
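As a concrete illustration of the class-weighting technique above, here is a small sketch using the standard balanced inverse-frequency formula `total / (n_classes * count)` (the exact formula used in training is an assumption). The resulting dict is the shape Keras expects for `Model.fit(class_weight=...)`.

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes=7):
    """Weight each class by total / (n_classes * count): rare classes
    (e.g. Disgust) get large weights, abundant ones (e.g. Happy) small."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    weights = counts.sum() / (n_classes * counts)
    return {i: float(w) for i, w in enumerate(weights)}

# Toy 3-class distribution with one rare class, mimicking FER-2013's imbalance.
labels = np.array([0] * 70 + [1] * 7 + [2] * 63)
class_weight = inverse_frequency_weights(labels, n_classes=3)
# class_weight can be passed directly to Keras: model.fit(..., class_weight=class_weight)
```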
## Evaluation Results

The final model achieved its **highest stability and best performance** after 50 epochs.

| Metric | Result |
| :--- | :--- |
| **Test Accuracy** | **45.70%** |
| **Test Loss** | 1.4929 |
| **Training Accuracy (End)** | 63.25% |
### Per-Class F1-Scores

The F1-Score highlights the model's difficulty with ambiguous negative emotions.

| Emotion | F1-Score | Support (Test Count) | Notes |
| :--- | :--- | :--- | :--- |
| **Neutral** | **0.6386** | 831 | Highest precision, well-distinguished class. |
| **Happy** | 0.6037 | 1774 | Strongest recall, the most abundant class. |
| **Disgust** | 0.4659 | 111 | Significantly improved performance on this rare class. |
| **Sad** | 0.3995 | 1233 | Ambiguous. |
| **Surprise** | 0.3531 | 1247 | Ambiguous. |
| **Fear** | 0.3374 | 1024 | Ambiguous. |
| **Angry** | **0.3312** | 958 | Lowest F1-score, indicating high confusion. |
## 💡 Usage and Limitations

### Inputs

* **Image Format:** Grayscale (48x48 pixels).
* **Normalization:** Pixel values must be scaled to [0, 1] (by dividing by 255.0).
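A minimal sketch of the preprocessing implied by the bullets above; the replication of the grayscale channel to 3 channels (to match ResNet-50's input) is an assumption, and `model` is a placeholder for the loaded Keras model.

```python
import numpy as np

def preprocess(face_uint8):
    """Turn one 48x48 grayscale face (uint8) into a model-ready batch:
    scale pixels to [0, 1], replicate to 3 channels, add a batch axis."""
    assert face_uint8.shape == (48, 48), "expects a 48x48 grayscale image"
    x = face_uint8.astype("float32") / 255.0     # [0, 1] normalization
    x = np.repeat(x[:, :, None], 3, axis=-1)     # grayscale -> 3 channels
    return x[None, ...]                          # shape (1, 48, 48, 3)

batch = preprocess(np.full((48, 48), 128, dtype=np.uint8))
# predictions = model.predict(batch)  # seven emotion probabilities
```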
### Recommended Libraries
### Limitations

1. **Low Accuracy:** The 45.70% accuracy is limited by the **low resolution** (48x48) and **noisy labels** of the FER-2013 dataset. It is not comparable to human performance on FER-2013 (roughly 65%-68%) or to models trained on high-quality, high-resolution "in-the-wild" datasets like AffectNet.
2. **Overfitting:** Despite aggressive regularization, the model remains highly overfit (a large training-to-test accuracy gap), which is characteristic of this dataset.