Create README.md
README.md
ADDED
|
@@ -0,0 +1,45 @@
| 1 |
+
---
|
| 2 |
+
language: en
|
| 3 |
+
license: mit
|
| 4 |
+
tags:
|
| 5 |
+
- audio-classification
|
| 6 |
+
- math-rock
|
| 7 |
+
- midwest-emo
|
| 8 |
+
- mbti
|
| 9 |
+
- tensorflow
|
| 10 |
+
- keras
|
| 11 |
+
metrics:
|
| 12 |
+
- accuracy
|
| 13 |
+
datasets:
|
| 14 |
+
- anggars/neural-mathrock
|
| 15 |
+
---
|
| 16 |
+
|
| 17 |
+
# Neural Mathrock Classifier (v1.0)
|
| 18 |
+
|
| 19 |
+
This model is a custom multi-output convolutional neural network (CNN) that analyzes the audio characteristics of tracks in the Math Rock and Midwest Emo genres.
|
| 20 |
+
|
| 21 |
+
## Model Outputs
|
| 22 |
+
The architecture has five dedicated output heads that produce simultaneous classifications for:
|
| 23 |
+
1. **MBTI**: Personality type association based on musical patterns.
|
| 24 |
+
2. **Emotion**: Emotional state detection (e.g., Fear, Sadness, Happiness).
|
| 25 |
+
3. **Audio Vibe**: General atmosphere and sonic texture.
|
| 26 |
+
4. **Intensity**: Aggression and energy levels.
|
| 27 |
+
5. **Tempo**: Rhythmic speed classification (Slow, Medium, Fast).
|
| 28 |
+
|
| 29 |
+
## Technical Specifications
|
| 30 |
+
- **Input Shape**: 128x128x3 (Mel-Spectrograms)
|
| 31 |
+
- **Framework**: TensorFlow 2.x / Keras 3
|
| 32 |
+
- **Architecture**: Sequential CNN with Batch Normalization, Global Average Pooling, and Dropout layers for regularization.
|
| 33 |
+
- **Optimization**: Adam Optimizer with Sparse Categorical Crossentropy loss for all heads.
|
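As a rough illustration of the specifications above, a five-head CNN can be wired up with the Keras functional API. This is a minimal sketch, not the released architecture: the head names, the filter counts, the dropout rate, and the class counts for Emotion, Audio Vibe, and Intensity are assumptions. Only the 128x128x3 input, the 16 MBTI classes, the three tempo classes, Adam, and sparse categorical crossentropy come from this card.

```python
# Hypothetical head names and class counts. Only "mbti" (16) and
# "tempo" (3: Slow, Medium, Fast) are stated in this card; the rest are assumed.
HEAD_CLASSES = {"mbti": 16, "emotion": 6, "vibe": 4, "intensity": 3, "tempo": 3}


def head_losses(head_classes):
    """Map every output head to sparse categorical crossentropy."""
    return {name: "sparse_categorical_crossentropy" for name in head_classes}


def build_model(head_classes=HEAD_CLASSES, input_shape=(128, 128, 3)):
    """Build and compile an illustrative five-head CNN (layer sizes are assumed)."""
    # Imported here so the helpers above work without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):  # assumed filter counts
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)  # assumed dropout rate

    # One softmax head per task, all sharing the convolutional trunk.
    outputs = {
        name: layers.Dense(n, activation="softmax", name=name)(x)
        for name, n in head_classes.items()
    }
    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss=head_losses(head_classes),
        metrics={name: ["accuracy"] for name in head_classes},
    )
    return model
```

The functional API is used here because `keras.Sequential` supports only a single output. Sparse categorical crossentropy expects integer class indices as targets, which fits the integer-to-string label mappings shipped with the model.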
| 34 |
+
|
| 35 |
+
## Accuracy Performance
|
| 36 |
+
Based on the final training logs (Epoch 20):
|
| 37 |
+
- **Intensity/Tempo**: ~75-77%
|
| 38 |
+
- **MBTI/Emotion**: ~55-63% (well above the 6.25% uniform random-guess baseline for 16-class MBTI).
|
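To put these figures in context, the chance level of a uniform random guess is 1/K for a K-class head. The short calculation below covers only the two class counts this card states (16 MBTI types, 3 tempo classes); the class counts of the other heads are not given here, so they are left out.

```python
def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of a uniform random guess over num_classes labels."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return 1.0 / num_classes


for head, k in {"MBTI": 16, "Tempo": 3}.items():
    print(f"{head}: {chance_accuracy(k):.2%} chance level")
# MBTI: 6.25% chance level
# Tempo: 33.33% chance level
```

If the labels are imbalanced, always predicting the most frequent class gives a higher baseline than the uniform one, so the majority-class frequency is worth checking as well.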
| 39 |
+
|
| 40 |
+
## Files
|
| 41 |
+
- `neural_mathrock_model.keras`: Trained weights and model architecture.
|
| 42 |
+
- `neural_mathrock_labels.pkl`: Pickle file containing label mappings for decoding predictions.
|
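A minimal loading sketch for the two files, assuming both have been downloaded to a local path. The structure of the unpickled object is not documented in this card, so the helper only loads it and leaves inspection to the reader. Only unpickle files from sources you trust, since `pickle` can execute arbitrary code on load.

```python
import pickle
from pathlib import Path


def load_labels(path):
    """Unpickle the label mappings. Only do this for files you trust."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def load_model(path):
    """Load the .keras file (architecture plus weights) with Keras."""
    # Imported lazily so load_labels works without TensorFlow installed.
    from tensorflow import keras

    return keras.models.load_model(path)
```

Calling `load_model("neural_mathrock_model.keras")` and `load_labels("neural_mathrock_labels.pkl")` returns the model and the mappings; printing `type(labels)` and calling `model.summary()` is a quick way to confirm the head names and mapping layout before relying on them.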
| 43 |
+
|
| 44 |
+
## Usage
|
| 45 |
+
Convert raw audio to Mel-spectrograms at a sample rate of 22050 Hz and resize them to 128x128, matching the 128x128x3 input shape. Then use the provided pickle file to map each head's integer output to its category string.
|
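The preprocessing and decoding steps described above might look like the following sketch. It assumes `librosa` for audio loading and Mel-spectrogram extraction, min-max scaling to [0, 1], even sampling of time frames to reach 128 columns, repeating the single channel three times to reach 128x128x3, and a label mapping shaped as `{head_name: [class names]}`. None of these details are confirmed by this card, so check them against the actual training pipeline and pickle contents.

```python
def preprocess(path, sr=22050, size=128):
    """Load audio and return a (1, size, size, 3) float32 batch."""
    # Heavy dependencies imported lazily; librosa is an assumed choice.
    import librosa
    import numpy as np

    y, sr = librosa.load(path, sr=sr)  # resamples to 22050 Hz, mono
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=size)
    mel_db = librosa.power_to_db(mel, ref=np.max)

    # Assumed normalization: min-max scale to [0, 1].
    span = mel_db.max() - mel_db.min()
    scaled = (mel_db - mel_db.min()) / span if span > 0 else np.zeros_like(mel_db)

    # Sample `size` evenly spaced time frames so the image is size x size.
    cols = np.linspace(0, scaled.shape[1] - 1, size).round().astype(int)
    square = scaled[:, cols]

    # Assumed channel handling: repeat the single channel three times.
    image = np.repeat(square[..., np.newaxis], 3, axis=-1)
    return image[np.newaxis].astype("float32")


def decode(predictions, labels):
    """Turn per-head probability vectors into class names via argmax."""
    decoded = {}
    for head, probs in predictions.items():
        best = max(range(len(probs)), key=lambda i: probs[i])
        decoded[head] = labels[head][best]
    return decoded


# Example with made-up probabilities and a made-up label mapping:
example = decode({"tempo": [0.1, 0.2, 0.7]}, {"tempo": ["Slow", "Medium", "Fast"]})
print(example)  # {'tempo': 'Fast'}
```

With the model loaded, `model.predict(preprocess("track.wav"))` (where `track.wav` is a placeholder path) returns one probability array per head; pass the first row of each array to `decode`. Whether the heads come back as a list or a dict depends on how the model was built, so inspect the output once before decoding.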