---
language: en
license: mit
tags:
- audio-classification
- math-rock
- midwest-emo
- mbti
- tensorflow
- keras
metrics:
- accuracy
datasets:
- anggars/neural-mathrock
---

# Neural Mathrock Classifier (v1.0)

This model is a custom multi-output convolutional neural network (CNN) that classifies audio characteristics of tracks in the Math Rock and Midwest Emo genres.

## Model Outputs
The architecture consists of five dedicated output heads, providing simultaneous classification for:
1. **MBTI**: Personality type association based on musical patterns.
2. **Emotion**: Emotional state detection (e.g., Fear, Sadness, Happiness).
3. **Audio Vibe**: General atmosphere and sonic texture.
4. **Intensity**: Aggression and energy levels.
5. **Tempo**: Rhythmic speed classification (Slow, Medium, Fast).

## Technical Specifications
- **Input Shape**: 128x128x3 (Mel-spectrograms)
- **Framework**: TensorFlow 2.x / Keras 3
- **Architecture**: Sequential CNN with Batch Normalization, Global Average Pooling, and Dropout layers for regularization.
- **Optimization**: Adam optimizer with sparse categorical cross-entropy loss on every head.
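
To make the loss setup above concrete, here is a minimal pure-Python sketch of sparse categorical cross-entropy applied to more than one head. The probabilities, the class indices, and the three-class `emotion` head are made up for illustration. Summing the heads with equal weight is also an assumption, since the card does not state any loss weights.

```python
import math


def sparse_categorical_crossentropy(probs, true_index):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probs[true_index])


# Hypothetical softmax outputs for a single clip (values invented for illustration).
# Tempo has three classes per the card; the emotion class count here is an assumption.
predictions = {
    "tempo": [0.1, 0.2, 0.7],
    "emotion": [0.15, 0.6, 0.25],
}

# "Sparse" means each target is an integer class index rather than a one-hot vector.
targets = {"tempo": 2, "emotion": 1}

per_head_loss = {
    head: sparse_categorical_crossentropy(predictions[head], targets[head])
    for head in predictions
}

# Assumption: heads contribute equally, so the total is a plain sum.
total_loss = sum(per_head_loss.values())

print(per_head_loss)
print(total_loss)
```

Because the targets are plain integers, the same label encoding used for training can be reused when decoding predictions.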

## Accuracy Performance
Based on the final training logs (Epoch 20):
- **Intensity/Tempo**: ~75-77%
- **MBTI/Emotion**: ~55-63%. For the 16-class MBTI head, uniform random guessing would score about 6.25% (1/16), so the model performs well above that chance baseline.
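
The comparison against random classification can be checked with a few lines of arithmetic. This sketch assumes balanced classes, which the card does not confirm. If the dataset is imbalanced, a majority-class baseline would be higher than the figures below. The accuracies used are the lower ends of the reported ranges.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


# Class counts taken from the card: 16 MBTI types and 3 tempo classes.
mbti_chance = chance_accuracy(16)
tempo_chance = chance_accuracy(3)

# Lower ends of the reported accuracy ranges, used as conservative figures.
mbti_lift = 0.55 / mbti_chance
tempo_lift = 0.75 / tempo_chance

print(f"MBTI chance:  {mbti_chance:.4f}, lift: {mbti_lift:.2f}x")
print(f"Tempo chance: {tempo_chance:.4f}, lift: {tempo_lift:.2f}x")
```

The lift is much larger for MBTI than for Tempo because the chance baseline shrinks as the number of classes grows.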

## Files
- `neural_mathrock_model.keras`: Trained weights and model architecture.
- `neural_mathrock_labels.pkl`: Pickle file containing label mappings for decoding predictions.

## Usage
Convert raw audio to Mel-spectrograms at a sample rate of 22050 Hz and resize them to 128x128 to match the model input. Then use the provided pickle file to map each head's predicted class index to its label string.
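
The decoding step can be sketched as follows. The structure of `neural_mathrock_labels.pkl` is not documented here, so the dictionary layout (head name mapped to an ordered list of class names), the class order, and the `decode` helper are all hypothetical. The example round-trips a stand-in mapping through `pickle` in memory instead of reading the real file.

```python
import pickle

# Hypothetical label mapping: head name -> class names in index order.
# The real pickle file may use a different structure or class order.
stand_in_labels = {
    "tempo": ["Slow", "Medium", "Fast"],
    "emotion": ["Fear", "Sadness", "Happiness"],
}

# In-memory round trip standing in for pickle.load() on the real file.
label_map = pickle.loads(pickle.dumps(stand_in_labels))


def decode(head_probs, labels):
    """Pick the highest-probability class per head and look up its label."""
    decoded = {}
    for head, probs in head_probs.items():
        best_index = max(range(len(probs)), key=probs.__getitem__)
        decoded[head] = labels[head][best_index]
    return decoded


# Invented per-head probabilities, shaped like softmax outputs for one clip.
head_probs = {
    "tempo": [0.1, 0.2, 0.7],
    "emotion": [0.15, 0.6, 0.25],
}

result = decode(head_probs, label_map)
print(result)  # {'tempo': 'Fast', 'emotion': 'Sadness'}
```

If the pickle file instead stores fitted encoder objects rather than plain lists, the lookup line would need to call that object's own inverse mapping.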