metadata
language: en
license: mit
tags:
  - audio-classification
  - math-rock
  - midwest-emo
  - mbti
  - tensorflow
  - keras
metrics:
  - accuracy
datasets:
  - anggars/neural-mathrock

Neural Mathrock Classifier (v1.0)

This model is a custom multi-output convolutional neural network (CNN) that analyzes the audio characteristics of tracks in the Math Rock and Midwest Emo genres.

Model Outputs

The architecture consists of five dedicated output heads, providing simultaneous classification for:

  1. MBTI: Personality type association based on musical patterns.
  2. Emotion: Emotional state detection (e.g., Fear, Sadness, Happiness).
  3. Audio Vibe: General atmosphere and sonic texture.
  4. Intensity: Aggression and energy levels.
  5. Tempo: Rhythmic speed classification (Slow, Medium, Fast).

Technical Specifications

  • Input Shape: 128x128x3 (Mel-Spectrograms)
  • Framework: TensorFlow 2.x / Keras 3
  • Architecture: Sequential CNN with Batch Normalization, Global Average Pooling, and Dropout layers for regularization.
  • Optimization: Adam optimizer, with Sparse Categorical Crossentropy loss on each of the five heads.
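
To make the loss setup concrete, here is a minimal pure-Python sketch of sparse categorical cross-entropy applied per head and summed. It assumes every head is weighted equally, which this card does not state. The head names, class counts, and probabilities are hypothetical values chosen for illustration.

```python
import math


def sparse_categorical_crossentropy(probs, true_index, eps=1e-7):
    """Loss for one sample: -log of the probability given to the true class.

    `probs` is a softmax output (a list of class probabilities) and
    `true_index` is the integer label, which is what "sparse" refers to.
    """
    # Clip to avoid log(0) when a probability is exactly 0 or 1.
    p = min(max(probs[true_index], eps), 1.0 - eps)
    return -math.log(p)


def total_loss(head_probs, head_targets):
    """Sum the per-head losses (assumption: equal weight for every head)."""
    return sum(
        sparse_categorical_crossentropy(head_probs[name], head_targets[name])
        for name in head_targets
    )


# Hypothetical softmax outputs for a single clip, two heads only.
example_probs = {
    "tempo": [0.2, 0.5, 0.3],   # Slow, Medium, Fast
    "intensity": [0.7, 0.3],    # class count assumed for illustration
}
example_targets = {"tempo": 1, "intensity": 0}

loss = total_loss(example_probs, example_targets)
print(f"combined loss: {loss:.4f}")
```

Because the labels are integers rather than one-hot vectors, the same integer indices can later be decoded back to strings with the label mappings.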

Accuracy Performance

Based on the final training logs (Epoch 20):

  • Intensity/Tempo: ~75-77%
  • MBTI/Emotion: ~55-63% (for the 16-class MBTI head, well above the 6.25% expected from uniform random guessing).
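
As a rough point of comparison, the chance level for a head with N equally likely classes is 1/N. The short sketch below works this out for the 16 MBTI types and the three tempo classes listed above. It assumes balanced classes; if the dataset is imbalanced, always predicting the most frequent class would give a higher baseline than this.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing over `num_classes`."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return 1.0 / num_classes


mbti_chance = chance_accuracy(16)   # 16 MBTI types
tempo_chance = chance_accuracy(3)   # Slow, Medium, Fast

print(f"MBTI chance level:  {mbti_chance:.2%}")
print(f"Tempo chance level: {tempo_chance:.2%}")
```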

Files

  • neural_mathrock_model.keras: Trained weights and model architecture.
  • neural_mathrock_labels.pkl: Pickle file containing label mappings for decoding predictions.
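
The layout of the pickle file is not documented here, so the following is only a sketch of how decoding might look. It assumes the file holds a dictionary mapping each head name to an ordered list of class strings. The head names, class order, and probabilities in the example are hypothetical. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.

```python
import pickle


def load_label_maps(path="neural_mathrock_labels.pkl"):
    """Load the label mappings (assumed structure: {head_name: [labels]})."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def decode_predictions(head_probs, label_maps):
    """Map each head's most probable class index to its label string."""
    decoded = {}
    for head, probs in head_probs.items():
        best_index = max(range(len(probs)), key=probs.__getitem__)
        decoded[head] = label_maps[head][best_index]
    return decoded


# Hypothetical stand-ins for the real pickle contents and model outputs.
hypothetical_label_maps = {
    "tempo": ["Slow", "Medium", "Fast"],
    "emotion": ["Fear", "Sadness", "Happiness"],
}
hypothetical_probs = {
    "tempo": [0.1, 0.2, 0.7],
    "emotion": [0.6, 0.3, 0.1],
}

decoded = decode_predictions(hypothetical_probs, hypothetical_label_maps)
print(decoded)
```

If the real file stores fitted encoders or index-to-label dictionaries instead of plain lists, the lookup inside `decode_predictions` would need to be adapted.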

Usage

Preprocessing converts raw audio, loaded at a sample rate of 22050 Hz, into Mel-spectrograms sized to 128x128 so that they match the model's 128x128x3 input. Use the provided pickle file to map each head's integer output to its label string.
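
The exact preprocessing code is not included in this card, so the sketch below only illustrates one way to bring a Mel-spectrogram to the 128x128x3 input shape with NumPy. Everything specific in it is an assumption: the spectrogram is taken to have 128 Mel bins already, the time axis is padded or cropped to 128 frames rather than interpolated, values are min-max scaled to [0, 1], and the single channel is copied three times. The original pipeline may have done any of these steps differently, for example by rendering spectrograms as RGB images.

```python
import numpy as np


def to_model_input(mel_db, size=128):
    """Turn a (128, n_frames) Mel-spectrogram into a (128, 128, 3) array.

    Assumed steps, not confirmed by the model card:
    pad or crop the time axis, min-max scale, repeat across 3 channels.
    """
    mel_db = np.asarray(mel_db, dtype=np.float32)
    n_mels, n_frames = mel_db.shape
    if n_mels != size:
        raise ValueError(f"expected {size} Mel bins, got {n_mels}")

    if n_frames < size:
        # Pad short clips on the right with the quietest value present.
        pad = size - n_frames
        mel_db = np.pad(mel_db, ((0, 0), (0, pad)),
                        mode="constant", constant_values=mel_db.min())
    else:
        # Keep only the first `size` frames of longer clips.
        mel_db = mel_db[:, :size]

    lo, hi = mel_db.min(), mel_db.max()
    if hi > lo:
        scaled = (mel_db - lo) / (hi - lo)
    else:
        scaled = np.zeros_like(mel_db)  # constant input, nothing to scale

    # Copy the single channel three times to get shape (size, size, 3).
    return np.repeat(scaled[..., np.newaxis], 3, axis=-1)


# Random data standing in for a real spectrogram of 50 frames.
fake_mel = np.random.default_rng(0).normal(size=(128, 50))
model_input = to_model_input(fake_mel)
print(model_input.shape)
```

In practice the input array would come from an audio library, for instance a 128-bin Mel-spectrogram computed from audio loaded at 22050 Hz, and a batch dimension would be added before calling the model.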