anggars/neural-mathrock


	---
	language: en
	license: mit
	tags:
	- audio-classification
	- math-rock
	- midwest-emo
	- mbti
	- tensorflow
	- keras
	metrics:
	- accuracy
	datasets:
	- anggars/neural-mathrock
	---

	# Neural Mathrock Classifier (v1.0)

	This model is a custom multi-output convolutional neural network (CNN) that analyzes audio characteristics of tracks in the Math Rock and Midwest Emo genres.

	## Model Outputs
	The architecture consists of five dedicated output heads, providing simultaneous classification for:
	1. MBTI: Personality type association based on musical patterns.
	2. Emotion: Emotional state detection (e.g., Fear, Sadness, Happiness).
	3. Audio Vibe: General atmosphere and sonic texture.
	4. Intensity: Aggression and energy levels.
	5. Tempo: Rhythmic speed classification (Slow, Medium, Fast).

	## Technical Specifications
	- Input Shape: 128x128x3 (Mel-Spectrograms)
	- Framework: TensorFlow 2.x / Keras 3
	- Architecture: Sequentially stacked CNN backbone with Batch Normalization, Global Average Pooling, and Dropout for regularization, feeding the five output heads.
	- Optimization: Adam optimizer with sparse categorical cross-entropy loss on every head.
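The specifications above can be sketched with the Keras functional API, which is one way to attach several output heads to a shared backbone. This is a minimal illustration, not the released architecture: the filter counts, dropout rate, head names, and the class counts for Emotion, Audio Vibe, and Intensity are placeholder assumptions. Only the 128x128x3 input, the 16 MBTI classes, and the 3 tempo classes come from this card.

```python
import keras
from keras import layers

# Class counts per head. "mbti" (16) and "tempo" (3) follow the model card;
# the other three counts are hypothetical placeholders.
HEAD_CLASSES = {"mbti": 16, "emotion": 6, "vibe": 4, "intensity": 3, "tempo": 3}


def build_model(head_classes: dict[str, int]) -> keras.Model:
    """Build a shared CNN backbone with one softmax head per task."""
    inputs = keras.Input(shape=(128, 128, 3))
    x = inputs
    # Assumed backbone: three Conv -> BatchNorm -> MaxPool blocks.
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)  # assumed rate

    # One Dense softmax layer per head, all reading the same pooled features.
    outputs = {
        name: layers.Dense(n, activation="softmax", name=name)(x)
        for name, n in head_classes.items()
    }
    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        # Integer labels on every head, so sparse categorical cross-entropy.
        loss={name: "sparse_categorical_crossentropy" for name in head_classes},
        metrics={name: ["accuracy"] for name in head_classes},
    )
    return model
```

Calling `build_model(HEAD_CLASSES)` returns a compiled model whose total loss is the sum of the five per-head losses.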

	## Accuracy Performance
	Accuracy at the end of training (epoch 20), taken from the final training logs:
	- Intensity/Tempo: ~75-77%
	- MBTI/Emotion: ~55-63%, well above the 6.25% expected from uniform random guessing across the 16 MBTI classes.
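To put these figures in context, the chance-level accuracy of a head with `k` equally likely classes is `1 / k`. The short sketch below computes it for the two heads whose class counts this card states (16 MBTI types, 3 tempo classes). It assumes balanced classes; if the dataset is imbalanced, always predicting the most common class would give a higher baseline.

```python
def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over num_classes labels."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return 1.0 / num_classes


print(f"MBTI (16 classes): {chance_accuracy(16):.2%}")   # 6.25%
print(f"Tempo (3 classes): {chance_accuracy(3):.2%}")    # 33.33%
```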

	## Files
	- `neural_mathrock_model.keras`: Trained weights and model architecture.
	- `neural_mathrock_labels.pkl`: Pickle file containing label mappings for decoding predictions.
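A minimal sketch of how the label file might be used to decode predictions. The internal layout of the pickle is not documented here, so the code assumes a dictionary that maps each head name to an ordered list of class names, and assumes predictions arrive as a dictionary of per-head probability arrays. The head name `"tempo"` and the helper names are hypothetical; check the actual contents of the file before relying on this.

```python
import pickle

import numpy as np


def load_labels(path="neural_mathrock_labels.pkl"):
    """Load the label mappings. Only unpickle files from a source you trust,
    since pickle can execute arbitrary code while loading."""
    with open(path, "rb") as f:
        return pickle.load(f)


def decode_predictions(predictions, labels):
    """Map per-head probabilities to label strings.

    Assumed layout: predictions[head] has shape (batch, num_classes) and
    labels[head] is a list of class names indexed by class id.
    """
    decoded = {}
    for head, probs in predictions.items():
        class_ids = np.argmax(probs, axis=-1)  # most probable class per sample
        decoded[head] = [labels[head][i] for i in class_ids]
    return decoded


# Toy data standing in for real model output and the real label file.
labels = {"tempo": ["Slow", "Medium", "Fast"]}
predictions = {"tempo": np.array([[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]])}
print(decode_predictions(predictions, labels))  # {'tempo': ['Fast', 'Slow']}
```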

	## Usage
	Preprocessing converts raw audio, loaded at a sample rate of 22050 Hz, into Mel-spectrograms resized to 128x128 to match the model input. Use the provided pickle file to map the predicted integer class indices to their categorical label strings.
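The preprocessing described above could look roughly like the sketch below, which uses `librosa` for the Mel-spectrogram and plain NumPy for resizing. Several details are assumptions not stated in this card: the decibel conversion, the min-max scaling to [0, 1], linear interpolation along the time axis, and replicating the single spectrogram channel three times to reach 128x128x3. The original pipeline may differ (for example, it may render spectrograms as RGB images), so match whatever was used in training.

```python
import numpy as np

SAMPLE_RATE = 22050  # from the model card
TARGET_SIZE = 128    # from the model card (128x128 input)


def audio_to_mel_db(path):
    """Load audio at 22050 Hz and return a 128-band Mel-spectrogram in dB."""
    import librosa  # imported lazily; only this function needs it

    y, sr = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=TARGET_SIZE)
    return librosa.power_to_db(mel, ref=np.max)


def mel_to_model_input(mel_db):
    """Resize a (128, n_frames) spectrogram to (128, 128, 3) float32 in [0, 1]."""
    n_mels, n_frames = mel_db.shape
    if n_mels != TARGET_SIZE:
        raise ValueError(f"expected {TARGET_SIZE} Mel bands, got {n_mels}")
    # Linearly interpolate each Mel band onto 128 evenly spaced time steps.
    src = np.linspace(0.0, 1.0, n_frames)
    dst = np.linspace(0.0, 1.0, TARGET_SIZE)
    resized = np.stack([np.interp(dst, src, row) for row in mel_db])
    # Assumed normalization: min-max scaling per spectrogram.
    lo, hi = resized.min(), resized.max()
    scaled = (resized - lo) / (hi - lo + 1e-8)
    # Assumed channel handling: repeat the single channel three times.
    return np.repeat(scaled[..., np.newaxis], 3, axis=-1).astype(np.float32)
```

A single track would then be turned into a batch of one with `mel_to_model_input(audio_to_mel_db("track.wav"))[np.newaxis]`, where `track.wav` is a placeholder path.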