anggars committed · Commit d6323d0 · verified · 1 Parent(s): ff796f6

Create README.md

Files changed (1)
  1. README.md +45 -0
README.md ADDED
@@ -0,0 +1,45 @@
+ ---
+ language: en
+ license: mit
+ tags:
+ - audio-classification
+ - math-rock
+ - midwest-emo
+ - mbti
+ - tensorflow
+ - keras
+ metrics:
+ - accuracy
+ datasets:
+ - anggars/neural-mathrock
+ ---
+
+ # Neural Mathrock Classifier (v1.0)
+
+ This model is a custom multi-output convolutional neural network (CNN) that analyzes audio characteristics of tracks in the Math Rock and Midwest Emo genres.
+
+ ## Model Outputs
+ The architecture consists of five dedicated output heads, providing simultaneous classification for:
+ 1. **MBTI**: Personality type association based on musical patterns.
+ 2. **Emotion**: Emotional state detection (e.g., Fear, Sadness, Happiness).
+ 3. **Audio Vibe**: General atmosphere and sonic texture.
+ 4. **Intensity**: Aggression and energy levels.
+ 5. **Tempo**: Rhythmic speed classification (Slow, Medium, Fast).
+
+ ## Technical Specifications
+ - **Input Shape**: 128x128x3 (Mel-spectrograms)
+ - **Framework**: TensorFlow 2.x / Keras 3
+ - **Architecture**: Sequential stack of convolutional layers with Batch Normalization, Global Average Pooling, and Dropout for regularization.
+ - **Optimization**: Adam optimizer with sparse categorical crossentropy loss on every head.
+
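As a rough illustration of the loss named above: sparse categorical crossentropy takes an integer class index as the target and returns the negative log of the probability the head assigned to that class. The sketch below computes it in plain Python for one sample and two heads. The probabilities and targets are made up, and summing the heads with equal weights is an assumption, since the README does not state any loss weights. It is not the training code.

```python
import math

def sparse_categorical_crossentropy(probs, true_index):
    """Loss for one sample: -log of the probability given to the true class."""
    eps = 1e-7  # clip to avoid log(0)
    p = min(max(probs[true_index], eps), 1.0)
    return -math.log(p)

# Made-up softmax outputs for two of the five heads (illustrative values only).
head_probs = {
    "tempo": [0.2, 0.7, 0.1],   # 3 classes: Slow, Medium, Fast
    "mbti": [1.0 / 16] * 16,    # 16 classes, uniform guess
}
# Integer targets, as sparse categorical crossentropy expects (no one-hot encoding).
targets = {"tempo": 1, "mbti": 5}

per_head_loss = {
    head: sparse_categorical_crossentropy(head_probs[head], targets[head])
    for head in head_probs
}
# Assumption: heads are weighted equally, so the total is a plain sum.
total_loss = sum(per_head_loss.values())
```

A head that guesses uniformly over 16 classes scores a loss of ln(16), about 2.77, which gives a reference point for reading per-head loss values in training logs.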
+ ## Accuracy Performance
+ Accuracy at epoch 20, taken from the final training logs:
+ - **Intensity/Tempo**: ~75-77%
+ - **MBTI/Emotion**: ~55-63% (for the 16-class MBTI head, well above the 6.25% expected from uniform random guessing).
+
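To put the figures above in context, the chance level for a head can be worked out from its class count. This small sketch assumes balanced classes and uniform guessing. If the dataset is imbalanced, always predicting the most common class can score higher than this. Only the MBTI (16) and Tempo (3) class counts come from this README.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes

mbti_baseline = chance_accuracy(16)   # 16 MBTI types
tempo_baseline = chance_accuracy(3)   # Slow, Medium, Fast

print(f"MBTI chance level:  {mbti_baseline:.2%}")   # 6.25%
print(f"Tempo chance level: {tempo_baseline:.2%}")  # 33.33%
```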
+ ## Files
+ - `neural_mathrock_model.keras`: Trained weights and model architecture.
+ - `neural_mathrock_labels.pkl`: Pickle file containing label mappings for decoding predictions.
+
+ ## Usage
+ To preprocess, load the raw audio at a sample rate of 22050 Hz, convert it to a Mel-spectrogram, and resize the result to 128x128 to match the model input. Use the provided pickle file to map each head's integer class index to its label string.
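The decoding step might look roughly like the sketch below. The structure of the label mapping (a dict from head name to a list of class names ordered by index) is hypothetical, as are the head names and the probabilities. The actual layout of `neural_mathrock_labels.pkl` is not documented here, so inspect the loaded object before relying on this shape. The emotion list uses only the three examples named in this README.

```python
import pickle

# Hypothetical label mapping: head name -> class names ordered by integer index.
# The real structure of neural_mathrock_labels.pkl may differ; inspect it first.
example_labels = {
    "tempo": ["Slow", "Medium", "Fast"],
    "emotion": ["Fear", "Sadness", "Happiness"],
}
# Round-trip through pickle in memory to stand in for reading the file from disk.
labels = pickle.loads(pickle.dumps(example_labels))

def decode_predictions(head_probs, labels):
    """Pick the highest-probability class per head and return its label string."""
    decoded = {}
    for head, probs in head_probs.items():
        best_index = max(range(len(probs)), key=probs.__getitem__)
        decoded[head] = labels[head][best_index]
    return decoded

# Made-up per-head probabilities for a single clip.
head_probs = {"tempo": [0.1, 0.3, 0.6], "emotion": [0.2, 0.7, 0.1]}
decoded = decode_predictions(head_probs, labels)
```

With the real files, `labels` would come from `pickle.load` on the opened `.pkl` file, and `head_probs` from the model's prediction for one preprocessed spectrogram.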