TiMauzi committed (verified)
Commit 6d693cb · 1 parent: 9ed5193

Update README.md

Files changed (1): README.md (+94 −11)

README.md CHANGED
@@ -2,38 +2,119 @@
  library_name: transformers
  tags:
  - generated_from_trainer
  datasets:
- - generator
  metrics:
  - accuracy
  - f1
  model-index:
- - name: EraClassifierBiLSTM
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # EraClassifierBiLSTM
-
- This model is a fine-tuned version of [](https://huggingface.co/) on the generator dataset.
- It achieves the following results on the evaluation set:
  - Loss: 1.0935
  - Accuracy: 0.5852
  - F1: 0.4299

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure
@@ -102,6 +183,8 @@ The following hyperparameters were used during training:
  | 0.8316 | 4.8476 | 94000 | 1.1055 | 0.5736 | 0.4174 |
  | 0.8264 | 4.9508 | 96000 | 1.1056 | 0.5736 | 0.4174 |

  ### Framework versions
 
  library_name: transformers
  tags:
  - generated_from_trainer
+ - midi
+ - music
+ - era-classification
+ - bilstm
+ - audio-analysis
  datasets:
+ - TiMauzi/imslp-midi-by-sa
  metrics:
  - accuracy
  - f1
  model-index:
+ - name: EraClassifierBiLSTM-4.76M
  results: []
  ---

+ # EraClassifierBiLSTM-4.76M

+ This model is a compact bidirectional LSTM neural network designed for musical era classification from MIDI data. It achieves the following results on the evaluation set:

  - Loss: 1.0935
  - Accuracy: 0.5852
  - F1: 0.4299

  ## Model description

+ The EraClassifierBiLSTM-4.76M is a custom bidirectional LSTM neural network specifically designed for classifying musical compositions into historical eras based on MIDI data analysis. This compact model variant (~4.76M parameters) offers a good balance between performance and computational efficiency.
+
+ ### Architecture
+ - **Model Type**: Custom Bidirectional LSTM (BiLSTM)
+ - **Input**: Sequences of 8-dimensional feature vectors extracted from MIDI messages
+ - **Window Size**: 24 MIDI messages per sequence with stride=20 (overlapping windows)
+ - **Hidden Layers**: 2 bidirectional LSTM layers with 384 hidden units each
+ - **Output**: 6-class classification (musical eras)
+ - **Activation**: LeakyReLU with dropout for regularization
+ - **Loss Function**: CrossEntropyLoss
+
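The architecture bullets above can be sketched in PyTorch. This is a hypothetical reconstruction, not the released code: the dropout rate and the choice to classify from the final time step are assumptions. With these dimensions, though, the parameter count comes out to roughly 4.76M, consistent with the model name.

```python
import torch
import torch.nn as nn

class EraClassifierBiLSTM(nn.Module):
    """Hypothetical reconstruction of the architecture described above."""

    def __init__(self, input_dim=8, hidden_dim=384, num_layers=2,
                 num_classes=6, dropout=0.2):
        super().__init__()
        # 2 bidirectional LSTM layers with 384 hidden units each
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers=num_layers,
                            batch_first=True, bidirectional=True, dropout=dropout)
        # LeakyReLU + dropout head mapping to the 6 era classes
        self.head = nn.Sequential(
            nn.LeakyReLU(),
            nn.Dropout(dropout),
            nn.Linear(2 * hidden_dim, num_classes),  # 2x for bidirectionality
        )

    def forward(self, x):              # x: (batch, 24, 8)
        out, _ = self.lstm(x)          # (batch, 24, 768)
        return self.head(out[:, -1])   # classify from the final time step

model = EraClassifierBiLSTM()
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 4760070, i.e. ~4.76M
```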
+ ### Feature Engineering
+ The model processes 8 key MIDI features per message, automatically selected as the most frequent features across the dataset:
+
+ **Numerical Features (7):**
+ - **channel**: MIDI channel number (μ=2.01, σ=2.74)
+ - **control**: Control change values (μ=11.90, σ=17.02)
+ - **note**: Note pitch (MIDI note number) (μ=64.17, σ=12.00)
+ - **tempo**: Tempo in microseconds per beat (μ=738221.63, σ=460369.34)
+ - **time**: Timing information in ticks (μ=714.28, σ=1337451.38)
+ - **value**: Generic value field (μ=83.91, σ=26.72)
+ - **velocity**: Note velocity/intensity (μ=42.80, σ=44.24)
+
+ **Categorical Features (1):**
+ - **type**: MIDI message type (mapped to numerical IDs)
+
+ All numerical features are normalized using dataset statistics (mean and standard deviation), while categorical features are encoded using learned ID mappings.
+
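A minimal sketch of this feature encoding, assuming each message is already parsed into a dict (e.g. via mido); the field order and the type-ID mapping here are illustrative assumptions, not the released code:

```python
# Hypothetical sketch of the per-message encoding described above.
NUM_FIELDS = ["channel", "control", "note", "tempo", "time", "value", "velocity"]
STATS = {  # (mean, std) from the feature table above
    "channel": (2.01, 2.74), "control": (11.90, 17.02), "note": (64.17, 12.00),
    "tempo": (738221.63, 460369.34), "time": (714.28, 1337451.38),
    "value": (83.91, 26.72), "velocity": (42.80, 44.24),
}
TYPE_IDS = {}  # message type -> numerical ID, learned from the data

def encode(msg):
    """Map one MIDI message (as a dict) to an 8-dimensional feature vector."""
    vec = []
    for field in NUM_FIELDS:
        mu, sigma = STATS[field]
        raw = msg.get(field, mu)        # missing field -> dataset mean (z = 0)
        vec.append((raw - mu) / sigma)  # z-score normalization
    vec.append(TYPE_IDS.setdefault(msg.get("type"), len(TYPE_IDS)))
    return vec

vec = encode({"type": "note_on", "note": 76.17, "velocity": 87.04})
print(len(vec))  # 8
```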
+ ### Training Approach
+ The model uses a sliding window approach to capture temporal patterns in musical structure that are characteristic of different historical periods. Each MIDI file is processed into multiple overlapping sequences, allowing the model to learn both local and global musical patterns.

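The sliding-window segmentation can be sketched as follows, using the window size of 24 and stride of 20 given in the architecture details:

```python
# Minimal sketch of the sliding-window segmentation: windows of 24 messages
# with stride 20, so consecutive windows overlap by 4 messages.
WINDOW, STRIDE = 24, 20

def make_windows(features):
    """Split one file's per-message feature vectors into overlapping windows."""
    return [features[i:i + WINDOW]
            for i in range(0, len(features) - WINDOW + 1, STRIDE)]

chunks = make_windows([[0.0] * 8 for _ in range(100)])
print(len(chunks))  # 4 windows, starting at messages 0, 20, 40 and 60
```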
  ## Intended uses & limitations

+ ### Intended Uses
+ - **Musicological Research**: Analyzing historical trends in musical composition
+ - **Educational Tools**: Teaching music history through automated era identification
+ - **Digital Music Libraries**: Automatic categorization and organization of MIDI collections
+ - **Music Analysis**: Understanding stylistic characteristics across different periods
+ - **Content Recommendation**: Suggesting music from similar historical periods
+
+ ### Limitations
+ - **Performance Variability**: The model shows significant performance differences across eras:
+   - Strong performance on Romantic (82.6%) and Baroque (66.6%) eras
+   - Moderate performance on Renaissance (45.4%) and Modern (37.0%) eras
+   - Poor performance on Classical (12.5%) and Other (14.2%) categories
+ - **Era Confusion**: Adjacent historical periods are frequently confused:
+   - Renaissance music often misclassified as Baroque (36.7%)
+   - Classical music heavily confused with Baroque (37.7%) and Romantic (34.1%)
+   - Modern music often misclassified as Romantic (35.9%)
+ - **Data Dependencies**: Performance depends on the quality and representativeness of the training data
+ - **MIDI-Only**: Limited to MIDI format; cannot process audio recordings or sheet music
+ - **Cultural Bias**: Training data may reflect Western classical music traditions
+
+ ### Recommendations for Use
+ - Validate results with musicological expertise, especially for Classical period identification
+ - Use confidence thresholds to filter low-confidence predictions

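The confidence-threshold recommendation above could look like the following sketch; the label order and the 0.6 threshold are illustrative assumptions, not values from the released code:

```python
import torch

ERAS = ["Renaissance", "Baroque", "Classical", "Romantic", "Modern", "Other"]

def predict_with_threshold(logits, threshold=0.6):
    """Return the era label, or None when the top softmax probability is low."""
    probs = torch.softmax(logits, dim=-1)
    conf, idx = probs.max(dim=-1)
    return ERAS[idx] if conf >= threshold else None

print(predict_with_threshold(torch.tensor([0.1, 5.0, 0.1, 0.2, 0.1, 0.1])))
# a confident prediction is returned; a near-flat distribution yields None
```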
  ## Training and evaluation data

+ ### Dataset
+ - **Source**: TiMauzi/imslp-midi-by-sa (International Music Score Library Project)
+ - **Format**: MIDI files with associated metadata including composition year and era
+ - **Preprocessing**: MIDI messages converted to 8-dimensional feature vectors
+ - **Window Strategy**: 24-message windows with 20-message stride for overlapping sequences
+
+ ### Musical Eras Covered
+ 1. **Renaissance** (1400–1600): Early polyphonic music, madrigals, motets
+ 2. **Baroque** (1600–1750): Ornamented music, basso continuo, fugues
+ 3. **Classical** (1750–1820): Clear forms, balanced phrases, sonata form
+ 4. **Romantic** (1820–1900): Expressive, emotional, expanded forms
+ 5. **Modern** (1900–present): Atonal, experimental, diverse styles
+ 6. **Other**: Miscellaneous or unclear period classifications
+
+ ### Data Distribution
+ The model was trained on 6,992 MIDI files from the IMSLP dataset with the following era distribution:
+ - **Romantic**: 2,722 samples (38.9%), median year 1854
+ - **Baroque**: 1,874 samples (26.8%), median year 1710
+ - **Renaissance**: 843 samples (12.1%), median year 1611
+ - **Modern**: 763 samples (10.9%), median year 2020
+ - **Classical**: 597 samples (8.5%), median year 1779
+ - **Other**: 193 samples (2.8%), median year 1909 (includes Early 20th century and Medieval)
+
+ Era thresholding was applied (minimum 150 samples per era), with rare eras such as "Early 20th century" (125 samples) and "Medieval" (5 samples) mapped to the "Other" category to maintain classification stability.
+
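The thresholding step can be sketched as follows (a minimal illustration of the described rule, not the released preprocessing code):

```python
# Hypothetical sketch of the era-thresholding step described above.
MIN_SAMPLES = 150  # eras below this count are folded into "Other"

def threshold_eras(counts):
    """Return a label mapping that folds rare eras into 'Other'."""
    return {era: era if n >= MIN_SAMPLES else "Other"
            for era, n in counts.items()}

mapping = threshold_eras({"Romantic": 2722, "Early 20th century": 125, "Medieval": 5})
print(mapping["Medieval"])  # Other
```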
+ ### Evaluation Strategy
+ - **Validation**: Performance measured on a held-out validation set
+ - **Test Set**: Final evaluation on completely unseen test data
+ - **Metrics**: Accuracy, F1-score (macro-averaged), and confusion matrix analysis
+ - **Training Duration**: 5 epochs (~96,000 training steps), with fallback to the best result (early stopping) based on F1 score

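The "fallback to best result" selection can be sketched as follows (assumed logic; only the step-44,000 and step-96,000 F1 values appear in this card, the intermediate value is purely illustrative):

```python
# Sketch of best-checkpoint selection on F1, as described above.
class BestCheckpoint:
    """Track the evaluation step with the highest F1 score."""

    def __init__(self):
        self.best_f1 = float("-inf")
        self.best_step = None

    def update(self, step, f1):
        if f1 > self.best_f1:
            self.best_f1, self.best_step = f1, step
            # a real trainer would also save the model weights here

tracker = BestCheckpoint()
for step, f1 in [(42000, 0.4251), (44000, 0.4299), (96000, 0.4174)]:
    tracker.update(step, f1)
print(tracker.best_step)  # 44000
```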
  ## Training procedure

 
  | 0.8316 | 4.8476 | 94000 | 1.1055 | 0.5736 | 0.4174 |
  | 0.8264 | 4.9508 | 96000 | 1.1056 | 0.5736 | 0.4174 |

+ ### Training Analysis
+ The training shows stable convergence, with the model reaching its best performance around step 44,000 (epoch 2.27). The training loss decreases steadily while validation metrics stabilize, indicating good generalization without severe overfitting. The model achieves its peak F1 score of 0.4299 at step 44,000, which was selected as the best checkpoint.

  ### Framework versions