---
license: mit
library_name: pytorch
tags:
- music
- folk-music
- irish-traditional-music
- abc-notation
- symbolic-music
- representation-learning
- self-supervised
- transformer
pipeline_tag: feature-extraction
---

# ABC2Vec: Self-Supervised Representation Learning for Irish Folk Music

This is the official pre-trained ABC2Vec model from the paper:

**"ABC2Vec: Self-Supervised Representation Learning for Irish Folk Music"**

## Model Description

ABC2Vec is a self-supervised Transformer encoder that learns dense, semantically meaningful embeddings from ABC notation, a symbolic music format. It is designed specifically for Irish traditional folk music and trained on 211,524 tunes.

### Key Features

- 🎵 **Purpose-built for folk music** - Addresses transposition equivalence, modal tonality, and variant detection
- 🔄 **Transposition Invariance** - Novel TI objective for pitch-invariant representations
- 📊 **Bar-level Patchification** - 16× sequence-length reduction for efficiency
- 🎯 **Self-supervised** - No text annotations or audio required
- ⚡ **Efficient** - Trained in ~18 hours on an Apple M4 Mac
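
The bar-level patchification idea — grouping the character stream into per-bar patches so the encoder attends over at most 64 bars rather than the full character sequence — can be illustrated with a minimal sketch. This is not the repository's `BarPatchifier` (which also handles tune headers, repeats, padding, and attention masks), just the core idea:

```python
# Illustrative bar-level patchification: split an ABC tune body on barlines
# into fixed-budget character patches. The real BarPatchifier also handles
# tune headers, repeat signs, padding, and attention masks; this does not.
def split_bars(abc_body: str, max_bars: int = 64, max_bar_length: int = 64):
    # Crudely drop repeat-mark colons, then split on the "|" barline symbol.
    bars = [b for b in abc_body.replace(":", "").split("|") if b.strip()]
    # Truncate to the model's bar and per-bar character budgets.
    return [bar[:max_bar_length] for bar in bars[:max_bars]]

tune = "|:A2A ABc|ded cBA|A2A ABc|ded cAG|"
print(split_bars(tune))  # ['A2A ABc', 'ded cBA', 'A2A ABc', 'ded cAG']
```

Each patch is then tokenized and pooled into one bar vector, which is what cuts the encoder's effective sequence length by roughly 16×.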

## Model Architecture

- **Layers:** 6
- **Hidden Size (d_model):** 256
- **Attention Heads:** 8
- **FFN Size (d_ff):** 1024
- **Embedding Size:** 128
- **Vocabulary Size:** 98
- **Max Bars:** 64
- **Max Bar Length:** 64
- **Parameters:** ~5M
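
The ~5M figure can be sanity-checked against the hyperparameters above. The sketch below assumes a standard Transformer encoder layout (Q/K/V/output projections plus a two-matrix FFN, with biases, layer norms, and positional parameters ignored); the exact layer composition is an assumption, not taken from the paper:

```python
# Rough parameter estimate from the hyperparameters listed above.
# Assumes a standard Transformer encoder; exact composition is an assumption.
d_model, d_ff, n_layers = 256, 1024, 6
vocab_size, embed_size = 98, 128

attn = 4 * d_model * d_model        # Q, K, V, and output projections
ffn = 2 * d_model * d_ff            # up- and down-projection matrices
per_layer = attn + ffn              # 786,432

encoder = n_layers * per_layer      # 4,718,592
token_embed = vocab_size * d_model  # 25,088
output_proj = d_model * embed_size  # 32,768 (to the 128-d embedding)

total = encoder + token_embed + output_proj
print(f"{total / 1e6:.1f}M parameters")  # 4.8M
```

The estimate lands near 4.8M, consistent with the reported ~5M once biases and layer norms are added back.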

## Training Details

- **Dataset:** 211,524 Irish traditional tunes (IrishMAN corpus)
- **Training Objectives:**
  - Masked Music Modeling (MMM)
  - Transposition Invariance (TI) contrastive learning
- **Training Steps:** 40,000 steps (40 epochs)
- **Final Validation Loss:** 2.36
- **Hardware:** Apple M4 Mac (48GB unified memory)
- **Training Time:** ~18 hours
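
The TI objective pairs each tune with a transposed copy of itself, pulling the two embeddings together while pushing other tunes in the batch away. A minimal InfoNCE-style sketch in plain Python — the actual loss form, temperature, and batching used by ABC2Vec are assumptions here:

```python
import math

# InfoNCE-style sketch of the Transposition Invariance (TI) objective:
# a tune and its transposed copy form the positive pair; other tunes in
# the batch act as negatives. Loss form and temperature are assumptions.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def info_nce(anchor, positive, negatives, temperature=0.1):
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[0] / sum(exps))    # positive pair sits at index 0

# Toy embeddings: the transposed variant lies close to its anchor.
anchor = [1.0, 0.0, 0.0]
transposed = [0.9, 0.1, 0.0]                     # positive
others = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]      # negatives

loss = info_nce(anchor, transposed, others)
print(f"TI loss: {loss:.4f}")  # small, since the positive pair dominates
```

Minimizing this loss makes the embedding of a tune approximately invariant to the key it happens to be written in.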

## Performance

| Task | Accuracy | Notes |
|------|----------|-------|
| Tune Type Classification | 78.4% ± 1.2% | 6 classes (jig, reel, polka, etc.) |
| Mode Classification | 78.8% ± 1.6% | 4 classes (major, minor, dorian, mixolydian) |
| Key Root (Linear Probe) | 62.3% ± 0.9% | 8 most common keys |
| Tune Length (Linear Probe) | 89.5% ± 0.7% | 3 classes (short, medium, long) |

## Usage

```python
import json

import torch

# Load the model configuration
with open("model_config.json") as f:
    config_dict = json.load(f)

# Initialize the model (requires the ABC2Vec code from the GitHub repository)
from abc2vec.core.model import ABC2VecModel
from abc2vec.core.model.encoder import ABC2VecConfig

config = ABC2VecConfig(**config_dict)
model = ABC2VecModel(config)

# Load the pre-trained weights
checkpoint = torch.load("best_model.pt", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Load the vocabulary and build the bar-level patchifier
from abc2vec.core.tokenizer import ABCVocabulary, BarPatchifier

vocab = ABCVocabulary.load("vocab.json")
patchifier = BarPatchifier(
    vocab=vocab,
    max_bars=config.max_bars,
    max_bar_length=config.max_bar_length,
)

# Tokenize an example ABC tune into bar patches
abc_tune = "M:6/8\nK:D\n|:A2A ABc|ded cBA|A2A ABc|ded cAG|"
patches = patchifier.patchify(abc_tune)

# Extract the tune embedding
with torch.no_grad():
    bar_indices = patches["bar_indices"].unsqueeze(0)
    char_mask = patches["char_mask"].unsqueeze(0)
    bar_mask = patches["bar_mask"].unsqueeze(0)

    embedding = model.get_embedding(bar_indices, char_mask, bar_mask)
    # embedding shape: (1, 128)
```
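
Once tunes are embedded, the variant detection mentioned above reduces to nearest-neighbour search in the 128-dimensional embedding space. A plain-Python sketch with toy vectors — cosine similarity is an assumed metric, so check the paper or repository for the distance actually used:

```python
import math

# Variant-detection sketch: rank candidate tunes by cosine similarity to a
# query embedding. Toy 3-d vectors stand in for the model's 128-d output,
# and the metric choice is an assumption, not taken from the paper.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

query = [0.9, 0.1, 0.0]
candidates = {
    "close variant": [0.85, 0.15, 0.05],
    "unrelated tune": [0.0, 0.2, 0.98],
}
best = max(candidates, key=lambda name: cosine(query, candidates[name]))
print(best)  # "close variant" ranks highest
```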

## Code Repository

Full training code, evaluation scripts, and usage examples:
- **GitHub:** https://github.com/pianistprogrammer/ABC2VEC

## Dataset

The processed dataset with train/validation/test splits:
- **HuggingFace:** https://huggingface.co/datasets/pianistprogrammer/abc2vec-irish-folk-dataset

## Citation

If you use this model, please cite:

```bibtex
@article{abc2vec2025,
  title={ABC2Vec: Self-Supervised Representation Learning for Irish Folk Music},
  author={[Your Name]},
  journal={[Journal Name]},
  year={2025},
  note={Model: https://huggingface.co/pianistprogrammer/abc2vec-model}
}
```

## License

MIT License

## Acknowledgements

We thank The Session community for curating and maintaining the Irish traditional music archive that made this work possible.

## Model Card Authors

[Your Name]

## Contact

For questions or issues:
- GitHub: https://github.com/pianistprogrammer/ABC2VEC
- HuggingFace: https://huggingface.co/pianistprogrammer