---
license: mit
pipeline_tag: audio-to-audio
---

## MusicMaker - Transformer Model for Music Generation


#### Overview:


MusicMaker is a transformer-based model trained to generate novel musical compositions in MIDI format. By learning from a dataset of piano MIDI files, the model captures the patterns and structures present in music and generates coherent, creative melodies.


#### Key Features:


- Generates novel musical compositions in MIDI format
- Trained on a dataset of piano MIDI files
- Built on a transformer architecture to capture long-range dependencies
- Uses a tokenizer trained specifically on MIDI data with the miditok library


#### Training Data:


The model was trained on a dataset of ~11,000 piano MIDI files from the "adl-piano-midi" collection.


#### Model Details:


- Architecture: GPT-style transformer
- Number of layers: 12
- Hidden size: 512
- Attention heads: 8
- Tokenizer vocabulary size: 12,000

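As a rough sanity check, the hyperparameters above imply a model in the tens of millions of parameters. The sketch below estimates the count for a standard GPT-style block; the 4x MLP expansion, tied input/output embeddings, and ignored biases/LayerNorm are assumptions, not details confirmed by this model card.

```python
# Rough parameter-count estimate from the listed hyperparameters.
# Assumes a standard GPT block (4 attention weight matrices, 4x-expanded
# MLP), biases/LayerNorm ignored, embeddings tied with the output head.
# These architectural details are assumptions, not facts from the card.
n_layers = 12
d_model = 512
vocab_size = 12_000

embedding = vocab_size * d_model          # token embedding table
attention = 4 * d_model * d_model         # Q, K, V and output projections
mlp = 2 * d_model * (4 * d_model)         # up- and down-projection
per_layer = attention + mlp
total = embedding + n_layers * per_layer

print(f"~{total / 1e6:.1f}M parameters")  # roughly 44M under these assumptions
```

Under these assumptions the model comes out to roughly 44M parameters, with the embedding table alone accounting for about 6M of them.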
#### Usage:


```py
from transformers import AutoModel
from miditok import MusicTokenizer
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the tokenizer and the model (custom model code, hence trust_remote_code)
tokenizer = MusicTokenizer.from_pretrained('shikhr/music_maker')

model = AutoModel.from_pretrained('shikhr/music_maker', trust_remote_code=True)
model.to(device)

# Generate some music, seeding the sequence with a single start token (id 1)
out = model.generate(
    torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=1.0, top_k=None
)

# Decode the generated token ids and save the result as a MIDI file
tokenizer(out[0].tolist()).dump_midi("generated.mid")
```
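The `temperature` and `top_k` arguments control how each next token is drawn from the model's output distribution: temperature rescales the logits before the softmax, and top-k restricts sampling to the k highest-scoring tokens. A minimal, self-contained sketch of that sampling step in plain Python — an illustration of the general technique, not the model's actual `generate` implementation:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Draw one token id from raw logits (hypothetical helper for illustration)."""
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    scaled = [l / temperature for l in logits]
    # Optionally keep only the k highest-scoring tokens.
    if top_k is not None:
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [l if l >= cutoff else float("-inf") for l in scaled]
    # Softmax over the (possibly truncated) logits.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample one token id according to the probabilities.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With `top_k=1` this degenerates to greedy decoding (always the argmax token); `top_k=None` with `temperature=1.0`, as in the usage example above, samples from the full unmodified distribution.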

#### Limitations and Bias:


- The model has only been trained on piano MIDI data, so it may generalize poorly to other instruments.
- Generated pieces may exhibit repetitive or unnatural patterns.
- The training data itself may carry biases or stylistic patterns reflective of its sources.