| # YourMT3+ Enhanced Music Transcription | |
| This is an enhanced version of YourMT3+ that adds **instrument conditioning** to keep the transcription from switching instruments mid-track. | |
| ## Features | |
| - **Instrument Conditioning**: Choose your target instrument to maintain consistency throughout transcription | |
| - **Multi-track Support**: Transcribe multiple instruments from polyphonic audio | |
| - **Format Options**: Output as MIDI, MusicXML, ABC notation, or audio | |
| - **Free CPU Inference**: Optimized to run on the Hugging Face Spaces free tier (CPU-only, 16GB RAM) | |
| ## How to Use | |
| 1. **Upload Your Audio**: Drag and drop or select an audio file | |
| 2. **Select Target Instrument**: Choose from the dropdown (vocals, piano, guitar, drums, etc.) | |
| 3. **Choose Output Format**: MIDI, MusicXML, ABC, or audio | |
| 4. **Transcribe**: Click the transcribe button and wait for results | |
| ## Instrument Conditioning System | |
| This enhanced version addresses a common issue where YourMT3+ assigns notes from a single source to different instruments over the course of a track (e.g., vocals → violin → guitar). The system uses: | |
| - **Task Tokens**: Special conditioning tokens, used when the model provides them | |
| - **Post-processing Filtering**: Consistent instrument filtering based on MIDI program numbers | |
| - **Debug Output**: Console logs showing instrument detection and filtering results | |
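
The post-processing filter described above can be sketched roughly as follows. This is a minimal illustration, not the code used in this Space: the `Note` class, the `GM_FAMILIES` table, and the `filter_notes` function are hypothetical names, and the sketch assumes the transcribed notes carry 0-indexed General MIDI program numbers plus a drum flag. Vocals are left out because General MIDI defines no singing-voice program, so the program number used for them depends on the model's vocabulary.

```python
from dataclasses import dataclass, replace

# 0-indexed General MIDI program ranges for a few instrument families.
# (Hypothetical lookup table for this sketch; extend as needed.)
GM_FAMILIES = {
    "piano": range(0, 8),
    "guitar": range(24, 32),
    "bass": range(32, 40),
    "violin": range(40, 41),
    "trumpet": range(56, 57),
    "saxophone": range(64, 68),
}


@dataclass(frozen=True)
class Note:
    """A transcribed note event (hypothetical structure for illustration)."""
    onset: float      # start time in seconds
    offset: float     # end time in seconds
    pitch: int        # MIDI pitch number
    program: int      # 0-indexed General MIDI program number
    is_drum: bool = False


def filter_notes(notes, target, remap_to_single_program=True):
    """Keep only the notes that belong to the target instrument family.

    When remap_to_single_program is True, every kept note is relabeled with
    the first program of the family so the output uses one instrument.
    """
    if target == "drums":
        # Drums are identified by the drum flag, not by program number.
        return [n for n in notes if n.is_drum]

    family = GM_FAMILIES[target]
    kept = [n for n in notes if not n.is_drum and n.program in family]
    if remap_to_single_program:
        canonical = family.start
        kept = [replace(n, program=canonical) for n in kept]
    return kept
```

For example, calling `filter_notes(notes, "piano")` on a list that mixes programs 0, 4, and 40 would drop the program-40 (violin) notes and relabel the remaining ones as program 0. Dropping out-of-family notes is one possible design choice; relabeling every note to the target instrument instead would preserve more notes at the cost of keeping any that genuinely came from other instruments.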
| ## Supported Instruments | |
| - Vocals/Singing | |
| - Piano | |
| - Guitar (Electric/Acoustic) | |
| - Bass | |
| - Drums | |
| - Violin | |
| - Trumpet | |
| - Saxophone | |
| - And many more... | |
| ## Technical Details | |
| - **Model**: YourMT3+ (Multi-channel T5 decoder with Perceiver-TF encoder) | |
| - **Framework**: PyTorch Lightning + Gradio | |
| - **Inference**: CPU-only for free tier compatibility | |
| - **Memory**: Optimized for 16GB RAM constraint | |
| ## Credits | |
| Based on the original YourMT3+ by Sungkyun Chang et al., which builds on the MT3 model, and enhanced here with instrument conditioning capabilities. | |