	# YourMT3+ Enhanced Music Transcription

	This is an enhanced version of YourMT3+ that adds instrument conditioning to address the model's tendency to switch instruments mid-track.

	## Features

	- Instrument Conditioning: Choose your target instrument to maintain consistency throughout transcription
	- Multi-track Support: Transcribe multiple instruments from polyphonic audio
	- Format Options: Output as MIDI, MusicXML, ABC notation, or audio
	- Free CPU Inference: Optimized to run on the Hugging Face Spaces free tier (CPU-only, 16 GB RAM)

	## How to Use

	1. Upload Your Audio: Drag and drop or select an audio file
	2. Select Target Instrument: Choose from the dropdown (vocals, piano, guitar, drums, etc.)
	3. Choose Output Format: MIDI, MusicXML, ABC, or audio
	4. Transcribe: Click the transcribe button and wait for results

	## Instrument Conditioning System

	This enhanced version addresses the common issue where YourMT3+ switches instruments mid-track (e.g., vocals → violin → guitar). The system uses:

	- Task Tokens: Special conditioning tokens, used when the model provides them
	- Post-processing Filtering: Notes are filtered by MIDI program number so the output stays on the selected instrument
	- Debug Output: Console logs showing instrument detection and filtering results
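The post-processing step above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the `Note` record and the `filter_by_program` function are hypothetical names, and the example assumes 0-based General MIDI program numbers, where guitars occupy programs 24 to 31 and violin is program 40.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """Hypothetical note record; the real app's note type may differ."""
    onset: float   # seconds
    offset: float  # seconds
    pitch: int     # MIDI pitch, 0-127
    program: int   # 0-based General MIDI program number


def filter_by_program(notes, allowed_programs, verbose=True):
    """Keep only the notes whose program number is in allowed_programs."""
    allowed = set(allowed_programs)
    kept = [n for n in notes if n.program in allowed]
    if verbose:
        # Debug output: which programs were detected and how many notes survive.
        detected = Counter(n.program for n in notes)
        print(f"detected programs: {dict(sorted(detected.items()))}")
        print(f"kept {len(kept)} of {len(notes)} notes")
    return kept


# Assumed example: the transcription drifts from guitar to violin mid-track.
notes = [
    Note(0.0, 0.5, 64, 25),  # steel-string acoustic guitar
    Note(0.5, 1.0, 67, 40),  # violin (unwanted)
    Note(1.0, 1.5, 69, 27),  # clean electric guitar
]
guitar_only = filter_by_program(notes, range(24, 32))
```

Note that this sketch drops off-target notes rather than reassigning them to the target instrument. Dropping keeps the output clean but can lose notes that were correctly pitched and only mislabeled; whether the app drops or remaps is not stated here.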

	## Supported Instruments

	- Vocals/Singing
	- Piano
	- Guitar (Electric/Acoustic)
	- Bass
	- Drums
	- Violin
	- Trumpet
	- Saxophone
	- And many more...
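To filter by MIDI program number, each dropdown choice has to be mapped to a set of programs. The table below is one possible mapping using 0-based General MIDI program numbers; the `GM_PROGRAMS` dictionary and `programs_for` helper are hypothetical names, not the app's API. Vocals and drums are left out on purpose: General MIDI has no dedicated singing program and places drums on a channel rather than a program, and how this app encodes them is not stated here.

```python
# 0-based General MIDI program ranges for some of the instruments listed above.
GM_PROGRAMS = {
    "piano": range(0, 8),       # acoustic and electric pianos, harpsichord, clavinet
    "guitar": range(24, 32),    # acoustic and electric guitars
    "bass": range(32, 40),      # acoustic, electric, and synth basses
    "violin": range(40, 41),
    "trumpet": range(56, 57),
    "saxophone": range(64, 68),  # soprano, alto, tenor, baritone
}


def programs_for(instrument):
    """Return the program range for a dropdown label, ignoring case and spaces."""
    key = instrument.strip().lower()
    if key not in GM_PROGRAMS:
        raise ValueError(f"no program mapping for instrument: {instrument!r}")
    return GM_PROGRAMS[key]


guitar_programs = programs_for("Guitar")
```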

	## Technical Details

	- Model: YourMT3+ (Multi-channel T5 decoder with Perceiver-TF encoder)
	- Framework: PyTorch Lightning + Gradio
	- Inference: CPU-only for free tier compatibility
	- Memory: Optimized to fit within the free tier's 16 GB RAM limit

	## Credits

	Based on the original YourMT3+ by Sungkyun Chang et al., which builds on Google's MT3, and enhanced here with instrument conditioning capabilities.