On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks
Input
Audio file (.wav file)
input.wav is taken from /Test/003 - Actions - One Minute Smile/mixture.wav in the DSD100 dataset (which can be downloaded from http://liutkus.net/DSD100.zip).
To reduce computation cost, input.wav is a clipped excerpt of the original file.
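How input.wav was actually clipped is not documented here. As a minimal sketch, assuming the source is an uncompressed PCM .wav file, an excerpt could be cut with Python's standard wave module. The function name clip_wav and the start and duration values are hypothetical, chosen only for illustration.

```python
import wave


def clip_wav(src_path, dst_path, start_sec, duration_sec):
    """Copy duration_sec seconds of audio, starting at start_sec, into dst_path.

    Returns the number of frames written. Hypothetical helper, not part of
    deep_music_enhancer.py.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Clamp the start position so setpos() never points past the end.
        start = min(int(start_sec * rate), src.getnframes())
        src.setpos(start)
        frames = src.readframes(int(duration_sec * rate))

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and rate; the frame count in the
        # header is patched automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return len(frames) // (params.sampwidth * params.nchannels)
```

For example, clip_wav("mixture.wav", "input.wav", 30.0, 10.0) would keep ten seconds starting at the 30 second mark; both values are placeholders, not the ones used for the bundled sample.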
Output
Bandwidth-extended audio file (.wav file)
Usage
The onnx and prototxt files are downloaded automatically on the first run, so an Internet connection is required at that point.
For the sample wav,
$ python3 deep_music_enhancer.py
Supported model types are [resnet, resnet_bn, resnet_da, resnet_do, unet, unet_bn, unet_da, unet_do].
bn means batch normalization, do means dropout, and da means data augmentation.
The model type can be specified as follows.
$ python3 deep_music_enhancer.py --model [MODEL TYPE]
You can specify an input audio file with the --input option.
$ python3 deep_music_enhancer.py --input [INPUT WAV FILE]
To save the output audio under a specific name, add the --savepath option.
$ python3 deep_music_enhancer.py --savepath [OUTPUT NAME]
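To compare the supported model types on one file, the command lines can be generated in a loop. The sketch below only builds the commands. It assumes that --model, --input and --savepath can be combined in a single call, and the output naming scheme (output_<model>.wav) is a hypothetical choice.

```python
import shlex

# Model types listed in this README.
MODEL_TYPES = [
    "resnet", "resnet_bn", "resnet_da", "resnet_do",
    "unet", "unet_bn", "unet_da", "unet_do",
]


def build_command(model_type, input_wav, output_wav):
    """Return the argument list for one run of deep_music_enhancer.py."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unsupported model type: {model_type}")
    return [
        "python3", "deep_music_enhancer.py",
        "--model", model_type,
        "--input", input_wav,
        "--savepath", output_wav,
    ]


def build_all_commands(input_wav, output_prefix="output"):
    """Build one command per supported model type (hypothetical naming)."""
    return [
        build_command(m, input_wav, f"{output_prefix}_{m}.wav")
        for m in MODEL_TYPES
    ]


if __name__ == "__main__":
    for cmd in build_all_commands("input.wav"):
        # Print only; pass cmd to subprocess.run(cmd, check=True) to execute.
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch free of side effects; each list can be handed to subprocess.run unchanged once the script and models are in place.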
Additionally, you can use the --vis option to visualize the spectrograms of the input and output audio.
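How --vis computes and renders its plots is not described here. As a rough sketch of what a spectrogram is, the following computes a log-magnitude short-time Fourier transform with NumPy. The function name and the frame_len and hop defaults are assumptions, not the script's settings.

```python
import numpy as np


def magnitude_spectrogram_db(samples, frame_len=1024, hop=256):
    """Return a log-magnitude spectrogram with shape (freq_bins, frames).

    Hypothetical helper: frame_len and hop are illustrative defaults.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.size < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hanning(frame_len)
    n_frames = 1 + (samples.size - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        samples[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT per frame, then convert magnitude to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + 1e-10).T
```

Row k of the result corresponds to the frequency k * sample_rate / frame_len. In a bandwidth-limited input the rows above the cutoff stay near the floor, and a successful extension fills them in.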
Spectrogram of output audio (butter filter)
Spectrogram of output audio (cheby1 filter)
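The two captions above presumably refer to the low-pass filter applied to the audio before enhancement: a Butterworth filter (butter) and a Chebyshev type I filter (cheby1). This README does not state the cutoff, order or ripple, so the sketch below uses placeholder values with scipy.signal to show how band-limited test input could be produced with either filter. The function name lowpass is hypothetical.

```python
import numpy as np
from scipy import signal


def lowpass(audio, sample_rate, filter_type="butter",
            cutoff_hz=4000.0, order=6, ripple_db=1.0):
    """Low-pass filter a 1-D signal with a Butterworth or Chebyshev I filter.

    cutoff_hz, order and ripple_db are placeholder values, not the settings
    used for the models described in this README.
    """
    if filter_type == "butter":
        sos = signal.butter(order, cutoff_hz, btype="low",
                            output="sos", fs=sample_rate)
    elif filter_type == "cheby1":
        sos = signal.cheby1(order, ripple_db, cutoff_hz, btype="low",
                            output="sos", fs=sample_rate)
    else:
        raise ValueError(f"unsupported filter type: {filter_type}")
    # Forward-backward filtering gives zero phase distortion.
    return signal.sosfiltfilt(sos, np.asarray(audio, dtype=np.float64))
```

The two designs differ mainly in the transition band: the Butterworth response is flat in the passband with a gentler roll-off, while the Chebyshev type I response trades passband ripple for a steeper roll-off.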
Reference
Framework
PyTorch
Model Format
ONNX opset=11
