license: apache-2.0
pipeline_tag: audio-to-audio

NovaSR: Pushing the Limits of Extreme Efficiency in Audio Super-Resolution

This is the model for NovaSR, a tiny audio upsampling model (about 50 kB) that upscales muffled 16 kHz audio into clear, crisp 48 kHz audio at 100-3500x realtime.

Details

  • Model Size: 52 kB for the PyTorch version
  • Input Rate: 16 kHz
  • Output Rate: 48 kHz
  • Inference Speed: 300-3500x realtime, depending on the GPU
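
To make these figures concrete, here is a small arithmetic sketch. It only uses the rates and realtime factors listed above; the function names are hypothetical and are not part of the NovaSR code.

```python
INPUT_RATE_HZ = 16_000   # input rate from the Details list
OUTPUT_RATE_HZ = 48_000  # output rate from the Details list


def output_samples(input_samples: int) -> int:
    """Number of samples after upsampling from 16 kHz to 48 kHz."""
    # 48 kHz / 16 kHz = 3, so every input sample becomes three output samples.
    return input_samples * OUTPUT_RATE_HZ // INPUT_RATE_HZ


def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock time needed to process a clip at a given realtime factor."""
    return audio_seconds / realtime_factor


three_second_clip = 3 * INPUT_RATE_HZ            # 48,000 input samples
print(output_samples(three_second_clip))          # 144000
print(processing_seconds(3600, 300))              # 12.0 s for one hour at 300x
print(round(processing_seconds(3600, 3500), 2))   # 1.03 s for one hour at 3500x
```

In other words, at the quoted speeds one hour of audio would take somewhere between about one and twelve seconds to upsample.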

Comparisons

Comparisons were run on an A100 GPU. A higher realtime factor means faster processing. CPU comparisons are coming soon.

| Model | Speed (realtime factor) | Model Size |
|---|---|---|
| NovaSR | 3600x realtime | ~52 kB |
| FlowHigh | 20x realtime | ~450 MB |
| FlashSR | 14x realtime | ~1000 MB |
| AudioSR | 0.6x realtime | ~2000 MB |
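
A realtime factor like the ones in the table above is the duration of the audio divided by the wall-clock time spent processing it. The following is a minimal sketch of how such a number could be measured. It is not the benchmark used for this comparison, and `naive_upsample_3x` is a hypothetical stand-in that simply repeats samples; it is not NovaSR.

```python
import time
from typing import Callable, List


def naive_upsample_3x(samples: List[float]) -> List[float]:
    """Stand-in upsampler (NOT NovaSR): repeats each sample three times."""
    return [s for s in samples for _ in range(3)]


def realtime_factor(upsample: Callable[[List[float]], List[float]],
                    samples: List[float],
                    input_rate_hz: int = 16_000) -> float:
    """Seconds of input audio processed per second of wall-clock time."""
    audio_seconds = len(samples) / input_rate_hz
    start = time.perf_counter()
    upsample(samples)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return audio_seconds / elapsed


clip = [0.0] * (3 * 16_000)  # three seconds of silence at 16 kHz
print(f"{realtime_factor(naive_upsample_3x, clip):.0f}x realtime")
```

When timing a model on a GPU, the device should be synchronized before the timer is stopped (for example with `torch.cuda.synchronize()` in PyTorch), otherwise the measured time may only cover launching the work rather than finishing it.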

Examples

Random 3-second examples from datasets.

[Audio examples: two before/after pairs, plus one before/after pair for music.]

Usage

Please check out the GitHub repo for usage: https://github.com/ysharma3501/NovaSR

You can also try it on Hugging Face Spaces: https://huggingface.co/spaces/YatharthS/NovaSR

If you find the model/code helpful, stars or likes would be appreciated. Thank you.