---
license: apache-2.0
pipeline_tag: audio-to-audio
tags:
  - comfy
  - comfyui
  - audio2audio
  - audio-to-audio
  - audio-upscale
  - audiosuperresolution
---

Custom Node for ComfyUI

https://github.com/Saganaki22/ComfyUI-NovaSR


NovaSR: Pushing the Limits of Extreme Efficiency in Audio Super-Resolution

This is the model for NovaSR, a tiny (roughly 50 KB) audio upsampling model that upscales muffled 16 kHz audio into clear, crisp 48 kHz audio at 100-3500x realtime.

Audio Samples

Before Processing (16kHz):

After Processing (48kHz):


Details

  • Model Size: 52 KB (PyTorch version)
  • Input Rate: 16 kHz
  • Output Rate: 48 kHz
  • Inference Speed: 300-3500x realtime, depending on the GPU
  • Channels: mono
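The rates above imply a fixed 3x upsampling ratio, and input audio is expected to be mono. As a rough illustration only, the sketch below shows one way to downmix multi-channel frames and to predict the output length. The helper names `downmix_to_mono` and `expected_output_samples` are hypothetical and not part of NovaSR or the ComfyUI node, and channel averaging is an assumed downmix strategy, not something the model requires.

```python
INPUT_RATE_HZ = 16_000   # input rate from the details list
OUTPUT_RATE_HZ = 48_000  # output rate from the details list


def downmix_to_mono(frames):
    """Average each multi-channel frame into one mono sample.

    `frames` is a sequence of per-frame channel tuples, e.g. [(L, R), ...].
    Averaging is an assumption here; other downmix strategies exist.
    """
    return [sum(frame) / len(frame) for frame in frames]


def expected_output_samples(num_input_samples):
    """Sample count after 16 kHz -> 48 kHz upsampling (a 3x ratio)."""
    return num_input_samples * OUTPUT_RATE_HZ // INPUT_RATE_HZ


stereo = [(0.25, 0.75), (-1.0, 1.0), (0.5, 0.5)]
mono = downmix_to_mono(stereo)
print(mono)                             # [0.5, 0.0, 0.5]
print(expected_output_samples(16_000))  # 48000 (one second of input)
```

Audio recorded at a rate other than 16 kHz would need resampling before it reaches the model; that step is left out here.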

Comparisons

Comparisons were run on an A100 GPU. A higher realtime factor means faster processing. CPU comparisons are coming soon.

| Model | Speed (Real-Time) | Model Size |
|---|---|---|
| NovaSR | 3600x realtime | ~52 KB |
| FlowHigh | 20x realtime | ~450 MB |
| FlashSR | 14x realtime | ~1000 MB |
| AudioSR | 0.6x realtime | ~6000 MB |
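To make the realtime factors concrete: a factor of N means N seconds of audio are processed per second of wall-clock time, so processing time is the audio duration divided by the factor. The short sketch below applies that arithmetic to the figures in the table. The function name `processing_seconds` is made up for illustration, and the results are derived from the table rather than measured.

```python
def processing_seconds(audio_seconds, realtime_factor):
    """Wall-clock seconds needed to process `audio_seconds` of audio."""
    return audio_seconds / realtime_factor


# Realtime factors copied from the comparison table (A100 GPU).
factors = {"NovaSR": 3600, "FlowHigh": 20, "FlashSR": 14, "AudioSR": 0.6}

one_hour = 3600  # seconds of audio
for name, factor in factors.items():
    seconds = processing_seconds(one_hour, factor)
    print(f"{name}: {seconds:.1f} s to process one hour of audio")
```

At these rates, one hour of audio takes about one second with NovaSR and about 100 minutes with AudioSR.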

Usage

For usage instructions, see the GitHub repo: https://github.com/Saganaki22/ComfyUI-NovaSR

Original Repo: https://github.com/ysharma3501/NovaSR

If you find the model/code helpful, stars or likes would be appreciated.

Thank you.