---
license: mit
tags:
- audio
- super-resolution
- audio-upscaling
- comfyui
- audio-sr
- audiosr
- versatile-audio-super-resolution
library_name: diffusers
pipeline_tag: audio-to-audio
---

# AudioSR Models for ComfyUI

Pre-trained AudioSR (Versatile Audio Super-Resolution) models for use with the [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-AudioSR) custom node.

<audio controls src="https://huggingface.co/drbaph/AudioSR/resolve/main/samples/speech_up_4.wav"></audio>
<audio controls src="https://huggingface.co/drbaph/AudioSR/resolve/main/samples/speech_audiosr_4.wav"></audio>

## Models

### audiosr_basic_fp32.safetensors
- **Purpose:** General audio super-resolution
- **Best for:** Music, sound effects, podcasts, mixed content
- **Format:** FP32 SafeTensors
- **Size:** ~5.9 GB

### audiosr_speech_fp32.safetensors
- **Purpose:** Speech/voice optimized super-resolution
- **Best for:** Voice recordings, vocals, speech content
- **Format:** FP32 SafeTensors
- **Size:** ~5.9 GB

## Usage

### Installation

1. Install [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-AudioSR) via ComfyUI Manager.
2. Download the model(s) from this repository.
3. Place them in `ComfyUI/models/AudioSR/`.
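
Steps 2 and 3 can also be scripted. The sketch below is one possible approach, not part of ComfyUI-AudioSR. It assumes the `huggingface_hub` package is installed and that the files live in the `drbaph/AudioSR` repository (the repo id is inferred from the sample audio URLs on this page). `model_destination` and `download_model` are hypothetical helper names.

```python
from pathlib import Path

# File names as listed in the Models section of this card.
MODEL_FILES = (
    "audiosr_basic_fp32.safetensors",
    "audiosr_speech_fp32.safetensors",
)

# Assumption: repo id inferred from the sample audio URLs on this page.
REPO_ID = "drbaph/AudioSR"


def model_destination(comfyui_root: str, filename: str) -> Path:
    """Return the path where the model file should be placed (step 3)."""
    if filename not in MODEL_FILES:
        raise ValueError(f"unknown model file: {filename}")
    return Path(comfyui_root) / "models" / "AudioSR" / filename


def download_model(comfyui_root: str, filename: str) -> Path:
    """Download one model file into <comfyui_root>/models/AudioSR/."""
    # Imported lazily so the path helper works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    target = model_destination(comfyui_root, filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    hf_hub_download(
        repo_id=REPO_ID,
        filename=filename,
        local_dir=str(target.parent),
    )
    return target
```

Calling `download_model("ComfyUI", "audiosr_speech_fp32.safetensors")` would fetch roughly 5.9 GB, so check free disk space first.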

### Quick Start

```
ComfyUI Workflow:
Load Audio → AudioSR → Preview/Save Audio
```

**Recommended Settings:**
- Steps: 50-100
- Guidance Scale: 3.5-5.0
- Model: Use `audiosr_speech_fp32.safetensors` for voice, `audiosr_basic_fp32.safetensors` for everything else
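
The recommendations above can be expressed as a small helper, for example when preparing settings outside the ComfyUI graph. This is only an illustrative sketch: `pick_model` and `check_settings` are hypothetical names, and the actual input names on the AudioSR node may differ.

```python
from typing import List

# Ranges copied from the recommended settings above.
STEPS_RANGE = (50, 100)
GUIDANCE_RANGE = (3.5, 5.0)


def pick_model(is_voice: bool) -> str:
    """Speech model for voice content, basic model for everything else."""
    if is_voice:
        return "audiosr_speech_fp32.safetensors"
    return "audiosr_basic_fp32.safetensors"


def check_settings(steps: int, guidance_scale: float) -> List[str]:
    """Return a warning for each value outside the recommended range."""
    warnings = []
    if not STEPS_RANGE[0] <= steps <= STEPS_RANGE[1]:
        warnings.append(f"steps={steps} is outside {STEPS_RANGE}")
    if not GUIDANCE_RANGE[0] <= guidance_scale <= GUIDANCE_RANGE[1]:
        warnings.append(
            f"guidance_scale={guidance_scale} is outside {GUIDANCE_RANGE}"
        )
    return warnings
```

The helper only warns and never rejects a value, since the ranges are recommendations and not hard limits.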

## What it does

AudioSR upscales low-quality audio to high-quality 48 kHz output using latent diffusion. It:

- Resamples to 48 kHz
- Enhances high frequencies
- Reduces compression artifacts
- Adds clarity and detail
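
To see why resampling alone is not enough, recall that audio sampled at a given rate can only represent frequencies up to half that rate (the Nyquist limit). The sketch below works through that arithmetic for the 48 kHz target; `nyquist_hz` and `missing_band_hz` are hypothetical helper names used only for illustration.

```python
from typing import Optional, Tuple

TARGET_RATE_HZ = 48_000  # output sample rate stated above


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2


def missing_band_hz(
    input_rate_hz: int, target_rate_hz: int = TARGET_RATE_HZ
) -> Optional[Tuple[float, float]]:
    """Frequency band the output can hold but the input cannot.

    Returns None when the input already covers the target bandwidth.
    """
    low = nyquist_hz(input_rate_hz)
    high = nyquist_hz(target_rate_hz)
    if low >= high:
        return None
    return (low, high)


print(missing_band_hz(16_000))  # (8000.0, 24000.0)
```

For a 16 kHz recording, everything between 8 kHz and 24 kHz is empty after plain resampling. That band is what the model has to fill in. Heavily compressed files may lose content even below their Nyquist limit, so the real gap can be wider than this estimate.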

## Model Info

Based on [AudioSR: Versatile Audio Super-Resolution](https://arxiv.org/abs/2309.07314) by Haohe Liu et al.

Original repository: https://github.com/haoheliu/versatile_audio_super_resolution

**License:** MIT

## Hardware Requirements

- **GPU:** NVIDIA RTX 3060 or higher (at least 6 GB VRAM)
- **RAM:** 12 GB or more recommended
- **Input:** Works best with audio whose input sample rate is above 8 kHz

## Credits

- **Research:** [Haohe Liu](https://github.com/haoheliu) et al.
- **Paper:** [AudioSR on arXiv](https://arxiv.org/abs/2309.07314)
- **ComfyUI Integration:** [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-AudioSR)