metadata
license: mit
tags:
- audio
- super-resolution
- audio-upscaling
- comfyui
- audio-sr
- audiosr
- versatile-audio-super-resolution
library_name: diffusers
AudioSR Models for ComfyUI
Pre-trained AudioSR (Versatile Audio Super Resolution) models for use with the ComfyUI-AudioSR custom node.
Models
audiosr_basic_fp32.safetensors
- Purpose: General audio super-resolution
- Best for: Music, sound effects, podcasts, mixed content
- Format: FP32 SafeTensors
- Size: ~5.9 GB
audiosr_speech_fp32.safetensors
- Purpose: Speech/voice optimized super-resolution
- Best for: Voice recordings, vocals, speech content
- Format: FP32 SafeTensors
- Size: ~5.9 GB
Usage
Installation
- Install ComfyUI-AudioSR via ComfyUI Manager
- Download model(s) from this repository
- Place in ComfyUI/models/AudioSR/
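To confirm the files landed in the right folder, a short check like the one below may help. It is a minimal sketch that only uses the file names and folder given in this card; the helper names `model_dir` and `missing_models` are hypothetical and are not part of ComfyUI or ComfyUI-AudioSR.

```python
from pathlib import Path

# File names as listed in the Models section of this card.
MODEL_FILES = (
    "audiosr_basic_fp32.safetensors",
    "audiosr_speech_fp32.safetensors",
)


def model_dir(comfyui_root):
    """Return the folder this card says the models belong in."""
    return Path(comfyui_root) / "models" / "AudioSR"


def missing_models(comfyui_root):
    """List the model files not yet present under ComfyUI/models/AudioSR/."""
    folder = model_dir(comfyui_root)
    return [name for name in MODEL_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from the directory that contains ComfyUI/.
    for name in missing_models("ComfyUI"):
        print(f"missing: {name}")
```

Only one of the two files is needed for a given workflow, so a "missing" line is informational rather than an error.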
Quick Start
ComfyUI Workflow:
Load Audio → AudioSR → Preview/Save Audio
Recommended Settings:
- Steps: 50-100
- Guidance Scale: 3.5-5.0
- Model: Use audiosr_speech_fp32.safetensors for voice, audiosr_basic_fp32.safetensors for everything else
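The recommended settings above can be captured in a small helper when preparing workflows in bulk. This is a sketch under stated assumptions: the content labels ("voice", "vocals", "speech"), the function names, and the dictionary keys are hypothetical and do not necessarily match the input names of the ComfyUI-AudioSR node.

```python
# Ranges taken from the Recommended Settings list.
RECOMMENDED_STEPS = (50, 100)
RECOMMENDED_GUIDANCE = (3.5, 5.0)

# Hypothetical labels for material that should use the speech model.
VOICE_LIKE = {"voice", "vocals", "speech"}


def pick_model(content):
    """Speech model for voice material, basic model for everything else."""
    if content.lower() in VOICE_LIKE:
        return "audiosr_speech_fp32.safetensors"
    return "audiosr_basic_fp32.safetensors"


def clamp(value, bounds):
    """Pull a value back into the inclusive (low, high) range."""
    low, high = bounds
    return max(low, min(high, value))


def recommended_settings(content, steps=50, guidance_scale=3.5):
    """Return a settings dict kept inside the recommended ranges."""
    return {
        "model": pick_model(content),
        "steps": clamp(steps, RECOMMENDED_STEPS),
        "guidance_scale": clamp(guidance_scale, RECOMMENDED_GUIDANCE),
    }
```

For example, `recommended_settings("music", steps=80, guidance_scale=4.0)` selects the basic model and leaves both values unchanged, since they already fall inside the recommended ranges.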
What it does
AudioSR upscales low-quality audio to high-quality 48kHz output using latent diffusion. It:
- Resamples to 48kHz
- Enhances high frequencies
- Reduces compression artifacts
- Adds clarity and detail
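Resampling alone cannot create content above half the input sample rate (the Nyquist limit), so the band between the input's limit and the 24 kHz limit of the 48 kHz output is what has to be generated. The arithmetic is sketched below; it illustrates the frequency ranges only and is not the model's code, and the function names are hypothetical.

```python
TARGET_RATE_HZ = 48_000  # output sample rate stated in this card


def nyquist_hz(sample_rate_hz):
    """Highest frequency a signal at this sample rate can represent."""
    return sample_rate_hz / 2


def band_to_reconstruct(input_rate_hz, target_rate_hz=TARGET_RATE_HZ):
    """Return (low, high) in Hz of the band absent from the input, or None."""
    low = nyquist_hz(input_rate_hz)
    high = nyquist_hz(target_rate_hz)
    if low >= high:
        # The input already covers the full output bandwidth.
        return None
    return (low, high)


if __name__ == "__main__":
    print(band_to_reconstruct(16_000))  # (8000.0, 24000.0)
```

A 16 kHz recording therefore carries nothing above 8 kHz, and everything from 8 kHz to 24 kHz in the output is new content.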
Model Info
Based on AudioSR: Versatile Audio Super-Resolution by Haohe Liu et al.
Original repository: https://github.com/haoheliu/versatile_audio_super_resolution
License: MIT
Hardware Requirements
- GPU: NVIDIA RTX 3060 or higher (6GB+ VRAM minimum)
- RAM: 12GB+ recommended
- Input audio: works best when the input sample rate is above 8 kHz
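The VRAM minimum can be checked before loading a model. The sketch below assumes PyTorch is installed in the ComfyUI environment and reads "6GB" as 6 × 10^9 bytes, which is an interpretation and not something this card specifies; the function names are hypothetical.

```python
MIN_VRAM_GB = 6  # minimum from the GPU line above


def vram_ok(total_bytes, min_gb=MIN_VRAM_GB):
    """True if the device memory meets the stated minimum.

    Assumption: "GB" is treated as 10**9 bytes, so a card sold as
    6 GB is not rejected for reporting slightly under 6 GiB.
    """
    return total_bytes >= min_gb * 10**9


def detect_vram_bytes():
    """Total memory of CUDA device 0, or None if PyTorch/CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    total = detect_vram_bytes()
    if total is None:
        print("No CUDA device detected")
    else:
        print("VRAM OK" if vram_ok(total) else "Below the 6 GB minimum")
```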
Credits
- Research: Haohe Liu et al.
- Paper: AudioSR on arXiv
- ComfyUI Integration: ComfyUI-AudioSR
