drbaph/AudioSR


drbaph committed on Jan 8

Commit 8769e81 · verified · 1 parent: e26978a

Create README.md

Files changed (1)

README.md +82 -0


	@@ -0,0 +1,82 @@

+---
+license: mit
+tags:
+- audio
+- super-resolution
+- audio-upscaling
+- comfyui
+- audio-sr
+- audiosr
+- versatle-audio-super-resolution
+library_name: diffusers
+---
+# AudioSR Models for ComfyUI
+Pre-trained AudioSR (Versatile Audio Super Resolution) models for use with the [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-VASR) custom node.
+## Models
+### audiosr_basic_fp32.safetensors
+- **Purpose:** General audio super-resolution
+- **Best for:** Music, sound effects, podcasts, mixed content
+- **Format:** FP32 SafeTensors
+- **Size:** ~5.9 GB
+### audiosr_speech_fp32.safetensors
+- **Purpose:** Speech/voice optimized super-resolution
+- **Best for:** Voice recordings, vocals, speech content
+- **Format:** FP32 SafeTensors
+- **Size:** ~5.9 GB
+## Usage
+### Installation
+1. Install [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-VASR) via ComfyUI Manager
+2. Download the model file(s) you need from this repository
+3. Place the downloaded file(s) in `ComfyUI/models/AudioSR/`
+### Quick Start
+```
+ComfyUI Workflow:
+Load Audio → AudioSR → Preview/Save Audio
+```
+**Recommended Settings:**
+- Steps: 50-100
+- Guidance Scale: 3.5-5.0
+- Model: Use `audiosr_speech_fp32.safetensors` for voice, `audiosr_basic_fp32.safetensors` for everything else
+## What it does
+AudioSR upscales low-quality audio to high-quality 48 kHz output using latent diffusion. It:
+- Resamples to 48 kHz
+- Enhances high frequencies
+- Reduces compression artifacts
+- Adds clarity and detail
+![ComfyUI_temp_bildo_00002_](https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/ZMK6nkhj26kbLgRwJZqYp.png)
+## Model Info
+Based on [AudioSR: Versatile Audio Super-Resolution](https://arxiv.org/abs/2309.07314) by Haohe Liu et al.
+Original repository: https://github.com/haoheliu/versatile_audio_super_resolution
+**License:** MIT
+## Hardware Requirements
+- **GPU:** NVIDIA RTX 3060 or higher (at least 6 GB VRAM)
+- **RAM:** 12 GB or more recommended
+- Works best when the input sample rate is above 8 kHz
+## Credits
+- **Research:** [Haohe Liu](https://github.com/haoheliu) et al.
+- **Paper:** [AudioSR on arXiv](https://arxiv.org/abs/2309.07314)
+- **ComfyUI Integration:** [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-VASR)
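
The manual download-and-place steps in the README above could also be scripted. The following is a minimal sketch, not part of the commit. It assumes the repository id is `drbaph/AudioSR` (read off the page header), that the `huggingface_hub` package is installed, and that ComfyUI lives in a local `ComfyUI/` directory. The helper names `pick_model`, `model_dir`, and `download_model` are hypothetical and exist only for illustration.

```python
from pathlib import Path

# Assumption: repository id as shown in the page header.
REPO_ID = "drbaph/AudioSR"

# File names exactly as listed in the README's "Models" section.
MODEL_FILES = {
    "speech": "audiosr_speech_fp32.safetensors",
    "basic": "audiosr_basic_fp32.safetensors",
}

# Content types the README maps to the speech model.
VOICE_LIKE = {"voice", "vocals", "speech"}


def pick_model(content_type: str) -> str:
    """Return the speech checkpoint for voice content, the basic one otherwise."""
    key = "speech" if content_type.strip().lower() in VOICE_LIKE else "basic"
    return MODEL_FILES[key]


def model_dir(comfyui_root: str = "ComfyUI") -> Path:
    """Directory the README says the checkpoints belong in."""
    return Path(comfyui_root) / "models" / "AudioSR"


def download_model(content_type: str, comfyui_root: str = "ComfyUI") -> str:
    """Download the matching checkpoint into the ComfyUI model directory."""
    # Imported lazily so the helpers above work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    target = model_dir(comfyui_root)
    target.mkdir(parents=True, exist_ok=True)
    return hf_hub_download(
        repo_id=REPO_ID,
        filename=pick_model(content_type),
        local_dir=target,
    )
```

Under these assumptions, calling `download_model("speech")` from the directory that contains `ComfyUI/` would fetch the roughly 5.9 GB speech checkpoint into `ComfyUI/models/AudioSR/`. Any other content type falls back to the basic model, which mirrors the README's recommendation.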