drbaph committed · Commit 8769e81 · verified · 1 Parent(s): e26978a

Create README.md

  1. README.md +82 -0
README.md ADDED
@@ -0,0 +1,82 @@
+ ---
+ license: mit
+ tags:
+ - audio
+ - super-resolution
+ - audio-upscaling
+ - comfyui
+ - audio-sr
+ - audiosr
+ - versatle-audio-super-resolution
+ library_name: diffusers
+ ---
+
+ # AudioSR Models for ComfyUI
+
+ Pre-trained AudioSR (Versatile Audio Super-Resolution) models for use with the [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-VASR) custom node.
+
+ ## Models
+
+ ### audiosr_basic_fp32.safetensors
+ - **Purpose:** General audio super-resolution
+ - **Best for:** Music, sound effects, podcasts, mixed content
+ - **Format:** FP32 SafeTensors
+ - **Size:** ~5.9 GB
+
+ ### audiosr_speech_fp32.safetensors
+ - **Purpose:** Speech- and voice-optimized super-resolution
+ - **Best for:** Voice recordings, vocals, speech content
+ - **Format:** FP32 SafeTensors
+ - **Size:** ~5.9 GB
+
+ ## Usage
+
+ ### Installation
+
+ 1. Install [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-VASR) via ComfyUI Manager
+ 2. Download the model file(s) from this repository
+ 3. Place them in `ComfyUI/models/AudioSR/`
+
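A quick way to confirm step 3 is to check that the files sit where the node is expected to look. The sketch below is only an illustration: `find_audiosr_models` is a hypothetical helper (not part of ComfyUI or the custom node), and it assumes the `ComfyUI/models/AudioSR/` layout and the file names given in this README.

```python
from pathlib import Path

# File names as listed in the Models section of this README.
MODEL_FILES = (
    "audiosr_basic_fp32.safetensors",
    "audiosr_speech_fp32.safetensors",
)


def find_audiosr_models(comfyui_root):
    """Return the listed model files present in <comfyui_root>/models/AudioSR/.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    model_dir = Path(comfyui_root) / "models" / "AudioSR"
    return [name for name in MODEL_FILES if (model_dir / name).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is assumed to be the install folder relative to the current directory.
    found = find_audiosr_models("ComfyUI")
    print("Found:", found if found else "no AudioSR models yet")
```

An empty result usually means the files were saved elsewhere or the install folder has a different name.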
+ ### Quick Start
+
+ ```
+ ComfyUI Workflow:
+ Load Audio → AudioSR → Preview/Save Audio
+ ```
+
+ **Recommended Settings:**
+ - Steps: 50-100
+ - Guidance Scale: 3.5-5.0
+ - Model: Use `audiosr_speech_fp32.safetensors` for voice, `audiosr_basic_fp32.safetensors` for everything else
+
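The recommended settings can be written down as a small lookup, which may help when scripting or documenting a workflow. This is a sketch under stated assumptions: `pick_model` and `in_recommended_range` are hypothetical names, not node parameters or an API of ComfyUI-AudioSR, and the ranges are copied from the list above.

```python
BASIC_MODEL = "audiosr_basic_fp32.safetensors"
SPEECH_MODEL = "audiosr_speech_fp32.safetensors"

# Ranges taken from the "Recommended Settings" list (inclusive on both ends).
STEPS_RANGE = (50, 100)
GUIDANCE_RANGE = (3.5, 5.0)


def pick_model(is_voice):
    """Speech model for voice content, basic model for everything else."""
    return SPEECH_MODEL if is_voice else BASIC_MODEL


def in_recommended_range(steps, guidance_scale):
    """True when both values fall inside the README's recommended ranges."""
    steps_ok = STEPS_RANGE[0] <= steps <= STEPS_RANGE[1]
    guidance_ok = GUIDANCE_RANGE[0] <= guidance_scale <= GUIDANCE_RANGE[1]
    return steps_ok and guidance_ok


if __name__ == "__main__":
    print(pick_model(is_voice=True))        # audiosr_speech_fp32.safetensors
    print(in_recommended_range(50, 3.5))    # True
    print(in_recommended_range(20, 3.5))    # False
```

Values outside these ranges are not necessarily invalid; the README only recommends staying within them.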
+ ## What it does
+
+ AudioSR upscales low-quality audio to high-quality 48 kHz output using latent diffusion. It:
+
+ - Resamples to 48 kHz
+ - Enhances high frequencies
+ - Reduces compression artifacts
+ - Adds clarity and detail
+
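To make "enhances high frequencies" concrete: a signal sampled at rate f can only represent content up to f/2 (the Nyquist frequency), so a 48 kHz output has room for content up to 24 kHz that a lower-rate input cannot contain. The sketch below is just that arithmetic; it says nothing about AudioSR's internals, and `missing_band_hz` is a hypothetical helper name.

```python
TARGET_SAMPLE_RATE_HZ = 48_000  # output rate stated in this README


def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2


def missing_band_hz(input_sample_rate_hz, target_sample_rate_hz=TARGET_SAMPLE_RATE_HZ):
    """Return the (low, high) band the input cannot contain but the output can.

    Returns None when the input already covers the full output bandwidth.
    """
    low = nyquist_hz(input_sample_rate_hz)
    high = nyquist_hz(target_sample_rate_hz)
    if low >= high:
        return None
    return (low, high)


if __name__ == "__main__":
    print(missing_band_hz(16_000))  # (8000.0, 24000.0)
    print(missing_band_hz(48_000))  # None
```

For a 16 kHz recording, everything between 8 kHz and 24 kHz has to be generated by the model rather than recovered from the input.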
+
+ ![ComfyUI_temp_bildo_00002_](https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/ZMK6nkhj26kbLgRwJZqYp.png)
+
+ ## Model Info
+
+ Based on [AudioSR: Versatile Audio Super-Resolution](https://arxiv.org/abs/2309.07314) by Haohe Liu et al.
+
+ Original repository: https://github.com/haoheliu/versatile_audio_super_resolution
+
+ **License:** MIT
+
+ ## Hardware Requirements
+
+ - **GPU:** NVIDIA RTX 3060 or higher (at least 6 GB VRAM)
+ - **RAM:** 12 GB or more recommended
+ - **Input audio:** Works best when the input sample rate is above 8 kHz
+
+ ## Credits
+
+ - **Research:** [Haohe Liu](https://github.com/haoheliu) et al.
+ - **Paper:** [AudioSR on arXiv](https://arxiv.org/abs/2309.07314)
+ - **ComfyUI Integration:** [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-VASR)