SuperPauly and drbaph committed
Commit 7cd9ba1 · 0 Parent(s)

Duplicate from drbaph/AudioSR

Co-authored-by: DRBAPH <drbaph@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,39 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ samples/event_audiosr_2.wav filter=lfs diff=lfs merge=lfs -text
+ samples/event_up_2.wav filter=lfs diff=lfs merge=lfs -text
+ samples/speech_audiosr_4.wav filter=lfs diff=lfs merge=lfs -text
+ samples/speech_up_4.wav filter=lfs diff=lfs merge=lfs -text
AudioSR/audiosr_basic_fp32.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:139db138159955434155c08f388ecdcef5827181d14ef8b8d63eed57f1cecacf
+ size 6177350576
AudioSR/audiosr_speech_fp32.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:80e0ad005ef8f6acdff512bd23f1590ab525bbd0419929d405d595426741801f
+ size 6177350576
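
The two checkpoint entries above are Git LFS pointer files: each records only the SHA-256 digest (`oid`) and byte size of the real file. As a minimal sketch, a downloaded checkpoint could be checked against such a pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or of this repository; only the pointer text is taken from the commit.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    expected_size = int(pointer["size"])
    algo, _, expected_digest = pointer["oid"].partition(":")
    if algo != "sha256" or path.stat().st_size != expected_size:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so a multi-gigabyte checkpoint is never held in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest


# Pointer text copied from AudioSR/audiosr_basic_fp32.safetensors in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:139db138159955434155c08f388ecdcef5827181d14ef8b8d63eed57f1cecacf
size 6177350576
"""

pointer = parse_lfs_pointer(POINTER)
size_bytes = int(pointer["size"])
print(f"{size_bytes / 1e9:.2f} GB, {size_bytes / 2**30:.2f} GiB")  # 6.18 GB, 5.75 GiB
```

Calling `matches_pointer(Path("audiosr_basic_fp32.safetensors"), pointer)` on a local copy would then report whether the download is complete and uncorrupted; the local path here is only an illustration.
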
README.md ADDED
@@ -0,0 +1,86 @@
+ ---
+ license: mit
+ tags:
+ - audio
+ - super-resolution
+ - audio-upscaling
+ - comfyui
+ - audio-sr
+ - audiosr
+ - versatile-audio-super-resolution
+ library_name: diffusers
+ pipeline_tag: audio-to-audio
+ ---
+
+ # AudioSR Models for ComfyUI
+
+ Pre-trained AudioSR (Versatile Audio Super Resolution) models for use with the [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-AudioSR) custom node.
+
+ <audio controls src="https://huggingface.co/drbaph/AudioSR/resolve/main/samples/speech_up_4.wav"></audio>
+ <audio controls src="https://huggingface.co/drbaph/AudioSR/resolve/main/samples/speech_audiosr_4.wav"></audio>
+
+ ![ComfyUI_temp_bildo_00002_](https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/ZMK6nkhj26kbLgRwJZqYp.png)
+
+
+ ## Models
+
+ ### audiosr_basic_fp32.safetensors
+ - **Purpose:** General audio super-resolution
+ - **Best for:** Music, sound effects, podcasts, mixed content
+ - **Format:** FP32 SafeTensors
+ - **Size:** ~6.2 GB (6,177,350,576 bytes, about 5.75 GiB)
+
+ ### audiosr_speech_fp32.safetensors
+ - **Purpose:** Speech/voice optimized super-resolution
+ - **Best for:** Voice recordings, vocals, speech content
+ - **Format:** FP32 SafeTensors
+ - **Size:** ~6.2 GB (6,177,350,576 bytes, about 5.75 GiB)
+
+ ## Usage
+
+ ### Installation
+
+ 1. Install [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-AudioSR) via ComfyUI Manager
+ 2. Download model(s) from this repository
+ 3. Place in `ComfyUI/models/AudioSR/`
+
+ ### Quick Start
+
+ ```
+ ComfyUI Workflow:
+ Load Audio → AudioSR → Preview/Save Audio
+ ```
+
+ **Recommended Settings:**
+ - Steps: 50-100
+ - Guidance Scale: 3.5-5.0
+ - Model: Use `audiosr_speech_fp32.safetensors` for voice, `audiosr_basic_fp32.safetensors` for everything else
+
+ ## What it does
+
+ AudioSR upscales low-quality audio to high-quality 48kHz output using latent diffusion. It:
+
+ - Resamples to 48kHz
+ - Enhances high frequencies
+ - Reduces compression artifacts
+ - Adds clarity and detail
+
+ ## Model Info
+
+ Based on [AudioSR: Versatile Audio Super-Resolution](https://arxiv.org/abs/2309.07314) by Haohe Liu et al.
+
+ Original repository: https://github.com/haoheliu/versatile_audio_super_resolution
+
+ **License:** MIT
+
+ ## Hardware Requirements
+
+ - **GPU:** NVIDIA RTX 3060 or higher (6GB+ VRAM minimum)
+ - **RAM:** 12GB+ recommended
+ - Works best with input audio sampled above 8 kHz
+
+ ## Credits
+
+ - **Research:** [Haohe Liu](https://github.com/haoheliu) et al.
+ - **Paper:** [AudioSR on arXiv](https://arxiv.org/abs/2309.07314)
+ - **ComfyUI Integration:** [ComfyUI-AudioSR](https://github.com/Saganaki22/ComfyUI-AudioSR)
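
The README's installation steps (download a model, place it in `ComfyUI/models/AudioSR/`) and its model recommendation could be scripted roughly as below. This is a sketch under assumptions, not the custom node's own installer: it assumes the `huggingface_hub` package is installed, that the upstream repository `drbaph/AudioSR` (named in the commit message) is the download source, and that ComfyUI lives in a folder called `ComfyUI`. The helper names `pick_model` and `download_model` are hypothetical.

```python
from pathlib import Path

# Assumption: the upstream repo this commit was duplicated from.
REPO_ID = "drbaph/AudioSR"

# File paths as they appear in this commit (inside the repo's AudioSR/ folder).
MODEL_FILES = {
    "speech": "AudioSR/audiosr_speech_fp32.safetensors",
    "basic": "AudioSR/audiosr_basic_fp32.safetensors",
}


def pick_model(is_voice: bool) -> str:
    """Follow the README: speech model for voice, basic model for everything else."""
    return MODEL_FILES["speech" if is_voice else "basic"]


def download_model(is_voice: bool, comfyui_root: str = "ComfyUI") -> Path:
    """Download the chosen checkpoint under <comfyui_root>/models/AudioSR/."""
    # Imported lazily so pick_model works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    # local_dir mirrors the repo layout, so "AudioSR/<file>" lands in models/AudioSR/.
    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=pick_model(is_voice),
        local_dir=str(Path(comfyui_root) / "models"),
    )
    return Path(local_path)
```

For example, `download_model(is_voice=True)` would fetch the speech checkpoint; each file is roughly 6 GB, so the call needs network access and enough free disk space.
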
samples/event_audiosr_2.wav ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e4a6c12b4d142161f110e47f2a9cd443ad7421c38c641fa30689f62246eaecdf
+ size 496610
samples/event_up_2.wav ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5d861f93944f126237404b54f13cf2955daebc82bb1524320397d4ef18222dc5
+ size 491564
samples/speech_audiosr_4.wav ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:63dff9877838fac135c41919b90d36653f310e0a71658b8579e58d518345a141
+ size 491564
samples/speech_up_4.wav ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4d8a06dc9af11ff3671d8132f2fb5579244f37f5ba111f206d2f16de5d60d042
+ size 491564