earlab committed on
Commit e79fe63 · verified · 1 Parent(s): b3c4dc3

Update readme for hf

Files changed (1)
  1. README.md +16 -12
README.md CHANGED
@@ -1,16 +1,22 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - laion/LAION-DISCO-12M
+ language:
+ - en
+ - zh
+ base_model:
+ - stabilityai/stable-audio-open-1.0
+ pipeline_tag: audio-to-audio
+ tags:
+ - music
+ - vae
+ ---
  # εar-VAE: High Fidelity Music Reconstruction Model
+ [[Demo Page](https://eps-acoustic-revolution-lab.github.io/EAR_VAE/)] - [[Codes](https://github.com/Eps-Acoustic-Revolution-Lab/EAR_VAE)]

  This repository contains the official inference code for εar-VAE, a 44.1 kHz music signal reconstruction model that rethinks and optimizes VAE training for audio. It targets two common weaknesses in existing open-source VAEs—phase accuracy and stereophonic spatial representation—by aligning objectives with auditory perception and introducing phase-aware training. Experiments show substantial improvements across diverse metrics, with particular strength in high-frequency harmonics and spatial characteristics.

- <p align="center">
- <img src="./images/all_compares.jpg" width=90%>
- <img src="./images/table.png" width=90%>
- </p>
-
- <p align="center">
- <em>Upper: Ablation study across our training components.</em> <em>Down: Cross-model metric comparison on the evaluation dataset.</em>
- </p>
-
  Why εar-VAE:
  - 🎧 Perceptual alignment: A K-weighting perceptual filter is applied before loss computation to better match human hearing.
  - 🔁 Phase-aware objectives: Two novel phase losses
@@ -142,6 +148,4 @@ This project builds upon the work of several open-source projects. We would like
  - **[Stability AI's Stable Audio Tools](https://github.com/Stability-AI/stable-audio-tools)**: For providing a foundational framework and tools for audio generation.
  - **[Descript's Audio Codec](https://github.com/descriptinc/descript-audio-codec)**: For the weight-normed convolutional layers

- Their contributions have been invaluable to the development of εar-VAE.
-
-
+ Their contributions have been invaluable to the development of εar-VAE.