File size: 1,715 Bytes
8680102
 
 
9ca52e8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
8680102
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
---
license: apache-2.0
---
# DACVAE
VAE version of the [Descript Audio Codec](https://github.com/descriptinc/descript-audio-codec), which has a continuous latent space.
Descript Audio Codec (DAC) is a high fidelity general neural audio codec, introduced in the paper titled **High-Fidelity Audio Compression with Improved RVQGAN**.
Most code is adopted from the open-source repo [DAC](https://github.com/descriptinc/descript-audio-codec)

### Installation
```bash
$ pip install git+https://github.com/facebookresearch/dacvae
```

### Usage

```python
from dacvae import DACVAE
import torchaudio


model = DACVAE.load("facebook/dacvae-watermarked")
wav, sample_rate = torchaudio.load("<path to audio file>")
# Resample to expected sample rate
resampled = torchaudio.functional.resample(wav, sample_rate, model.sample_rate)
# Convert stero to mono (if applicable)
resampled = resampled.mean(dim=0, keepdim=True)
# Expected shape is batch x 1 x samples
model_input = resampled.unsqueeze(0)
encoded = model.encode(model_input)
# `decoded` shape is `batch x 1 x samples`
decoded = model.decode(encoded)
```

## Watermarking

The DAC-VAE decoder has been integrated with [Audioseal](https://github.com/facebookresearch/audioseal) to ensure all audios generated
contain watermarks that are verifiable independently. We develop a new watermarking model with adapted architecture specifically for
DAC-VAE to optimize the high-fidelity outcome. We also plan to release the detector API. Stay tuned !

## Contributing

See [contributing](CONTRIBUTING.md) and [code of conduct](CODE_OF_CONDUCT.md) for more information.

## License

This project is licensed under the SAM License - see the [LICENSE](LICENSE) file for details.