File size: 3,703 Bytes
561a1cf |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 |
# Container Template for SoundsRight Subnet Miners
This repository contains a contanierized version of [SGMSE+](https://huggingface.co/sp-uhh/speech-enhancement-sgmse) and serves as a tutorial for miners to format their models on [Bittensor's](https://bittensor.com/) [SoundsRight Subnet](https://github.com/synapsec-ai/SoundsRightSubnet). The branches `DENOISING_16000HZ` and `DEREVERBERATION_16000HZ` contain SGMSE fitted with the approrpriate checkpoints for denoising and dereverberation tasks at 16kHz, respectively.
This container has only been tested with **Ubuntu 24.04** and **CUDA 12.6**. It may run on other configurations, but it is not guaranteed.
To run the container, first configure NVIDIA Container Toolkit and generate a CDI specification. Follow the instructions to download the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) with Apt.
Next, follow the instructions for [generating a CDI specification](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html).
Verify that the CDI specification was done correctly with:
```
$ nvidia-ctk cdi list
```
You should see this in your output:
```
nvidia.com/gpu=all
nvidia.com/gpu=0
```
If you are running podman as root, run the following command to start the container:
Run the container with:
```
podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --user root --name modelapi -p 6500:6500 modelapi
```
Access logs with:
```
podman logs -f modelapi
```
If you are running the container rootless, there are a few more changes to make:
First, modify `/etc/nvidia-container-runtime/config.toml` and set the following parameters:
```
[nvidia-container-cli]
no-cgroups = true
[nvidia-container-runtime]
debug = "/tmp/nvidia-container-runtime.log"
```
You can also run the following command to achieve the same result:
```
$ sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place
```
Run the container with:
```
podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --volume /usr/local/cuda-12.6:/usr/local/cuda-12.6 --user 10002:10002 --name modelapi -p 6500:6500 modelapi
```
Access logs with:
```
podman logs -f modelapi
```
Running the container will spin up an API with the following endpoints:
1. `/status/` : Communicates API status
2. `/prepare/` : Download model checkpoint and initialize model
3. `/upload-audio/` : Upload audio files, save to noisy audio directory
4. `/enhance/` : Initialize model, enhance audio files, save to enhanced audio directory
5. `/download-enhanced/` : Download enhanced audio files
By default the API will use host `0.0.0.0` and port `6500`.
### References
1. **Welker, Simon; Richter, Julius; Gerkmann, Timo**
*Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain*.
Proceedings of *Interspeech 2022*, 2022, pp. 2928–2932.
[DOI: 10.21437/Interspeech.2022-10653](https://doi.org/10.21437/Interspeech.2022-10653)
2. **Richter, Julius; Welker, Simon; Lemercier, Jean-Marie; Lay, Bunlong; Gerkmann, Timo**
*Speech Enhancement and Dereverberation with Diffusion-based Generative Models*.
*IEEE/ACM Transactions on Audio, Speech, and Language Processing*, Vol. 31, 2023, pp. 2351–2364.
[DOI: 10.1109/TASLP.2023.3285241](https://doi.org/10.1109/TASLP.2023.3285241)
3. **Richter, Julius; Wu, Yi-Chiao; Krenn, Steven; Welker, Simon; Lay, Bunlong; Watanabe, Shinjii; Richard, Alexander; Gerkmann, Timo**
*EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation*.
Proceedings of *ISCA Interspeech*, 2024, pp. 4873–4877.
|