| # Container Template for SoundsRight Subnet Miners |
|
|
| This repository contains a containerized version of [SGMSE+](https://huggingface.co/sp-uhh/speech-enhancement-sgmse) and serves as a tutorial for miners to format their models on [Bittensor's](https://bittensor.com/) [SoundsRight Subnet](https://github.com/synapsec-ai/SoundsRightSubnet). The branches `DENOISING_16000HZ` and `DEREVERBERATION_16000HZ` contain SGMSE fitted with the appropriate checkpoints for denoising and dereverberation tasks at 16 kHz, respectively. |
|
|
| This container has only been tested with **Ubuntu 24.04** and **CUDA 12.6**. It may run on other configurations, but this is not guaranteed. |
|
|
| To run the container, first configure the NVIDIA Container Toolkit and generate a CDI specification. Follow the instructions to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) with Apt. |
|
|
| Next, follow the instructions for [generating a CDI specification](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html). |
|
|
| Verify that the CDI specification was generated correctly with: |
| ``` |
| $ nvidia-ctk cdi list |
| ``` |
| You should see this in your output: |
| ``` |
| nvidia.com/gpu=all |
| nvidia.com/gpu=0 |
| ``` |
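If you would rather script this check, for example in a setup script, the following is a minimal sketch. The helper name `has_cdi_device` is hypothetical; it only searches the text on standard input for a device name, so it assumes `nvidia-ctk cdi list` prints one device name per line as shown above.

```shell
# Hypothetical helper: succeeds if the CDI device name given as $1
# appears as a whole line (ignoring surrounding whitespace) on stdin.
has_cdi_device() {
    local device="$1"
    # Strip leading/trailing whitespace, then require an exact, literal line match.
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -qxF -- "$device"
}

# Usage (requires the NVIDIA Container Toolkit to be installed):
#   nvidia-ctk cdi list 2>/dev/null | has_cdi_device "nvidia.com/gpu=all" \
#       && echo "CDI spec found" || echo "CDI spec missing"
```

Matching the whole line keeps `nvidia.com/gpu=0` from being mistaken for a longer device name such as `nvidia.com/gpu=01`.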
|
|
| If you are running Podman as root, build and start the container with: |
| ``` |
| podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --user root --name modelapi -p 6500:6500 modelapi |
| ``` |
| Access logs with: |
| ``` |
| podman logs -f modelapi |
| ``` |
| If you are running the container rootless, a few more changes are required: |
|
|
| First, modify `/etc/nvidia-container-runtime/config.toml` and set the following parameters: |
| ``` |
| [nvidia-container-cli] |
| no-cgroups = true |
| |
| [nvidia-container-runtime] |
| debug = "/tmp/nvidia-container-runtime.log" |
| ``` |
| Instead of editing the file by hand, you can set `no-cgroups` with the following command: |
| ``` |
| $ sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place |
| ``` |
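To confirm the setting took effect, you can inspect the file. This is a minimal sketch, assuming the key is written as `no-cgroups = true` on its own line as in the snippet above. The helper name `no_cgroups_enabled` is hypothetical, and it does a plain text match rather than parsing TOML.

```shell
# Hypothetical helper: succeeds if the config file at $1 contains an
# uncommented `no-cgroups = true` line. This is a plain text match, not a
# TOML parser, so it does not check which section the key belongs to.
no_cgroups_enabled() {
    local config_file="$1"
    grep -Eq '^[[:space:]]*no-cgroups[[:space:]]*=[[:space:]]*true([[:space:]]|#|$)' "$config_file"
}

# Usage:
#   no_cgroups_enabled /etc/nvidia-container-runtime/config.toml \
#       && echo "no-cgroups is enabled" || echo "no-cgroups is NOT enabled"
```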
|
|
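| Then build and run the container with the command below, which mounts the host's `/usr/local/cuda-12.6` directory into the container and runs it as UID/GID `10002`: |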
| ``` |
| podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --volume /usr/local/cuda-12.6:/usr/local/cuda-12.6 --user 10002:10002 --name modelapi -p 6500:6500 modelapi |
| ``` |
| Access logs with: |
| ``` |
| podman logs -f modelapi |
| ``` |
| Running the container will spin up an API with the following endpoints: |
| 1. `/status/` : Reports API status |
| 2. `/prepare/` : Downloads the model checkpoint and initializes the model |
| 3. `/upload-audio/` : Uploads audio files and saves them to the noisy audio directory |
| 4. `/enhance/` : Initializes the model, enhances the audio files, and saves them to the enhanced audio directory |
| 5. `/download-enhanced/` : Downloads the enhanced audio files |
|
|
| By default, the API listens on host `0.0.0.0` and port `6500`. |
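As a rough sketch of how a client might reach the API, the snippet below builds endpoint URLs and polls `/status/` with `curl`. The helper names (`api_url`, `wait_for_api`), the `MODELAPI_BASE` variable, and the use of a plain GET request on `/status/` are assumptions for illustration; check the API source in this repository for the HTTP method and payload each endpoint actually expects.

```shell
# Base URL of the API; the port matches the -p 6500:6500 mapping above.
# MODELAPI_BASE is a hypothetical variable name used only in this sketch.
MODELAPI_BASE="${MODELAPI_BASE:-http://localhost:6500}"

# Build the full URL for an endpoint name such as "status" or "enhance".
api_url() {
    local endpoint="${1#/}"   # drop a leading slash, if any
    endpoint="${endpoint%/}"  # drop a trailing slash, if any
    printf '%s/%s/\n' "${MODELAPI_BASE%/}" "$endpoint"
}

# Poll /status/ until it answers with a successful HTTP status, or give up
# after $1 attempts (default 30), sleeping one second between attempts.
# ASSUMPTION: /status/ responds to a plain GET request.
wait_for_api() {
    local attempts="${1:-30}"
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        if curl --silent --fail --output /dev/null "$(api_url status)"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage:
#   wait_for_api 60 && echo "API is up at $(api_url status)"
```

Polling is useful here because the container is started detached (`-d`), so the API may not be ready the moment `podman run` returns.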
|
|
| ### References |
|
|
| 1. **Welker, Simon; Richter, Julius; Gerkmann, Timo** |
| *Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain*. |
| Proceedings of *Interspeech 2022*, 2022, pp. 2928–2932. |
| [DOI: 10.21437/Interspeech.2022-10653](https://doi.org/10.21437/Interspeech.2022-10653) |
|
|
| 2. **Richter, Julius; Welker, Simon; Lemercier, Jean-Marie; Lay, Bunlong; Gerkmann, Timo** |
| *Speech Enhancement and Dereverberation with Diffusion-based Generative Models*. |
| *IEEE/ACM Transactions on Audio, Speech, and Language Processing*, Vol. 31, 2023, pp. 2351–2364. |
| [DOI: 10.1109/TASLP.2023.3285241](https://doi.org/10.1109/TASLP.2023.3285241) |
|
|
| 3. **Richter, Julius; Wu, Yi-Chiao; Krenn, Steven; Welker, Simon; Lay, Bunlong; Watanabe, Shinji; Richard, Alexander; Gerkmann, Timo** |
| *EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation*. |
| Proceedings of *ISCA Interspeech*, 2024, pp. 4873–4877. |
|
|