| # Container Template for SoundsRight Subnet Miners | |
| This repository contains a containerized version of [SGMSE+](https://huggingface.co/sp-uhh/speech-enhancement-sgmse) and serves as a tutorial for miners to format their models on [Bittensor's](https://bittensor.com/) [SoundsRight Subnet](https://github.com/synapsec-ai/SoundsRightSubnet). The branches `DENOISING_16000HZ` and `DEREVERBERATION_16000HZ` contain SGMSE+ fitted with the appropriate checkpoints for denoising and dereverberation tasks at 16 kHz, respectively. | |
| This container has only been tested with **Ubuntu 24.04** and **CUDA 12.6**. It may run on other configurations, but it is not guaranteed. | |
| To run the container, first configure the NVIDIA Container Toolkit and generate a CDI specification. Follow the instructions to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) with Apt. | |
| Next, follow the instructions for [generating a CDI specification](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html). | |
| Verify that the CDI specification was generated correctly with: | |
| ``` | |
| $ nvidia-ctk cdi list | |
| ``` | |
| You should see this in your output: | |
| ``` | |
| nvidia.com/gpu=all | |
| nvidia.com/gpu=0 | |
| ``` | |
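The check above can also be scripted. The following is a minimal sketch, not part of this repository: `missing_cdi_devices` and `list_cdi_devices` are hypothetical helper names, and the sketch assumes the device names appear as whitespace-separated tokens in the output of `nvidia-ctk cdi list`.

```python
import subprocess

# Device names the README expects to see in the `nvidia-ctk cdi list` output.
EXPECTED_DEVICES = ("nvidia.com/gpu=all", "nvidia.com/gpu=0")


def missing_cdi_devices(output, expected=EXPECTED_DEVICES):
    """Return the expected CDI device names that do not appear in `output`."""
    seen = set(output.split())  # tokenize on any whitespace, including newlines
    return [name for name in expected if name not in seen]


def list_cdi_devices():
    """Run `nvidia-ctk cdi list` and return its standard output as text."""
    result = subprocess.run(
        ["nvidia-ctk", "cdi", "list"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout
```

Calling `missing_cdi_devices(list_cdi_devices())` on the host should return an empty list when both devices are present. A non-empty list names the entries that were not found.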
| If you are running Podman as root, build and start the container with: | |
| ``` | |
| podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --user root --name modelapi -p 6500:6500 modelapi | |
| ``` | |
| Access logs with: | |
| ``` | |
| podman logs -f modelapi | |
| ``` | |
| If you are running the container rootless, there are a few more changes to make: | |
| First, modify `/etc/nvidia-container-runtime/config.toml` and set the following parameters: | |
| ``` | |
| [nvidia-container-cli] | |
| no-cgroups = true | |
| [nvidia-container-runtime] | |
| debug = "/tmp/nvidia-container-runtime.log" | |
| ``` | |
| Alternatively, the following command sets the `no-cgroups` parameter without editing the file by hand: | |
| ``` | |
| $ sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place | |
| ``` | |
| Run the container with: | |
| ``` | |
| podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --volume /usr/local/cuda-12.6:/usr/local/cuda-12.6 --user 10002:10002 --name modelapi -p 6500:6500 modelapi | |
| ``` | |
| Access logs with: | |
| ``` | |
| podman logs -f modelapi | |
| ``` | |
| Running the container will spin up an API with the following endpoints: | |
| 1. `/status/` : Reports API status | |
| 2. `/prepare/` : Downloads the model checkpoint and initializes the model | |
| 3. `/upload-audio/` : Uploads audio files and saves them to the noisy audio directory | |
| 4. `/enhance/` : Initializes the model, enhances the audio files, and saves them to the enhanced audio directory | |
| 5. `/download-enhanced/` : Downloads the enhanced audio files | |
| By default the API will use host `0.0.0.0` and port `6500`. | |
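The endpoint paths listed above can be turned into URLs for a client running on the same host. The sketch below is illustrative only: `endpoint_url` and `check_status` are hypothetical names, and it assumes the API is served over plain HTTP, is reachable at `127.0.0.1` through the published port `6500`, and answers a GET request on `/status/`. This README does not state the HTTP method or payload format of any endpoint, so confirm these against the API code before relying on them.

```python
from urllib.request import urlopen

# Endpoint names taken from the list in this README.
ENDPOINTS = ("status", "prepare", "upload-audio", "enhance", "download-enhanced")


def endpoint_url(name, host="127.0.0.1", port=6500):
    """Build the URL for one of the documented endpoints."""
    if name not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {name!r}")
    return f"http://{host}:{port}/{name}/"


def check_status(host="127.0.0.1", port=6500, timeout=5.0):
    """Return the HTTP status code of /status/.

    Assumption: the endpoint accepts a plain GET request.
    """
    with urlopen(endpoint_url("status", host, port), timeout=timeout) as response:
        return response.status
```

`endpoint_url("enhance")` yields `http://127.0.0.1:6500/enhance/`. `check_status()` raises `urllib.error.URLError` if the container is not running or the port is not published.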
| ### References | |
| 1. **Welker, Simon; Richter, Julius; Gerkmann, Timo** | |
| *Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain*. | |
| Proceedings of *Interspeech 2022*, 2022, pp. 2928–2932. | |
| [DOI: 10.21437/Interspeech.2022-10653](https://doi.org/10.21437/Interspeech.2022-10653) | |
| 2. **Richter, Julius; Welker, Simon; Lemercier, Jean-Marie; Lay, Bunlong; Gerkmann, Timo** | |
| *Speech Enhancement and Dereverberation with Diffusion-based Generative Models*. | |
| *IEEE/ACM Transactions on Audio, Speech, and Language Processing*, Vol. 31, 2023, pp. 2351–2364. | |
| [DOI: 10.1109/TASLP.2023.3285241](https://doi.org/10.1109/TASLP.2023.3285241) | |
| 3. **Richter, Julius; Wu, Yi-Chiao; Krenn, Steven; Welker, Simon; Lay, Bunlong; Watanabe, Shinji; Richard, Alexander; Gerkmann, Timo** | |
| *EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation*. | |
| Proceedings of *ISCA Interspeech*, 2024, pp. 4873–4877. | |