	# Container Template for SoundsRight Subnet Miners

	This repository contains a containerized version of [SGMSE+](https://huggingface.co/sp-uhh/speech-enhancement-sgmse) and serves as a tutorial for miners on how to format their models for [Bittensor's](https://bittensor.com/) [SoundsRight Subnet](https://github.com/synapsec-ai/SoundsRightSubnet). The branches `DENOISING_16000HZ` and `DEREVERBERATION_16000HZ` contain SGMSE+ fitted with the appropriate checkpoints for the denoising and dereverberation tasks at 16 kHz, respectively.

	This container has only been tested with Ubuntu 24.04 and CUDA 12.6. It may run on other configurations, but this is not guaranteed.

	To run the container, first install the NVIDIA Container Toolkit and generate a CDI specification. Follow the instructions to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) with Apt.

	Next, follow the instructions for [generating a CDI specification](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html).

	Verify that the CDI specification was generated correctly with:
	```
	$ nvidia-ctk cdi list
	```
	You should see this in your output:
	```
	nvidia.com/gpu=all
	nvidia.com/gpu=0
	```
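If you want to script this check, the sketch below scans the text printed by `nvidia-ctk cdi list` for the device name used later in `podman run --device`. The helper names `cdi_devices` and `host_listing` are illustrative and not part of this repository, and the parsing assumes only that device names look like `vendor/class=name`.

```python
import subprocess

REQUIRED_DEVICE = "nvidia.com/gpu=all"  # the device name passed to `podman run --device`


def cdi_devices(listing: str) -> set[str]:
    """Collect tokens shaped like CDI device names (vendor/class=name) from the listing."""
    devices = set()
    for line in listing.splitlines():
        for token in line.split():
            # Log lines such as "INFO[0000] Found 2 CDI devices" contain no such token.
            if "/" in token and "=" in token:
                devices.add(token)
    return devices


def host_listing() -> str:
    """Run `nvidia-ctk cdi list` on this host and return everything it printed."""
    result = subprocess.run(
        ["nvidia-ctk", "cdi", "list"], capture_output=True, text=True, check=True
    )
    return result.stdout + result.stderr


# Demonstration on the expected output shown above (no GPU needed).
sample = "nvidia.com/gpu=all\nnvidia.com/gpu=0\n"
print(REQUIRED_DEVICE in cdi_devices(sample))
```

On a configured host, `REQUIRED_DEVICE in cdi_devices(host_listing())` performs the same check against the real output.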

	If you are running Podman as root, build and start the container with:
	```
	podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --user root --name modelapi -p 6500:6500 modelapi
	```
	Access logs with:
	```
	podman logs -f modelapi
	```
	If you are running the container rootless, there are a few more changes to make:

	First, modify `/etc/nvidia-container-runtime/config.toml` and set the following parameters:
	```
	[nvidia-container-cli]
	no-cgroups = true

	[nvidia-container-runtime]
	debug = "/tmp/nvidia-container-runtime.log"
	```
	The `no-cgroups` setting can also be applied with the following command instead of editing the file by hand:
	```
	$ sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place
	```

	Run the container with:
	```
	podman build -t modelapi . && podman run -d --device nvidia.com/gpu=all --volume /usr/local/cuda-12.6:/usr/local/cuda-12.6 --user 10002:10002 --name modelapi -p 6500:6500 modelapi
	```
	Access logs with:
	```
	podman logs -f modelapi
	```
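The root and rootless `podman run` invocations differ only in the `--volume` and `--user` arguments. The sketch below makes that difference explicit by assembling both argument lists; `podman_run_args` is an illustrative helper, not part of this repository, and every value is copied from the commands above.

```python
import shlex


def podman_run_args(rootless: bool) -> list[str]:
    """Build the `podman run` argument list for the root or rootless case."""
    args = ["podman", "run", "-d", "--device", "nvidia.com/gpu=all"]
    if rootless:
        # Rootless: mount the host CUDA directory and run as an unprivileged UID:GID.
        args += ["--volume", "/usr/local/cuda-12.6:/usr/local/cuda-12.6"]
        args += ["--user", "10002:10002"]
    else:
        args += ["--user", "root"]
    args += ["--name", "modelapi", "-p", "6500:6500", "modelapi"]
    return args


print(shlex.join(podman_run_args(rootless=False)))
print(shlex.join(podman_run_args(rootless=True)))
```

The resulting list can be passed to `subprocess.run(...)` once the image has been built with `podman build -t modelapi .`.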
	Running the container will spin up an API with the following endpoints:
	1. `/status/`: Reports the API status
	2. `/prepare/`: Downloads the model checkpoint and initializes the model
	3. `/upload-audio/`: Uploads audio files and saves them to the noisy audio directory
	4. `/enhance/`: Initializes the model, enhances the audio files, and saves them to the enhanced audio directory
	5. `/download-enhanced/`: Downloads the enhanced audio files

	By default the API will use host `0.0.0.0` and port `6500`.
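As a rough sketch of how a client might address these endpoints, the code below builds the URL for each one and includes a status probe. Several things here are assumptions rather than documented behavior: that the client runs on the same machine (so `127.0.0.1` reaches the server bound to `0.0.0.0`), that `/status/` answers a plain GET, and that the endpoints are called in the listed order. The HTTP methods and payload formats of the other endpoints are not described here, so the sketch does not call them.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:6500"  # assumption: client runs on the container host

# Endpoint paths in the order they appear in the list above.
WORKFLOW = ["/status/", "/prepare/", "/upload-audio/", "/enhance/", "/download-enhanced/"]


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path, keeping the trailing slash."""
    return base_url.rstrip("/") + "/" + path.strip("/") + "/"


def get_status(base_url: str = BASE_URL, timeout: float = 5.0) -> tuple[int, str]:
    """Query /status/ and return (HTTP status code, response body).

    Assumption: /status/ accepts a GET request.
    """
    with urllib.request.urlopen(endpoint_url("/status/", base_url), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


for path in WORKFLOW:
    print(endpoint_url(path))
```

With the container running, `get_status()` gives a quick way to check that port `6500` is reachable before sending any audio.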

	### References

	1. Welker, Simon; Richter, Julius; Gerkmann, Timo
	Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain.
	Proceedings of Interspeech 2022, 2022, pp. 2928–2932.
	[DOI: 10.21437/Interspeech.2022-10653](https://doi.org/10.21437/Interspeech.2022-10653)

	2. Richter, Julius; Welker, Simon; Lemercier, Jean-Marie; Lay, Bunlong; Gerkmann, Timo
	Speech Enhancement and Dereverberation with Diffusion-based Generative Models.
	IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 31, 2023, pp. 2351–2364.
	[DOI: 10.1109/TASLP.2023.3285241](https://doi.org/10.1109/TASLP.2023.3285241)

	3. Richter, Julius; Wu, Yi-Chiao; Krenn, Steven; Welker, Simon; Lay, Bunlong; Watanabe, Shinji; Richard, Alexander; Gerkmann, Timo
	EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation.
	Proceedings of ISCA Interspeech, 2024, pp. 4873–4877.