| | --- |
| | title: CC VAD |
| | emoji: 🐢 |
| | colorFrom: purple |
| | colorTo: blue |
| | sdk: docker |
| | pinned: false |
| | license: apache-2.0 |
| | --- |
| | |
| | Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |
| | ## CC VAD |
| |
|
| |
|
| | ### datasets |
| |
|
| | ```text |
| | |
| | AISHELL (15G) |
| | https://openslr.trmal.net/resources/33/ |
| | |
| | AISHELL-3 (19G) |
| | http://www.openslr.org/93/ |
| | |
| | DNS3 |
| | https://github.com/microsoft/DNS-Challenge/blob/master/download-dns-challenge-3.sh |
| | Noise data is drawn from DEMAND, FreeSound, and AudioSet. |
| | |
| | MS-SNSD |
| | https://github.com/microsoft/MS-SNSD |
| | Noise data is drawn from DEMAND and FreeSound. |
| | |
| | MUSAN |
| | https://www.openslr.org/17/ |
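| | Contains three subsets: music, noise, and speech. |
| | music consists of instrumental tracks; noise contains free-sound and sound-bible. The sound-bible part might serve as a supplement. |
| | Overall, not much of it is useful. Noise data will probably still have to be collected mostly by hand, which is more reliable. |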
| | |
| | CHiME-4 |
| | https://www.chimechallenge.org/challenges/chime4/download.html |
| | |
| | freesound |
| | https://freesound.org/ |
| | |
| | AudioSet |
| | https://research.google.com/audioset/index.html |
| | ``` |
| |
|
| |
|
| | ### Create the training container |
| |
|
| | ```text |
| | To train the model inside a container, the container must be able to access the GPU. Reference: |
| | https://hub.docker.com/r/ollama/ollama |
| | |
| | docker run -itd \ |
| | --name cc_vad \ |
| | --network host \ |
| | --gpus all \ |
| | --privileged \ |
| | --ipc=host \ |
| | -v /data/tianxing/HuggingDatasets/nx_noise/data:/data/tianxing/HuggingDatasets/nx_noise/data \ |
| | -v /data/tianxing/PycharmProjects/cc_vad:/data/tianxing/PycharmProjects/cc_vad \ |
| | python:3.12 /bin/bash |
| | |
| | |
| | Check the GPU |
| | nvidia-smi |
| | watch -n 1 -d nvidia-smi |
| | |
| | |
| | ``` |
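
The GPU check above can also be run from Python inside the training container, which is convenient before starting a long job. The following is a minimal sketch, not part of this repository: the helper name `gpu_visible` is hypothetical, and it assumes only that `nvidia-smi` is on `PATH` when the container was started with `--gpus all`.

```python
import shutil
import subprocess


def gpu_visible(tool: str = "nvidia-smi", timeout: float = 10.0) -> bool:
    """Return True if `tool` is on PATH and `tool -L` exits with status 0."""
    path = shutil.which(tool)
    if path is None:
        # Typical when the container was started without --gpus all.
        return False
    try:
        result = subprocess.run(
            [path, "-L"],  # `nvidia-smi -L` lists the GPUs it can see
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Calling `gpu_visible()` at the top of a training script and aborting when it returns `False` avoids silently falling back to CPU.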
| |
|
| | ```text |
| | Accessing the GPU from inside a container |
| | |
| | Reference: |
| | https://blog.csdn.net/footless_bird/article/details/136291344 |
| | Steps: |
| | # Install |
| | yum install -y nvidia-container-toolkit |
| | |
| | # Edit /etc/docker/daemon.json so that it contains the following |
| | cat /etc/docker/daemon.json |
| | { |
| | "data-root": "/data/lib/docker", |
| | "default-runtime": "nvidia", |
| | "runtimes": { |
| | "nvidia": { |
| | "path": "/usr/bin/nvidia-container-runtime", |
| | "runtimeArgs": [] |
| | } |
| | }, |
| | "registry-mirrors": [ |
| | "https://docker.m.daocloud.io", |
| | "https://dockerproxy.com", |
| | "https://docker.mirrors.ustc.edu.cn", |
| | "https://docker.nju.edu.cn" |
| | ] |
| | } |
| | |
| | # Restart docker |
| | systemctl daemon-reload |
| | systemctl restart docker |
| | |
| | # Test whether the GPU is accessible inside a container. |
| | docker run --gpus all python:3.12-slim nvidia-smi |
| | |
| | # A container started this way can see the GPU (lspci lists it), but it has no GPU driver, so nvidia-smi does not work. |
| | docker run -it --privileged python:3.12-slim /bin/bash |
| | apt update |
| | apt install -y pciutils |
| | lspci | grep -i nvidia |
| | #00:08.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1) |
| | |
| | # This is the approach commonly suggested online, but nvidia-smi still does not work inside the container. |
| | docker run \ |
| | --device /dev/nvidia0:/dev/nvidia0 \ |
| | --device /dev/nvidiactl:/dev/nvidiactl \ |
| | --device /dev/nvidia-uvm:/dev/nvidia-uvm \ |
| | -v /usr/local/nvidia:/usr/local/nvidia \ |
| | -it --privileged python:3.12-slim /bin/bash |
| | |
| | |
| | # A container started this way has a working nvidia-smi. The key is most likely the --gpus all flag. |
| | docker run -itd --gpus all --name open_unsloth python:3.12-slim /bin/bash |
| | docker run -itd --gpus all --name Qwen2-7B-Instruct python:3.12-slim /bin/bash |
| | |
| | ``` |
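
A typo in `daemon.json` is an easy way to end up with a container that cannot see the GPU, so it can help to sanity-check the file before restarting Docker. The sketch below is illustrative only: `check_nvidia_runtime` and `SAMPLE` are hypothetical names, and treating a missing `"default-runtime": "nvidia"` as a problem is an assumption taken from the config in the notes above, not a Docker requirement.

```python
import json

# Trimmed copy of the config from the notes above, used as a demo input.
SAMPLE = """
{
  "data-root": "/data/lib/docker",
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
"""


def check_nvidia_runtime(config_text: str) -> list[str]:
    """Return a list of problems found; an empty list means the config looks fine."""
    problems = []
    config = json.loads(config_text)  # raises json.JSONDecodeError on invalid JSON
    runtimes = config.get("runtimes", {})
    if "nvidia" not in runtimes:
        problems.append('no "nvidia" entry under "runtimes"')
    elif not runtimes["nvidia"].get("path"):
        problems.append('"runtimes.nvidia.path" is missing or empty')
    # Assumption: the notes set nvidia as the default runtime, so flag anything else.
    if config.get("default-runtime") != "nvidia":
        problems.append('"default-runtime" is not "nvidia"')
    return problems
```

To check the real file, pass the contents of `/etc/docker/daemon.json` to `check_nvidia_runtime` (reading it may require root) and print whatever problems come back.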
| |
|