medallo committed on
Commit 2675c18 · verified
1 Parent(s): ff5aea1

Upload 10 files

Files changed (10)
  1. Dockerfile +44 -0
  2. README.md +88 -14
  3. apple_silicon_requirements.txt +189 -0
  4. gitattributes +35 -0
  5. install.bat +10 -0
  6. install.sh +13 -0
  7. requirements.txt +7 -0
  8. start-container.sh +6 -0
  9. start.bat +5 -0
  10. start.sh +9 -0
Dockerfile ADDED
@@ -0,0 +1,44 @@
+ # syntax=docker/dockerfile:1
+ FROM python:3.11-slim-bookworm AS base
+
+ ARG APP_NAME=xtts-finetune-webui
+ ARG CUDA_VER=cu121
+ ARG GID=966
+ ARG UID=966
+ ARG WHISPER_MODEL="large-v3"
+
+ # Environment
+ ENV APP_NAME=$APP_NAME \
+     CUDA_VER=$CUDA_VER \
+     WHISPER_MODEL=$WHISPER_MODEL
+
+ # User configuration
+ ENV HOME=/app/$APP_NAME
+ RUN groupadd -r app -g $GID && \
+     useradd --no-log-init -m -r -g app app -u $UID
+
+ # Prepare the file system
+ RUN mkdir -p /app/server && chown -R $UID:$GID /app
+ COPY --chown=$UID:$GID *.py *.sh *.txt *.md /app/server/
+ ADD --chown=$UID:$GID utils /app/server/utils
+
+ # Enter the environment and install dependencies
+ WORKDIR /app/server
+
+ USER $UID:$GID
+
+ ENV NVIDIA_VISIBLE_DEVICES=all PATH=$PATH:$HOME/.local/bin
+ # Install nvidia-pyindex & nvidia-cudnn for libcudnn_ops_infer.so.8
+ # See: https://github.com/SYSTRAN/faster-whisper/issues/516
+ RUN pip3 install --user --no-cache-dir nvidia-pyindex && \
+     pip3 install --user --no-cache-dir nvidia-cudnn && \
+     pip3 install --user --no-cache-dir torch torchvision torchaudio \
+         --index-url https://download.pytorch.org/whl/$CUDA_VER && \
+     pip3 install --user --no-cache-dir -r requirements.txt && \
+     python3 -c "import os; from faster_whisper import WhisperModel; WhisperModel(os.environ['WHISPER_MODEL'], device='cpu', compute_type='int8')"
+
+ # Ports and server name
+ EXPOSE 5003
+ ENV GRADIO_ANALYTICS_ENABLED="False"
+
+ CMD ["bash", "start-container.sh"]
README.md CHANGED
@@ -1,14 +1,88 @@
- ---
- title: Xttsv2
- emoji: 🏃
- colorFrom: purple
- colorTo: gray
- sdk: gradio
- sdk_version: 5.27.0
- app_file: app.py
- pinned: false
- license: apache-2.0
- short_description: v2
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # xtts-finetune-webui
+
+ This webui is a slightly modified copy of the [official webui](https://github.com/coqui-ai/TTS/pull/3296) for fine-tuning XTTS.
+
+ If you are looking for an option for regular XTTS use, look here: [https://github.com/daswer123/xtts-webui](https://github.com/daswer123/xtts-webui)
+
+ ## TODO
+ - [ ] Add the ability to use via console
+
+ ## Key features
+
+ ### Data processing
+
+ 1. Updated faster-whisper to 0.10.0, with the ability to select the large-v3 model.
+ 2. Changed the output folder to an `output` folder inside the main folder.
+ 3. If there is already a dataset in the output folder and you want to add new data, you can do so by simply adding new audio; what is already there will not be processed again, and the new data will be added automatically.
+ 4. Enabled the VAD filter.
+ 5. After the dataset is created, a file is written that records the language of the dataset. This file is read before training so that the language always matches; this is convenient when you restart the interface.
+
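Points 3 and 5 above can be sketched roughly as follows. This is a minimal illustration, not the webui's actual code; the file names `metadata.csv` and `lang.txt` and the `transcribe` callback are assumptions.

```python
import os

def process_new_audio(output_dir, audio_files, transcribe, language):
    """Sketch of incremental dataset creation: only audio not yet listed
    in metadata.csv is transcribed, and the dataset language is stored
    in lang.txt so it can be re-read after an interface restart."""
    meta_path = os.path.join(output_dir, "metadata.csv")
    done = set()
    if os.path.exists(meta_path):
        with open(meta_path) as f:
            done = {line.split("|")[0] for line in f if line.strip()}
    with open(meta_path, "a") as f:
        for audio in audio_files:
            name = os.path.basename(audio)
            if name in done:
                continue  # already processed on a previous run
            f.write(f"{name}|{transcribe(audio)}\n")
    # Persist the dataset language for later runs
    with open(os.path.join(output_dir, "lang.txt"), "w") as f:
        f.write(language)
```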
+ ### Fine-tuning XTTS Encoder
+
+ 1. Added the ability to select the base model for XTTS; when you retrain, the model does not need to be downloaded again.
+ 2. Added the ability to select a custom model as the base model during training, which lets you fine-tune an already fine-tuned model.
+ 3. Added the possibility to get an optimized version of the model in one click (step 2.5 puts the optimized version in the output folder).
+ 4. You can choose whether to delete the training folders after you have optimized the model.
+ 5. When you optimize the model, the example reference audio is moved to the output folder.
+ 6. Added a check that the specified language matches the dataset language.
+
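The language check in point 6 might look roughly like this. This is a hedged sketch: the file name `lang.txt` and the mismatch handling are assumptions, not the webui's actual implementation.

```python
import os

def check_language(train_lang, dataset_dir):
    """Compare the language selected for training with the language
    recorded when the dataset was created; warn and prefer the
    recorded language on mismatch."""
    lang_file = os.path.join(dataset_dir, "lang.txt")
    if not os.path.exists(lang_file):
        return train_lang  # nothing recorded, trust the user's choice
    with open(lang_file) as f:
        dataset_lang = f.read().strip()
    if dataset_lang != train_lang:
        print(f"Warning: dataset language is '{dataset_lang}', "
              f"not '{train_lang}'; using '{dataset_lang}'.")
    return dataset_lang
```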
+ ### Inference
+
+ 1. Added the possibility to customize inference settings while checking the model.
+
+ ### Other
+
+ 1. If you accidentally restart the interface during one of the steps, you can reload the data with the additional buttons.
+ 2. Removed the log display, as it caused problems on restart.
+ 3. The finished result is copied to the `ready` folder; these are fully finished files, so you can move them anywhere and use them as a standard model.
+ 4. Added support for fine-tuning Japanese.
+
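Copying the finished model into a self-contained `ready` folder (point 3 above) amounts to something like the following. An illustrative sketch with assumed file names, not the actual implementation.

```python
import os
import shutil

def copy_to_ready(run_dir, ready_dir,
                  files=("model.pth", "config.json", "vocab.json")):
    """Collect the files a standard XTTS model needs into a `ready`
    folder that can be moved anywhere and used directly."""
    os.makedirs(ready_dir, exist_ok=True)
    for name in files:
        src = os.path.join(run_dir, name)
        if os.path.exists(src):  # skip files the run did not produce
            shutil.copy2(src, os.path.join(ready_dir, name))
    return sorted(os.listdir(ready_dir))
```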
+ ## Changes in the webui
+
+ ### 1 - Data processing
+
+ ![image](https://github.com/daswer123/xtts-finetune-webui/assets/22278673/8f09b829-098b-48f5-9668-832e7319403b)
+
+ ### 2 - Fine-tuning XTTS Encoder
+
+ ![image](https://github.com/daswer123/xtts-finetune-webui/assets/22278673/897540d9-3a6b-463c-abb8-261c289cc929)
+
+ ### 3 - Inference
+
+ ![image](https://github.com/daswer123/xtts-finetune-webui/assets/22278673/aa05bcd4-8642-4de4-8f2f-bc0f5571af63)
+
+ ## Google Colab
+
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DrewThomasson/xtts-finetune-webui/blob/main/notebook/xtts_finetune_webui.ipynb)
+
+ ## 🐳 Run in Docker
+
+ ```bash
+ docker run -it --gpus all --pull always -p 7860:7860 --platform=linux/amd64 athomasson2/fine_tune_xtts:huggingface python app.py
+ ```
+
+ ## Install
+
+ 1. Make sure you have CUDA installed.
+ 2. `git clone https://github.com/daswer123/xtts-finetune-webui`
+ 3. `cd xtts-finetune-webui`
+ 4. `pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118`
+ 5. `pip install -r requirements.txt`
+
+ ### If you're using Windows
+
+ 1. First run `install.bat`.
+ 2. To start the server, run `start.bat`.
+ 3. Go to the local address `127.0.0.1:5003`.
+
+ ### On Linux
+
+ 1. Run `bash install.sh`.
+ 2. To start the server, run `bash start.sh`.
+ 3. Go to the local address `127.0.0.1:5003`.
+
+ ### On an Apple Silicon Mac (Python 3.10 env)
+
+ 1. Run `pip install --no-deps -r apple_silicon_requirements.txt`.
+ 2. To start the server, run `python xtts_demo.py`.
+ 3. Go to the local address `127.0.0.1:5003`.
apple_silicon_requirements.txt ADDED
@@ -0,0 +1,189 @@
+ absl-py==2.1.0
+ aiofiles==23.2.1
+ aiohttp==3.9.5
+ aiosignal==1.3.1
+ altair==5.3.0
+ annotated-types==0.7.0
+ anyascii==0.3.2
+ anyio==3.7.1
+ async-timeout==4.0.3
+ attrs==23.2.0
+ audioread==3.0.1
+ av==12.2.0
+ Babel==2.15.0
+ bangla==0.0.2
+ blinker==1.8.2
+ blis==0.7.11
+ bnnumerizer==0.0.2
+ bnunicodenormalizer==0.1.7
+ catalogue==2.0.10
+ certifi==2024.7.4
+ cffi==1.16.0
+ charset-normalizer==3.3.2
+ click==8.1.7
+ cloudpathlib==0.16.0
+ colorama==0.4.6
+ coloredlogs==15.0.1
+ confection==0.1.5
+ contourpy==1.2.1
+ coqpit==0.0.17
+ coqui-tts==0.24.2
+ coqui-tts-trainer==0.1.4
+ ctranslate2==4.3.1
+ cutlet==0.4.0
+ cycler==0.12.1
+ cymem==2.0.8
+ Cython==3.0.10
+ dateparser==1.1.8
+ decorator==5.1.1
+ dnspython==2.6.1
+ docopt==0.6.2
+ einops==0.8.0
+ email_validator==2.2.0
+ encodec==0.1.1
+ exceptiongroup==1.2.2
+ fastapi==0.103.1
+ fastapi-cli==0.0.4
+ faster-whisper==1.0.2
+ ffmpy==0.3.2
+ filelock==3.15.4
+ Flask==3.0.3
+ flatbuffers==24.3.25
+ fonttools==4.53.1
+ frozenlist==1.4.1
+ fsspec==2024.6.1
+ fugashi==1.3.2
+ g2pkk==0.1.2
+ gradio==4.44.1
+ gradio_client==1.3.0
+ grpcio==1.64.1
+ gruut==2.4.0
+ gruut-ipa==0.13.0
+ gruut_lang_de==2.0.1
+ gruut_lang_en==2.0.1
+ gruut_lang_es==2.0.1
+ gruut_lang_fr==2.0.2
+ h11==0.14.0
+ hangul-romanize==0.1.0
+ httpcore==1.0.5
+ httptools==0.6.1
+ httpx==0.27.0
+ huggingface-hub==0.23.5
+ humanfriendly==10.0
+ idna==3.7
+ importlib_resources==6.4.0
+ inflect==7.3.1
+ itsdangerous==2.2.0
+ jaconv==0.4.0
+ jamo==0.4.1
+ jieba==0.42.1
+ Jinja2==3.1.4
+ joblib==1.4.2
+ jsonlines==1.2.0
+ jsonschema==4.23.0
+ jsonschema-specifications==2023.12.1
+ kiwisolver==1.4.5
+ langcodes==3.4.0
+ language_data==1.2.0
+ lazy_loader==0.4
+ librosa==0.10.2.post1
+ llvmlite==0.43.0
+ marisa-trie==1.2.0
+ Markdown==3.6
+ markdown-it-py==3.0.0
+ MarkupSafe==2.1.5
+ matplotlib==3.8.4
+ mdurl==0.1.2
+ mecab-python3==1.0.9
+ mojimoji==0.0.13
+ more-itertools==10.3.0
+ mpmath==1.3.0
+ msgpack==1.0.8
+ multidict==6.0.5
+ murmurhash==1.0.10
+ networkx==2.8.8
+ nltk==3.8.1
+ num2words==0.5.13
+ numba==0.60.0
+ numpy==1.26.4
+ onnxruntime==1.18.1
+ orjson==3.10.6
+ packaging==24.1
+ pandas==1.5.3
+ pillow==10.4.0
+ platformdirs==4.2.2
+ pooch==1.8.2
+ preshed==3.0.9
+ protobuf==4.25.3
+ psutil==6.0.0
+ pycparser==2.22
+ pydantic==2.3.0
+ pydantic_core==2.6.3
+ pydub==0.25.1
+ pygame==2.6.0
+ Pygments==2.18.0
+ pynndescent==0.5.13
+ pyparsing==3.1.2
+ pypinyin==0.51.0
+ pysbd==0.3.4
+ python-crfsuite==0.9.10
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.1
+ python-multipart==0.0.9
+ pytz==2024.1
+ PyYAML==6.0.1
+ referencing==0.35.1
+ regex==2024.5.15
+ requests==2.32.3
+ rich==13.7.1
+ rpds-py==0.19.0
+ ruff==0.5.2
+ safetensors==0.4.3
+ scikit-learn==1.5.1
+ scipy==1.11.4
+ semantic-version==2.10.0
+ shellingham==1.5.4
+ six==1.16.0
+ smart-open==6.4.0
+ sniffio==1.3.1
+ soundfile==0.12.1
+ soxr==0.3.7
+ spacy==3.7.4
+ spacy-legacy==3.0.12
+ spacy-loggers==1.0.5
+ srsly==2.4.8
+ starlette==0.27.0
+ SudachiDict-core==20240409
+ SudachiPy==0.6.8
+ sympy==1.13.0
+ tensorboard==2.17.0
+ tensorboard-data-server==0.7.2
+ thinc==8.2.5
+ threadpoolctl==3.5.0
+ tokenizers==0.19.1
+ tomlkit==0.12.0
+ toolz==0.12.1
+ torch==2.3.1
+ torchaudio==2.3.1
+ tqdm==4.66.4
+ trainer==0.0.36
+ transformers==4.42.4
+ TTS==0.21.3
+ typeguard==4.3.0
+ typer==0.12.5
+ typing_extensions==4.12.2
+ tzdata==2024.1
+ tzlocal==5.2
+ umap-learn==0.5.6
+ Unidecode==1.3.8
+ unidic-lite==1.0.8
+ urllib3==2.2.2
+ uvicorn==0.30.1
+ uvloop==0.19.0
+ wasabi==1.1.3
+ watchfiles==0.22.0
+ weasel==0.3.4
+ websockets==11.0.3
+ Werkzeug==3.0.3
+ wrapt==1.16.0
+ yarl==1.9.4
gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
install.bat ADDED
@@ -0,0 +1,10 @@
+ @echo off
+
+ python -m venv venv
+ call venv\Scripts\activate
+
+
+ pip install -r .\requirements.txt
+ pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118
+
+ python xtts_demo.py
install.sh ADDED
@@ -0,0 +1,13 @@
+ #!/bin/bash
+
+ # Create a Python virtual environment
+ python -m venv venv
+ # Activate the virtual environment
+ source venv/bin/activate
+
+ # Install other dependencies from requirements.txt
+ pip install -r requirements.txt
+ pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118
+
+ python xtts_demo.py
+
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ faster_whisper==1.0.3
+ gradio==5.1.0
+ spacy==3.7.5
+ coqui-tts[languages]==0.24.2
+
+ cutlet
+ fugashi[unidic-lite]
start-container.sh ADDED
@@ -0,0 +1,6 @@
+ #!/bin/bash
+
+ # Enable resolution of libcudnn_ops_infer.so.8
+ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/app/xtts-finetune-webui/.local/lib/python3.11/site-packages/torch/lib:/app/xtts-finetune-webui/.local/lib/python3.11/site-packages/nvidia/cudnn/lib"
+
+ python3 xtts_demo.py
start.bat ADDED
@@ -0,0 +1,5 @@
+ @echo off
+
+ call venv\Scripts\activate
+
+ python xtts_demo.py
start.sh ADDED
@@ -0,0 +1,9 @@
+ #!/bin/bash
+
+ # Create a Python virtual environment
+ python -m venv venv
+ # Activate the virtual environment
+ source venv/bin/activate
+
+ python xtts_demo.py
+