	---
	library_name: transformers
	license: apache-2.0
	language:
	- wo
	base_model:
	- parler-tts/parler-tts-mini-multilingual-v1.1
	pipeline_tag: text-to-speech
	---

	# Adia_TTS Documentation

	## Introduction
	Adia_TTS is a single-speaker Wolof extension of the `parler-tts-mini-multilingual-v1.1` model. It was trained on a 40-hour Wolof dataset and fine-tuned for 100 epochs, which took about 168 hours of training.

	The model generates a natural, fluent voice comparable to that of a human speaker.
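
The training figures above imply a rough per-epoch cost. The following back-of-the-envelope sketch assumes that the 168 hours are wall-clock time and that each epoch passes over the full 40 hours of audio exactly once; neither point is stated explicitly, so treat the results as estimates only.

```python
# Figures taken from the introduction.
dataset_hours = 40      # hours of Wolof audio
epochs = 100            # fine-tuning epochs
training_hours = 168    # approximate total training time

# Assumption: training_hours is wall-clock time and every epoch
# covers the whole dataset exactly once.
hours_per_epoch = training_hours / epochs
audio_hours_seen = dataset_hours * epochs
audio_hours_per_training_hour = audio_hours_seen / training_hours

print(f"{hours_per_epoch:.2f} h per epoch")
print(f"{audio_hours_seen} h of audio processed in total")
print(f"{audio_hours_per_training_hour:.1f} h of audio per training hour")
```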

	## Installation
	Using Adia_TTS is straightforward. First, install the `Parler-TTS` library by running the following command:
	```sh
	pip install git+https://github.com/huggingface/parler-tts.git
	```
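
After installing, it can be useful to confirm that the modules imported by the usage example in this README are actually available. The sketch below is a minimal check using only the standard library; `is_installed` is a hypothetical helper name, and the list of modules is simply read off the example's import lines.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Import names used by the usage example (hypothetical checklist).
for name in ("torch", "parler_tts", "transformers", "soundfile"):
    status = "ok" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Any module reported as `missing` needs to be installed separately before running the example.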

	## Usage
	Adia_TTS exposes the same interface as earlier `Parler-TTS` releases. Voice quality can be adjusted by changing the description passed to the model, specifying criteria such as a clear voice, a monotone delivery, or no background noise.

	### Usage example
	The following code shows how to use Adia_TTS to generate an audio file:

	```py
	import torch
	from parler_tts import ParlerTTSForConditionalGeneration
	from transformers import AutoTokenizer
	from IPython.display import Audio
	import soundfile as sf

	# Detect the available device
	device = "cuda:0" if torch.cuda.is_available() else "cpu"

	# Load the model and the tokenizer
	model = ParlerTTSForConditionalGeneration.from_pretrained("Moustapha91/parler-tts-wolof").to(device)
	tokenizer = AutoTokenizer.from_pretrained("Moustapha91/parler-tts-wolof")

	# Define the input text (example in Wolof)
	prompt = """Am na yoon yu bari yoo mëna amee xaalis ngir sa projet: Liggéeyandoo ak: Sàkku jàppale ci yokkute mbay ak transformation produit yi. Defar bu baax sa projet te jokkoo ak ñoom. Banqi yi: Demal ci banq yi ngir ñu may la crédit ngir tambali sa projet"""

	# Description that influences the quality of the synthesized speech
	description = "A crystal clear and distinct voice, with a moderate reading rate that facilitates understanding. The tone is monotonous, without variations or inflections, which provides a uniform listening experience. The voice is free of background noise and allows for continuous reading, without inappropriate pauses, thus ensuring a constant and pleasant flow."

	# Tokenize the inputs
	input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
	prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

	# Generate the audio
	generation = model.generate(
	    input_ids=input_ids,
	    prompt_input_ids=prompt_input_ids,
	)

	# Convert the output to a NumPy array and save it as a WAV file
	audio_arr = generation.cpu().numpy().squeeze()
	sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)

	# Play the audio inline (only takes effect in a notebook environment)
	Audio(audio_arr, rate=model.config.sampling_rate)
	```
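
The example prompt above is fairly long. If long inputs prove problematic, one option is to split the text into sentence-sized pieces and synthesize them one at a time. The sketch below shows only the splitting step; `split_into_chunks` and its `max_chars` default are hypothetical, and this README does not state whether chunking improves Adia_TTS output.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Group sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept whole.
    """
    # Split after '.', '!', '?' or ':' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?:])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("Aa bb. Cc dd. Ee ff.", max_chars=13))
```

Each chunk could then be tokenized and passed as `prompt_input_ids` in the same way as the full prompt, and the resulting arrays joined with `numpy.concatenate` before writing the file.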

	## Adjusting Audio Quality
	Audio quality can be changed by adjusting the description passed to the model. Here are a few examples:

	```py
	# Alternative descriptions: keep only the one you want to use
	description = "Adia speaks slowly with a very clear recording but a monotone voice."
	description = "Adia's speech is very quiet and monotone, delivered with a very small amount of discernible expression."
	description = "Adia's voice comes across as very monotone, speaking slowly with very clear sounds and no background noise."
	```

	By varying the description, you can obtain speech output that differs in clarity, expressiveness, and speaking rate.
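
Since the description is plain English text, it can be assembled from a few attributes rather than written by hand each time. The helper below is a minimal sketch: `build_description` and its parameter names are hypothetical and not part of `Parler-TTS`, and the wording it produces simply imitates the examples above.

```python
def build_description(
    speaker: str = "Adia",
    pace: str = "slowly",
    tone: str = "monotone",
    recording: str = "very clear",
    background_noise: bool = False,
) -> str:
    """Assemble a description sentence from a few voice attributes."""
    noise = "some background noise" if background_noise else "no background noise"
    return (
        f"{speaker} speaks {pace} in a {tone} voice, "
        f"with a {recording} recording and {noise}."
    )


print(build_description())
print(build_description(pace="at a moderate pace", tone="slightly expressive"))
```

The returned string can be used directly as the `description` variable in the usage example.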

	## References
	```
	@misc{CONCREE-2024-Adia_TTS,
	author = {CONCREE},
	title = {Adia_TTS},
	year = {2024},
	publisher = {Hugging Face},
	journal = {Hugging Face repository},
	howpublished = {\url{https://huggingface.co/CONCREE/Adia_TTS}}
	}

	@misc{lyth2024natural,
	title={Natural language guidance of high-fidelity text-to-speech with synthetic annotations},
	author={Dan Lyth and Simon King},
	year={2024},
	eprint={2402.01912},
	archivePrefix={arXiv},
	primaryClass={cs.SD}
	}
	```

	## License
	This model is released under the permissive Apache 2.0 license, which allows free use and modification subject to certain conditions.