	---
	license: apache-2.0
	language:
	- th
	---
	# My first TTS

	[Fine-tuning Colab notebook](https://colab.research.google.com/drive/1FdCg-fjwiwrkAHXXqYGWq--Lmz_J10NI?usp=sharing)


	## Example Code

	```python
	import torch
	from transformers import VitsTokenizer, VitsModel, set_seed
	import scipy.io.wavfile

	device = "cuda" if torch.cuda.is_available() else "cpu"

	model = VitsModel.from_pretrained("meguscx/VITS-TH-Model").to(device)
	tokenizer = VitsTokenizer.from_pretrained("meguscx/VITS-TH-Model")

	text = "การเรียนรู้ภาษาใหม่ช่วยเปิดโลกทัศน์ให้กว้างขึ้น"

	inputs = tokenizer(text=text, return_tensors="pt").to(device)

	set_seed(456)

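	# Run inference without tracking gradients
	with torch.no_grad():
	    outputs = model(**inputs)

	# Take the first (only) waveform in the batch and move it to a NumPy array
	waveform = outputs.waveform[0].cpu().numpy()

	scipy.io.wavfile.write(
	    "test.wav",
	    rate=model.config.sampling_rate,
	    data=waveform,
	)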

	print("Saved successfully.")
	```
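The example above writes the waveform as floating-point samples, which some audio players handle poorly. As a minimal sketch, assuming the samples lie roughly in the range -1.0 to 1.0, the following standard-library helpers write a 16-bit PCM WAV file instead. The names `float_to_pcm16` and `write_pcm16_wav` are hypothetical and are not part of `transformers` or this model.

```python
import struct
import wave


def float_to_pcm16(samples):
    """Clip float samples to [-1.0, 1.0] and scale them to 16-bit integers."""
    pcm = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))
    return pcm


def write_pcm16_wav(path, samples, sampling_rate):
    """Write mono float samples to `path` as a 16-bit PCM WAV file."""
    pcm = float_to_pcm16(samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono audio (assumed: one channel)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        # Pack all samples as little-endian signed 16-bit integers
        wav_file.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
```

With the variables from the example code, this could be called as `write_pcm16_wav("test_pcm16.wav", waveform.tolist(), model.config.sampling_rate)`.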

	This model was trained for 6 hours on only 1,094 voice samples from a single speaker.
	Because the dataset is small, the voice quality is limited and the output
	sometimes sounds odd or unnatural.


	## Sample Audio
	<audio controls>
	<source src="https://huggingface.co/meguscx/VITS-TH-Model/resolve/main/test.wav" type="audio/wav">
	Your browser does not support the audio element.
	</audio>