---
license: apache-2.0
language:
- th
---

# My first TTS

[Finetune colab](https://colab.research.google.com/drive/1FdCg-fjwiwrkAHXXqYGWq--Lmz_J10NI?usp=sharing)

## Example Code

```python
import torch
from transformers import VitsTokenizer, VitsModel, set_seed
import scipy.io.wavfile

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = VitsModel.from_pretrained("meguscx/VITS-TH-Model").to(device)
tokenizer = VitsTokenizer.from_pretrained("meguscx/VITS-TH-Model")

# Thai input text: "Learning a new language helps broaden your worldview."
text = "การเรียนรู้ภาษาใหม่ช่วยเปิดโลกทัศน์ให้กว้างขึ้น"
inputs = tokenizer(text=text, return_tensors="pt").to(device)

# VITS sampling is stochastic, so fix the seed for reproducible output.
set_seed(456)

with torch.no_grad():
    outputs = model(**inputs)

# Take the first (only) waveform in the batch and move it to NumPy.
waveform = outputs.waveform[0].cpu().numpy()

scipy.io.wavfile.write(
    "test.wav",
    rate=model.config.sampling_rate,
    data=waveform,
)

print("Saved successfully.")
```

This model was trained for 6 hours on only 1,094 voice samples from a single speaker, so the voice quality may not be very good, and the output can sometimes sound weird or unnatural because the dataset is small ;-;

## Sample Audio