This model is part of the GAIA suite for African and low-resource languages, adapted to Ewe (ʋegbe).
This is a fine-tuned version of OuteAI/Llama-OuteTTS-1.0-1B specifically trained to synthesize speech in the Ewe language. The model was fine-tuned using the Unsloth library with 16-bit LoRA adapters (Rank 64) for memory-efficient and fast training.
- Language: Ewe (ee)
- Base model: OuteAI/Llama-OuteTTS-1.0-1B
- Dataset: google/WaxalNLP (Ewe TTS subset)
- Libraries: transformers, trl, unsloth

This model is intended for generating Ewe speech from text.
If you use this model in your research, applications, or projects, you must cite and attribute Junior Adenyo.
Numbers in the input must be written out as words (e.g., convert 240 to its Ewe word equivalent) before feeding the text to the model.

```python
import torch
import re
from unsloth import FastModel

# Load the fine-tuned model
model, tokenizer = FastModel.from_pretrained(
    model_name="analist/oute_ewe_r64_16bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=False,
)
FastModel.for_inference(model)

# Prepare your Ewe text
input_text = "Ya ʋuduʋudu si ƒe kpekpeme anɔ abe agbadroƒe blaatɔ̄ le gaƒoƒo ɖeka me ene la, aƒo."
formatted_text = "<|text_start|>" + input_text + "<|text_end|>"
prompt = "\n".join([
    "<|im_start|>",
    formatted_text,
    "<|audio_start|><|global_features_start|>",
])
model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

# Generate audio tokens
with torch.inference_mode():
    with torch.amp.autocast("cuda", dtype=model.dtype):
        generated_ids = model.generate(
            **model_inputs,
            temperature=0.1,
            top_k=40,
            top_p=0.9,
            repetition_penalty=1.0,
            min_p=0.05,
            max_new_tokens=4096,
        )

# Extract the two codebook streams from the decoded token string
decoded_output = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
c1 = list(map(int, re.findall(r"<\|c1_(\d+)\|>", decoded_output)))
c2 = list(map(int, re.findall(r"<\|c2_(\d+)\|>", decoded_output)))
t = min(len(c1), len(c2))  # align both streams to the same length
audio_tokens = [c1[:t], c2[:t]]

# Note: to decode the generated tokens into a waveform,
# you will need the DAC (Descript Audio Codec) interface from the OuteTTS library.
# from outetts.dac.interface import DacInterface
# dac = DacInterface()
# audio_waveform = dac.decode(torch.tensor([audio_tokens], dtype=torch.int64).to(dac.device))
```
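To make the codebook-extraction step concrete, here is a minimal, self-contained illustration on a toy decoded string (made-up token indices, not real model output):

```python
import re

# Toy string imitating the model's output format: interleaved codebook-1
# and codebook-2 tokens, with one trailing unpaired c1 token.
decoded = "<|c1_12|><|c2_345|><|c1_7|><|c2_89|><|c1_500|>"

c1 = list(map(int, re.findall(r"<\|c1_(\d+)\|>", decoded)))
c2 = list(map(int, re.findall(r"<\|c2_(\d+)\|>", decoded)))

# Truncate both streams to the same length so every frame has a full pair;
# the dangling c1_500 token is dropped.
t = min(len(c1), len(c2))
audio_tokens = [c1[:t], c2[:t]]

print(audio_tokens)  # [[12, 7], [345, 89]]
```

Each column of `audio_tokens` is one audio frame, which is why the shorter stream sets the usable length.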
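Because the model expects digits already written out as Ewe number words (a conversion this card does not provide), a small guard can catch unnormalized input before synthesis. This is a generic sketch, not part of the OuteTTS or Unsloth APIs:

```python
import re

def check_no_digits(text: str) -> str:
    """Raise if the input still contains Arabic digits that should have
    been converted to Ewe number words before synthesis."""
    leftover = re.findall(r"\d+", text)
    if leftover:
        raise ValueError(f"Convert these numbers to Ewe words first: {leftover}")
    return text

check_no_digits("Ya ʋuduʋudu si ƒe kpekpeme anɔ abe agbadroƒe blaatɔ̄ le gaƒoƒo ɖeka me ene la, aƒo.")  # passes
# check_no_digits("agbadroƒe 240")  # would raise ValueError
```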