
Gaia Suite: Llama-OuteTTS-1.0-1B - Ewe (ee)

This model is part of the Gaia Suite of models for local languages; this release is adapted to Ewe (ʋegbe).

This is a fine-tuned version of OuteAI/Llama-OuteTTS-1.0-1B specifically trained to synthesize speech in the Ewe language. The model was fine-tuned using the Unsloth library with 16-bit LoRA adapters (Rank 64) for memory-efficient and fast training.

Model Details

  • Model Type: Text-to-Speech (TTS) Auto-Regressive Language Model
  • Language(s): Ewe (ee)
  • Base Model: OuteAI/Llama-OuteTTS-1.0-1B
  • Training Dataset: google/WaxalNLP (Ewe TTS subset)
  • Fine-Tuning Method: LoRA
  • Framework: Hugging Face transformers, trl, unsloth
  • License: CC-BY-4.0 (Attribution required)

Intended Use

This model is intended for generating Ewe speech from text. It is suitable for:

  • Accessibility tools for Ewe speakers
  • Educational applications and language learning
  • Voice assistants and read-aloud features in Ewe

Citation & Attribution

If you use this model in your research, applications, or projects, you must cite and attribute Junior Adenyo.

Limitations & Preprocessing

  • Text Normalization: Like many TTS models, this model struggles with raw numbers, acronyms, and special symbols. It is highly recommended to spell out numbers and dates in Ewe (e.g., convert 240 to its Ewe word equivalent) before feeding the text to the model.
  • Ewe Orthography: Ensure the input text correctly uses Ewe specific characters (Ɖ, Ɛ, Ƒ, Ɣ, Ŋ, Ɔ, Ʋ, ɖ, ɛ, ƒ, ɣ, ŋ, ɔ, ʋ) as the tokenizer has been explicitly resized to support them.
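As a rough illustration of the preprocessing checks above, here is a small hypothetical helper (not part of this repository) that flags characters outside basic Latin, common punctuation, combining tone marks, and the Ewe-specific set, and surfaces raw digits that should be spelled out before synthesis:

```python
import re

# Ewe-specific characters the tokenizer was resized to support (per this card).
EWE_SPECIALS = "ƉƐƑƔŊƆƲɖɛƒɣŋɔʋ"

# Allowed: basic Latin letters, whitespace, common punctuation, Ewe specials,
# and combining tone marks (U+0300-U+036F). Digits are reported separately.
_ALLOWED = re.compile(
    "[A-Za-z\\s.,;:!?'\"()\\-" + EWE_SPECIALS + "\u0300-\u036f]"
)

def check_ewe_text(text: str) -> dict:
    """Return unexpected characters and raw digit runs found in `text`."""
    unexpected = [ch for ch in text if not _ALLOWED.match(ch) and not ch.isdigit()]
    digits = re.findall(r"\d+", text)
    return {"unexpected": unexpected, "digits": digits}
```

If `digits` is non-empty, spell those numbers out in Ewe before calling the model; if `unexpected` is non-empty, check the input for mis-encoded characters.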

Usage (with OuteTTS and Unsloth)

import torch
import re
from unsloth import FastModel

# Load the fine-tuned model
model, tokenizer = FastModel.from_pretrained(
    model_name="analist/oute_ewe_r64_16bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=False,
)
FastModel.for_inference(model)

# Prepare your Ewe text
input_text = "Ya ʋuduʋudu si ƒe kpekpeme anɔ abe agbadroƒe blaatɔ̄ le gaƒoƒo ɖeka me ene la, aƒo."
formatted_text = "<|text_start|>" + input_text + "<|text_end|>"
prompt = "\n".join([
    "<|im_start|>",
    formatted_text,
    "<|audio_start|><|global_features_start|>",
])

model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

# Generate audio tokens
with torch.inference_mode():
    with torch.amp.autocast('cuda', dtype=model.dtype):
        generated_ids = model.generate(
            **model_inputs,
            temperature=0.1, 
            top_k=40,
            top_p=0.9,
            repetition_penalty=1.0, 
            min_p=0.05,
            max_new_tokens=4096,
        )

# Decode audio tokens to audio codes
decoded_output = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
c1 = list(map(int, re.findall(r"<\|c1_(\d+)\|>", decoded_output)))
c2 = list(map(int, re.findall(r"<\|c2_(\d+)\|>", decoded_output)))

t = min(len(c1), len(c2))
audio_tokens = [c1[:t], c2[:t]]

# Note: To decode the generated tokens into a waveform, 
# you will need the DAC (Descript Audio Codec) interface from the OuteTTS library.
# from outetts.dac.interface import DacInterface
# dac = DacInterface()
# audio_waveform = dac.decode(torch.tensor([audio_tokens], dtype=torch.int64).to(dac.device))
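Once the DAC decode step has produced a waveform, it can be written to a standard WAV file using only the Python standard library and NumPy. This helper is illustrative and not part of OuteTTS; the 24 kHz default sample rate is an assumption, so check the DAC configuration shipped with the OuteTTS library:

```python
import wave
import numpy as np

def save_waveform(audio_waveform, path, sample_rate=24_000):
    """Write a mono float waveform (values in [-1, 1]) as 16-bit PCM WAV.

    `audio_waveform` may be a NumPy array or a CPU torch tensor; the
    24 kHz default sample rate is an assumption about the DAC output.
    """
    samples = np.asarray(audio_waveform, dtype=np.float32).reshape(-1)
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
```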

Training Procedure

  • Batch Size: 2 (with Gradient Accumulation steps = 8)
  • Learning Rate: 5e-5
  • Epochs: 6
  • Optimizer: adamw_8bit
  • Hardware: Trained on a single NVIDIA RTX PRO 6000 Blackwell Edition.
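The hyperparameters above translate into an Unsloth/TRL setup along these lines. This is a sketch, not the original training script: the hyperparameter values come from this card, but the dataset loading, LoRA target modules, and `output_dir` are assumptions or omitted.

```python
from unsloth import FastModel
from trl import SFTConfig, SFTTrainer

# Reload the base model for training (16-bit, as in the model card).
model, tokenizer = FastModel.from_pretrained(
    model_name="OuteAI/Llama-OuteTTS-1.0-1B",
    max_seq_length=2048,
    load_in_4bit=False,
)
model = FastModel.get_peft_model(model, r=64)  # Rank-64 LoRA adapters

# `train_dataset` stands for the Ewe TTS subset of google/WaxalNLP,
# preprocessed into the <|text_start|>…<|audio_start|>… prompt format.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,  # effective batch size of 16
        learning_rate=5e-5,
        num_train_epochs=6,
        optim="adamw_8bit",
        output_dir="outputs",           # assumed, not stated in the card
    ),
)
trainer.train()
```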

Acknowledgements

  • Model architecture by OuteAI.
  • Dataset provided by Google's WaxalNLP project.
  • Fine-tuning powered by Unsloth.