LFM-MALAYALAM-TTS-v0.1 / README.md

Praha-Labs

metadata

base_model:
  - LiquidAI/LFM2-350M
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - lfm2
  - trl
license: apache-2.0
language:
  - en
  - ml
pipeline_tag: text-to-speech

Malayalam TTS Model (LFM2-350M Fine-tuned)

This repository contains a Malayalam text-to-speech (TTS) model fine-tuned from LFM2-350M using VyvoTTS, an LLM-based TTS framework, and Unsloth.

Malayalam TTS — 24 kHz (LLM + SNAC Codec)

A Malayalam text-to-speech model that targets natural pronunciation and clean prosody at 24 kHz. It uses a discrete audio codec (SNAC 24 kHz) for waveform reconstruction and is designed for lightweight deployment (~350M parameters) on GPU or CPU.
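To make the "LLM + SNAC codec" split concrete, here is a minimal sketch of how a flat stream of generated audio codes could be regrouped into the three SNAC codebook layers before decoding. The layout is an assumption borrowed from Orpheus-style LLM TTS models (7 tokens per frame, each position offset by a multiple of the 4096-entry codebook), and this repository does not confirm it. The function names, the 2048-samples-per-frame figure, and the expectation that the tokenizer's audio-token base has already been subtracted are likewise illustrative.

```python
from typing import List, Sequence

FRAME_SIZE = 7            # assumed: 1 coarse + 2 mid + 4 fine codes per frame
CODEBOOK_SIZE = 4096      # assumed per-position offset (SNAC 24 kHz codebook size)
SAMPLES_PER_FRAME = 2048  # assumed audio samples covered by one frame
SAMPLE_RATE = 24_000      # from the model card: 24 kHz output


def split_snac_layers(codes: Sequence[int]) -> List[List[int]]:
    """De-interleave flat 7-token frames into [coarse, mid, fine] layers.

    Hypothetical helper: assumes position p in each frame carries an
    offset of p * CODEBOOK_SIZE. Incomplete trailing frames are dropped.
    """
    n_frames = len(codes) // FRAME_SIZE
    coarse: List[int] = []
    mid: List[int] = []
    fine: List[int] = []
    for f in range(n_frames):
        frame = codes[f * FRAME_SIZE:(f + 1) * FRAME_SIZE]
        # Remove the per-position offset so every code lies in [0, CODEBOOK_SIZE).
        c = [tok - pos * CODEBOOK_SIZE for pos, tok in enumerate(frame)]
        coarse.append(c[0])
        mid.extend([c[1], c[4]])
        fine.extend([c[2], c[3], c[5], c[6]])
    return [coarse, mid, fine]


def estimated_seconds(codes: Sequence[int]) -> float:
    """Rough audio duration implied by the number of complete frames."""
    return (len(codes) // FRAME_SIZE) * SAMPLES_PER_FRAME / SAMPLE_RATE
```

Under these assumptions the three lists would be converted to tensors and handed to the SNAC decoder. The duration estimate can also help when choosing a generation length for a target clip.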

Status: v0.1. Inference is stable and pronunciation is strong, but emotional expressiveness is limited. The roadmap includes expressive styles and non-verbal cues (laughter, giggles, breaths).

✨ Highlights

Language: Malayalam (with support for basic English loanwords).

Sample Rate: 24 kHz, mono.

Codec: SNAC 24 kHz, for fast decoding.

Model Size: ~350M parameters (small/efficient).

Strengths: Clear, non‑robotic pronunciation; punctuation‑aware phrasing.

Known Limits: Emotion range is narrow; limited style transfer; no speaker cloning in v0.1.
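Since the output format is 24 kHz mono, the following sketch shows one way to save decoded audio with those parameters using only the Python standard library. It assumes the decoder yields float samples in the range [-1.0, 1.0]. The helper name `write_wav_24k_mono` is illustrative and is not part of this repository or of VyvoTTS.

```python
import io
import math
import struct
import wave
from typing import Iterable

SAMPLE_RATE = 24_000  # from the model card: 24 kHz, mono


def write_wav_24k_mono(samples: Iterable[float], target) -> None:
    """Write float samples as 16-bit PCM, mono, 24 kHz.

    `target` may be a file path or a seekable binary file object.
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to the valid range
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)         # mono
        wav.setsampwidth(2)         # 16-bit samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(pcm))


def sine_tone(freq_hz: float, seconds: float) -> list:
    """Stand-in for decoded model output: a plain sine tone."""
    n = int(SAMPLE_RATE * seconds)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]
```

For example, `write_wav_24k_mono(sine_tone(440.0, 1.0), "tone.wav")` writes a one-second test file. Passing an `io.BytesIO()` instead of a path keeps the audio in memory.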

📖 Model Details

Base Model: LFM2-350M
Language: Malayalam
Dataset: ai4bharat/rasa (Malayalam subset)
Training: 10 epochs, ~77k steps
Frameworks Used: VyvoTTS, Unsloth
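As a quick sanity check on the training figures, the arithmetic below derives the approximate number of optimizer steps per epoch from the stated totals. The effective batch size is not given in this card, so it appears only as a hypothetical parameter and any example count derived from it is illustrative.

```python
TOTAL_STEPS = 77_000  # "~77k steps" from the model card (approximate)
EPOCHS = 10           # from the model card


def steps_per_epoch(total_steps: int = TOTAL_STEPS, epochs: int = EPOCHS) -> float:
    """Approximate optimizer steps in one pass over the training data."""
    return total_steps / epochs


def implied_examples_per_epoch(effective_batch_size: int) -> float:
    """Training examples per epoch for a given effective batch size.

    effective_batch_size is hypothetical: the card does not state it.
    """
    return steps_per_epoch() * effective_batch_size
```

With the stated numbers, `steps_per_epoch()` comes to roughly 7,700 steps per epoch.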

🔮 Future Work

Emotion and expressive style support
Non-verbal cues (laughter, giggles, breaths)
Multi-speaker extension