Praha-Labs/LFM-MALAYALAM-TTS-v0.1

	---
	base_model:
	- LiquidAI/LFM2-350M
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- lfm2
	- trl
	license: apache-2.0
	language:
	- en
	- ml
	pipeline_tag: text-to-speech
	---

	# Malayalam TTS Model (LFM2-350M Fine-tuned)

	This repository contains a Malayalam text-to-speech (TTS) model fine-tuned from LFM2-350M with [VyvoTTS](https://github.com/Vyvo-Labs/VyvoTTS), an LLM-based TTS framework, and [Unsloth](https://github.com/unslothai/unsloth).

	---
	## Malayalam TTS, 24 kHz (LLM + SNAC Codec)

	A Malayalam text-to-speech model that targets natural pronunciation and clean prosody at 24 kHz. It uses a discrete audio codec (SNAC 24 kHz) to reconstruct the waveform. At ~350M parameters, it is designed for lightweight deployment on GPU or CPU.
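
Because the codec is discrete, the length of an utterance maps to a roughly predictable number of audio codes, which helps when choosing a generation length limit. The sketch below estimates that budget. It assumes the published SNAC 24 kHz layout (three codebooks, a finest hop of 512 samples, strides of 4, 2 and 1) and assumes the language model emits one token per codec code. This card states neither, so treat the constants and helper names as illustrative.

```python
import math

SAMPLE_RATE = 24_000   # stated in this card
FINEST_HOP = 512       # assumption: samples per code in the finest SNAC 24 kHz codebook
STRIDES = (4, 2, 1)    # assumption: coarse-to-fine codebook strides


def codes_per_coarse_frame(strides=STRIDES):
    """Number of codes across all codebooks that cover one coarse frame."""
    coarsest = max(strides)
    return sum(coarsest // s for s in strides)


def audio_token_budget(seconds, sample_rate=SAMPLE_RATE):
    """Estimate how many codec codes are needed for `seconds` of audio."""
    samples_per_frame = FINEST_HOP * max(STRIDES)
    frames = math.ceil(seconds * sample_rate / samples_per_frame)
    return frames * codes_per_coarse_frame()


if __name__ == "__main__":
    for duration in (1.0, 5.0, 10.0):
        print(f"{duration:>4.1f} s -> about {audio_token_budget(duration)} audio codes")
```

Under these assumptions each coarse frame spans 2048 samples (about 85 ms) and carries 7 codes, so a generation limit can be sized from the longest utterance you expect, plus whatever the text prompt and any special tokens require.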

	Status: v0.1. Inference is stable and pronunciation is strong, but emotional expressiveness is limited. The roadmap includes expressive styles and non-verbal cues (laughter, giggles, breaths).

	## ✨ Highlights
	- Language: Malayalam (with support for basic English loanwords).
	- Sample Rate: 24 kHz, mono.
	- Codec: SNAC 24 kHz for fast decoding.
	- Model Size: ~350M parameters (small/efficient).
	- Strengths: Clear, non-robotic pronunciation; punctuation-aware phrasing.
	- Known Limits: Emotion range is narrow; limited style transfer; no speaker cloning in v0.1.
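
Since the output format is 24 kHz mono, decoded audio should be saved with exactly those parameters, or it will play back at the wrong speed. The following is a minimal sketch that uses only the Python standard library. The helper name `write_wav_24k_mono` is hypothetical, and the sine tone stands in for samples that would come from the codec decoder.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 24_000  # matches the sample rate stated in this card


def write_wav_24k_mono(samples, target):
    """Write float samples in [-1.0, 1.0] as 16-bit PCM, 24 kHz, mono.

    `target` may be a file path or a seekable binary file-like object.
    """
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(target, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 16-bit samples
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(pcm)


# Placeholder signal: one second of a 440 Hz tone instead of real model output.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
buffer = io.BytesIO()
write_wav_24k_mono(tone, buffer)
print(f"wrote {len(buffer.getvalue())} bytes")
```

Passing a path such as `"output.wav"` instead of the in-memory buffer writes the same data to disk.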

	## 📖 Model Details
	- Base Model: LFM2-350M
	- Language: Malayalam
	- Dataset: [ai4bharat/rasa](https://huggingface.co/datasets/ai4bharat/rasa) (Malayalam subset)
	- Training: 10 epochs, ~77k steps
	- Frameworks Used: VyvoTTS, Unsloth

	---
	## 🔮 Future Work
	- Emotion and expressive style support
	- Non-verbal cues (laughter, giggles, breaths)
	- Multi-speaker extension