Praha-Labs committed
Commit 8a207a3 · verified · 1 Parent(s): e3a3918

Update README.md

Files changed (1)
  1. README.md +44 -17
README.md CHANGED
@@ -1,22 +1,49 @@
  ---
- base_model: Vyvo/VyvoTTS-LFM2-Neuvillette
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - lfm2
- - trl
- license: apache-2.0
- language:
- - en
- ---

- # Uploaded model

- - **Developed by:** Praha-Labs
- - **License:** apache-2.0
- - **Finetuned from model :** Vyvo/VyvoTTS-LFM2-Neuvillette

- This lfm2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
+ ---
+ base_model:
+ - LiquidAI/LFM2-350M
+ tags:
+ - text-generation-inference
+ - transformers
+ - unsloth
+ - lfm2
+ - trl
+ license: apache-2.0
+ language:
+ - en
+ - ml
+ pipeline_tag: text-to-speech
+ ---
+
+ # Malayalam TTS Model (LFM2-350M Fine-tuned)
+
+ This repository contains a fine-tuned **Malayalam Text-to-Speech (TTS)** model based on **LFM2-350M**, trained using [VyvoTTS](https://github.com/Vyvo-Labs/VyvoTTS) (LLM-based TTS framework) and [Unsloth](https://github.com/unslothai/unsloth).
+
  ---
+ Malayalam TTS — 24 kHz (LLM + SNAC Codec)
+
+ High-quality Malayalam text-to-speech model targeting natural pronunciation and clean prosody at 24 kHz, using a discrete audio codec (SNAC 24 kHz) for waveform reconstruction. Designed for lightweight deployment (~350M parameters) with GPU/CPU support.
+
+ Status: v0.1 — stable inference, strong pronunciation, limited emotional expressiveness. Roadmap includes expressive styles and non‑verbal cues (laughter, giggles, breaths).
+
+ Highlights
+
+ Language: Malayalam (with support for basic English loanwords).
+
+ Sample Rate: 24 kHz, mono.
+
+ Codec: [SNAC 24 kHz] for fast decoding.
+
+ Model Size: ~350M parameters (small/efficient).

+ Strengths: Clear, non‑robotic pronunciation; punctuation‑aware phrasing.

+ Known Limits: Emotion range is narrow; limited style transfer; no speaker cloning in v0.1.

+ ## 📖 Model Details
+ - **Base Model:** LFM2-350M
+ - **Language:** Malayalam
+ - **Dataset:** [ai4bharat/rasa](https://huggingface.co/datasets/ai4bharat/rasa) (Malayalam subset)
+ - **Training:** 10 epochs, ~77k steps
+ - **Frameworks Used:** VyvoTTS, Unsloth

+ ---
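
Review note: the figures quoted in the new README (~350M parameters, 24 kHz mono output, 10 epochs over ~77k steps) can be turned into rough sizing numbers. The sketch below is only back-of-the-envelope arithmetic. The helper names are hypothetical and not part of VyvoTTS, Unsloth, or SNAC; the bytes-per-parameter values assume plain fp16 or fp32 weights; the steps-per-epoch figure assumes steps were spread evenly across epochs. It covers model weights only, not activations or the codec decoder.

```python
# Rough sizing arithmetic from the README's stated figures.
# All inputs are the approximate values quoted in the README; the helper
# names are illustrative assumptions, not an API of any library.

APPROX_PARAMS = 350e6      # "~350M parameters"
SAMPLE_RATE_HZ = 24_000    # "24 kHz, mono"
TOTAL_STEPS = 77_000       # "~77k steps"
EPOCHS = 10                # "10 epochs"


def weight_memory_mb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6


def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average optimizer steps per epoch, assuming an even split."""
    return total_steps / epochs


def samples_for_duration(seconds: float, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Number of mono PCM samples in a clip of the given length."""
    return int(round(seconds * sample_rate))


if __name__ == "__main__":
    print("fp16 weights (MB):", weight_memory_mb(APPROX_PARAMS, 2))
    print("fp32 weights (MB):", weight_memory_mb(APPROX_PARAMS, 4))
    print("steps per epoch:", steps_per_epoch(TOTAL_STEPS, EPOCHS))
    print("samples in 1.5 s:", samples_for_duration(1.5))
```

Under these assumptions the weights come to roughly 0.7 GB in fp16, which is consistent with the README's "lightweight deployment" claim, though actual memory use at inference time will be higher.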