austinekurian committed on
Commit
a31c6d0
·
verified ·
1 Parent(s): ea79f5f

Update README.md

  1. README.md +38 -41
README.md CHANGED
@@ -1,45 +1,42 @@
1
- # മലയാളം Text → AI Voice (Free)
2
-
3
- A free web app (Hugging Face Space, Gradio) that converts **Malayalam** text to speech using the **AI4Bharat VITS** model.
4
-
5
- ## How it works
6
- - Loads the multi‑lingual Indian **VITS TTS** model `ai4bharat/vits_rasa_13`, which includes **Malayalam** voices and multiple **styles** (NEWS, BOOK, etc.).
7
- - Renders a simple Gradio UI: paste Malayalam text → click **Generate** → download audio.
8
-
9
- > Model reference: AI4Bharat VITS model with Malayalam support and style/speaker IDs.
10
- > Piper/Sherpa‑ONNX alternative for Malayalam also exists (`ml_IN-arjun`), if you prefer an ONNX path.
11
-
12
- ## Deploy (Hugging Face Spaces)
13
- 1. Create a new Space → **Gradio**.
14
- 2. Upload these files: `app.py`, `requirements.txt`, `README.md`.
15
- 3. The Space will build and start automatically.
16
- 4. Share the public URL.
17
 
18
- ## Usage
19
- - Default speaker is **MAL_F (11)**.
20
- - Try styles like **NEWS (10)** for crisp reading, **BOOK (3)** for long‑form, **ALEXA (0)** for neutral.
21
-
22
- ## Local run (optional)
23
- ```bash
24
- python -m venv .venv && source .venv/bin/activate
25
- pip install -r requirements.txt
26
- python app.py
27
- ```
28
-
29
- ## Licensing
30
- - App code: MIT (see below).
31
- - **Model license**: please review the license on the model page before commercial use.
32
-
33
- ### MIT License (app code)
34
- Copyright (c) 2025
35
- Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction...
36
- ```
37
- (standard MIT terms)
38
- ```
39
 
 
40
 
41
- ## New features
42
- - **Prosody sliders:** speaking rate (0.5–1.5) & pitch (−4…+4 semitones). Implemented via resampling (approximate).
43
- - **Batch paragraphs:** split on blank lines → one file per paragraph × style.
44
- - **MP3 alongside WAV:** via `pydub` + ffmpeg (present on Spaces). Falls back to WAV if MP3 fails.
45
 
1
 
2
+ ---
3
+ title: Malayalam Text AI Voice (Free)
4
+ emoji: 🗣️
5
+ colorFrom: indigo
6
+ colorTo: green
7
+ sdk: gradio
8
+ sdk_version: 4.44.0
9
+ app_file: app.py
10
+ python_version: "3.10"
11
+ runtime: python
12
+ pinned: false
13
+ license: mit
14
+ models:
15
+ - ai4bharat/vits_rasa_13
16
+ ---
17
 
18
+ # മലയാളം Text → AI Voice (Free)
19
 
20
+ A free web app (Hugging Face Space, Gradio) that converts **Malayalam** text to speech using the **AI4Bharat VITS** model.
21
 
22
+ ## Features
23
+ - **Multiple voice styles** (ALEXA, NEWS, BOOK, etc.)
24
+ - **Prosody controls**: Speaking rate & pitch (approximate via resampling)
25
+ - **Batch paragraphs**: Split text by blank line → one file per paragraph × style
26
+ - **WAV + MP3** output (MP3 requires `ffmpeg`)
27
+
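The batch behaviour listed above (split on blank lines, then one file per paragraph × style) can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the helper names `split_paragraphs` and `plan_outputs` and the `paraNN_STYLE.wav` naming scheme are assumptions made for the example.

```python
import re
from itertools import product


def split_paragraphs(text: str) -> list[str]:
    """Split text on one or more blank lines and drop empty chunks."""
    chunks = re.split(r"\n\s*\n", text.strip())
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def plan_outputs(text: str, styles: list[str], ext: str = "wav"):
    """Return one (filename, paragraph, style) job per paragraph x style.

    The filename pattern is hypothetical; the real app may name files differently.
    """
    paragraphs = split_paragraphs(text)
    return [
        (f"para{index:02d}_{style}.{ext}", paragraph, style)
        for (index, paragraph), style in product(
            enumerate(paragraphs, start=1), styles
        )
    ]


if __name__ == "__main__":
    for job in plan_outputs("First paragraph.\n\nSecond paragraph.", ["NEWS", "BOOK"]):
        print(job)
```

Two paragraphs and two styles yield four jobs, so the number of generated files grows as paragraphs × styles.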
28
+ ## Deploy / Run
29
+ 1. Ensure the files below are present in the repository:
30
+ - `app.py`
31
+ - `requirements.txt`
32
+ - `packages.txt` *(contains `ffmpeg` for MP3)*
33
+ - `LICENSE`
34
+ 2. Accept the access terms for the gated model **ai4bharat/vits_rasa_13** on its model page (click “Access repository / Agree”).
35
+ 3. If you still get permission errors, add a read token as a Space secret:
36
+ - **Settings → Variables and secrets → New secret**
37
+ - Name: `HF_TOKEN` | Value: your Hugging Face read token
38
+ 4. Restart the Space.
39
+
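Space secrets are exposed to the running app as environment variables, so the `HF_TOKEN` secret from step 3 can be read with `os.environ`. The sketch below shows one way the token might be passed when downloading the gated model. How `app.py` actually loads the model is not shown in this commit, so `get_hf_token` and `fetch_model` are hypothetical helpers, and the use of `huggingface_hub.snapshot_download` is an assumption.

```python
import os

MODEL_ID = "ai4bharat/vits_rasa_13"


def get_hf_token(env=None):
    """Return the HF_TOKEN value, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    return token or None


def fetch_model(model_id: str = MODEL_ID) -> str:
    """Download the model snapshot and return its local path.

    Assumes the third-party `huggingface_hub` package is installed.
    With token=None the library falls back to any locally stored login.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=model_id, token=get_hf_token())


if __name__ == "__main__":
    print("Token configured:", get_hf_token() is not None)
```

If the token is missing or the access terms in step 2 were not accepted, the download of a gated repository is expected to fail with an authorization error.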
40
+ ## Notes
41
+ - Prosody controls are approximate (client-side resampling). For true SSML prosody, consider the Azure AI Speech Malayalam neural voice (`ml-IN-SobhanaNeural`).
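To show why resampling-based prosody is only approximate, here is a minimal pure-Python sketch. It is not the implementation in `app.py`: the function names are hypothetical, and linear interpolation stands in for whatever resampler the app uses. A pitch shift of `s` semitones corresponds to a frequency ratio of `2 ** (s / 12)`, and because resampling reads the waveform faster or slower, pitch and duration change together.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Convert a pitch shift in semitones to a frequency ratio."""
    return 2.0 ** (semitones / 12.0)


def resample_linear(samples: list[float], factor: float) -> list[float]:
    """Read `samples` `factor` times faster using linear interpolation.

    factor > 1 shortens the audio (faster, higher pitched at playback);
    factor < 1 lengthens it (slower, lower pitched).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n = len(samples)
    if n == 0:
        return []
    out_len = max(1, int(round(n / factor)))
    out = []
    for i in range(out_len):
        pos = min(i * factor, n - 1)  # clamp to the last input sample
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def apply_prosody(samples, rate=1.0, pitch_semitones=0.0):
    """Approximate rate and pitch control with a single resampling pass.

    Both controls fold into one factor, so raising pitch also shortens
    the audio. This coupling is why the result is only approximate.
    """
    return resample_linear(samples, rate * semitones_to_ratio(pitch_semitones))


if __name__ == "__main__":
    tone = [0.0, 1.0, 2.0, 3.0]
    print(apply_prosody(tone, rate=2.0))  # half as many samples
```

Independent control of pitch and tempo needs a time-stretching method or an engine with native prosody support, which is the trade-off the note above refers to.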