Jekyll2000 committed on
Commit 8b6843e · verified · 1 Parent(s): 2fcd263

Update README.md

Files changed (1)
  1. README.md +14 -13
README.md CHANGED
@@ -1,42 +1,43 @@
1
  ---
2
  title: Haseeb's TTS
3
- emoji: 🎧
4
  colorFrom: indigo
5
  colorTo: purple
6
  sdk: streamlit
7
- sdk_version: 1.32.0
8
  python_version: '3.10'
9
  app_file: app.py
10
  pinned: false
11
  license: apache-2.0
12
  thumbnail: >-
13
- https://cdn-uploads.huggingface.co/production/uploads/652ac2e92aa5b27c77cba196/YwfGGlu6hJYzYGiwjPHbX.png
14
  ---
15
 
16
  # 🎧 Haseeb's TTS (Audiobook MP3 Generator)
17
 
18
  Generate audiobook-style narration using **Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice** with a Streamlit UI built for long chapters.
19
 
20
- ## Why Transformers from source?
21
- This model uses a newer architecture (`qwen3_tts`). If your Space installs an older Transformers release, it may fail with:
 
 
 
22
 
23
- > "Transformers does not recognize this architecture"
24
-
25
- To fix that, this Space installs **Transformers from GitHub** (latest) and uses `trust_remote_code=True`.
26
 
27
  ## Features
28
  - ✅ **MP3 output** (no ffmpeg needed)
29
  - ✅ **Batch mode**: upload multiple `.txt` files → get multiple MP3s + **ZIP download**
30
  - ✅ **Long chapters (10,000+ chars)** via chunking + stitching
31
- - ✅ **Language Support** (dropdown steering)
32
- - ✅ **Voices / Speakers** (auto-detected if exposed + custom speaker field)
33
- - ✅ **Instruction Control** (style/emotion/pacing prompt)
34
 
35
  ## How to use
36
 
37
  ### Single chapter
38
  1. Paste text (or upload a single `.txt`)
39
- 2. Pick language, voice/speaker (optional), instruction
40
  3. Click **Generate MP3**
41
 
42
  ### Batch mode
@@ -47,7 +48,7 @@ To fix that, this Space installs **Transformers from GitHub** (latest) and uses
47
 
48
  ## Tips for audiobooks
49
  - Chunk size: **1200–1800 chars** is usually stable for long narration.
50
- - Add silence between chunks: **200–350 ms** reduces audible joins.
51
  - If memory is tight, reduce:
52
  - chunk size
53
  - `max_new_tokens`
 
1
  ---
2
  title: Haseeb's TTS
3
+ emoji: 🚀
4
  colorFrom: indigo
5
  colorTo: purple
6
  sdk: streamlit
7
+ sdk_version: 1.54.0
8
  python_version: '3.10'
9
  app_file: app.py
10
  pinned: false
11
  license: apache-2.0
12
  thumbnail: >-
13
+ https://cdn-uploads.huggingface.co/production/uploads/652ac2e92aa5b27c77cba196/6Y7vGO0SQfVaCj9CYXzzf.png
14
  ---
15
 
16
  # 🎧 Haseeb's TTS (Audiobook MP3 Generator)
17
 
18
  Generate audiobook-style narration using **Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice** with a Streamlit UI built for long chapters.
19
 
20
+ ## Why `qwen-tts` instead of `transformers.pipeline()`?
21
+ The model uses the `qwen3_tts` architecture. Some Transformers builds in hosted environments may not recognize it.
22
+ This Space uses Qwen’s official **`qwen-tts`** package which supports:
23
+ - `generate_custom_voice(text, language, speaker, instruct, ...)`
24
+ - `get_supported_speakers()` / `get_supported_languages()`
25
 
26
+ (As shown in Qwen’s official Qwen3-TTS repo docs.)
 
 
27
 
28
  ## Features
29
  - ✅ **MP3 output** (no ffmpeg needed)
30
  - ✅ **Batch mode**: upload multiple `.txt` files → get multiple MP3s + **ZIP download**
31
  - ✅ **Long chapters (10,000+ chars)** via chunking + stitching
32
+ - ✅ **Language Support** (dropdown; auto-populated from the model when possible)
33
+ - ✅ **Voices / Speakers** (auto-populated from the model when possible)
34
+ - ✅ **Instruction Control** (style/emotion/pacing)
35
 
36
  ## How to use
37
 
38
  ### Single chapter
39
  1. Paste text (or upload a single `.txt`)
40
+ 2. Choose language, speaker, instruction
41
  3. Click **Generate MP3**
42
 
43
  ### Batch mode
 
48
 
49
  ## Tips for audiobooks
50
  - Chunk size: **1200–1800 chars** is usually stable for long narration.
51
+ - Silence between chunks: **200–350 ms** reduces audible joins.
52
  - If memory is tight, reduce:
53
  - chunk size
54
  - `max_new_tokens`
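
The README's "chunking + stitching" and the chunk-size and silence tips can be illustrated with a minimal sketch. This is not the code in `app.py`, which the diff does not show. The function names `split_into_chunks` and `stitch_with_silence` are hypothetical, audio is assumed to be a plain list of float samples per chunk, and the defaults (1500 chars, 250 ms) are simply picked from inside the ranges given in the tips.

```python
import re


def split_into_chunks(text: str, max_chars: int = 1500) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def stitch_with_silence(
    segments: list[list[float]], sample_rate: int, gap_ms: int = 250
) -> list[float]:
    """Concatenate audio segments, inserting gap_ms of silence between neighbours."""
    gap = [0.0] * (sample_rate * gap_ms // 1000)
    stitched: list[float] = []
    for index, segment in enumerate(segments):
        if index > 0:
            stitched.extend(gap)
        stitched.extend(segment)
    return stitched
```

In this sketch each chunk would be synthesized separately and the resulting sample lists passed to `stitch_with_silence` with the model's output sample rate. Packing whole sentences keeps joins at natural pauses, which is where a short silence is least noticeable.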