Kailing-Leifang committed on
Commit d35a6a0 · verified · 1 Parent(s): fb33bb7

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ title: PersonaFlow
+ emoji: 🎭
+ colorFrom: indigo
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 5.9.1
+ app_file: app.py
+ pinned: false
+ hf_oauth: true
+ license: apache-2.0
+ short_description: Speak with AI characters that have distinct personalities
+ tags:
+ - voice
+ - audio
+ - tts
+ - stt
+ - character
+ - roleplay
+ ---
+
+ # 🎭 PersonaFlow
+
+ **Interactive Audio Character Demo** - Speak with AI characters that have distinct personalities, voices, and animated portraits.
+
+ ## Features
+
+ - **Voice Input**: Speak into your microphone (up to 10 seconds)
+ - **Multiple Characters**: Choose from 3 distinct personalities
+   - 🚀 **The Visionary** - Bold, dramatic, futuristic
+   - 🤔 **The Skeptic** - Dry, questioning, sardonic
+   - 🌟 **The Guide** - Warm, helpful, encouraging
+ - **Unique Voices**: Each character has a distinct voice
+ - **Animated Portraits**: Visual feedback with lip-sync animation
+ - **Conversation History**: Track your dialogue with each character
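
The character list above can be pictured as a small registry keyed by name. This is a minimal sketch only: the `Persona` class, the `voice` placeholders, and the wording of `system_prompt` are assumptions made for illustration and are not taken from `app.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """A character: display name, emoji, personality traits, and a voice id."""
    name: str
    emoji: str
    traits: tuple
    voice: str  # hypothetical placeholder, not a real TTS voice identifier

    def system_prompt(self) -> str:
        # Turn the trait list into a one-line instruction for the LLM.
        return f"You are {self.name}. Your tone is {', '.join(self.traits)}."


# Registry keyed by character name; traits are copied from the feature list.
PERSONAS = {
    p.name: p
    for p in (
        Persona("The Visionary", "🚀", ("bold", "dramatic", "futuristic"), "voice_a"),
        Persona("The Skeptic", "🤔", ("dry", "questioning", "sardonic"), "voice_b"),
        Persona("The Guide", "🌟", ("warm", "helpful", "encouraging"), "voice_c"),
    )
}
```

Keeping the traits and the voice in one record means that a character switch changes the prompt and the voice together.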
+
+ ## How It Works
+
+ 1. **Select a character** from the dropdown
+ 2. **Click the microphone** and speak your message
+ 3. **Listen** to the character's response with their unique voice
+ 4. **Continue** the conversation or switch characters
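
The four steps above amount to one speech-to-text, reply, text-to-speech turn per utterance. The sketch below shows that shape under stated assumptions: `run_turn` and its three callable parameters are hypothetical names, and the lambdas are stand-ins so the example runs without loading any model.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (user_text, reply_text) pairs


def run_turn(
    audio: bytes,
    persona: str,
    history: History,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str, History, str], str],
    synthesize: Callable[[str, str], bytes],
) -> bytes:
    """Run one voice turn: speech -> text -> reply -> speech."""
    user_text = transcribe(audio)                  # STT stage
    reply = generate(persona, history, user_text)  # LLM stage, persona-conditioned
    history.append((user_text, reply))             # record the exchange
    return synthesize(persona, reply)              # TTS stage, persona's voice


# One history per character, so switching characters keeps each dialogue separate.
histories = {"The Guide": [], "The Skeptic": []}

# Stand-in stages (assumptions): echo-style functions instead of real models.
out = run_turn(
    b"hello",
    "The Guide",
    histories["The Guide"],
    transcribe=lambda a: a.decode(),
    generate=lambda p, h, t: f"[{p}] you said: {t}",
    synthesize=lambda p, text: text.encode(),
)
```

Passing the stages in as callables keeps the turn logic testable and independent of whichever STT, LLM, and TTS backends are actually wired in.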
+
+ ## Technology
+
+ - **STT**: distil-whisper/distil-large-v3 (faster-whisper backend)
+ - **LLM**: Qwen/Qwen2.5-3B-Instruct
+ - **TTS**: Kokoro-82M with multiple voice options
+ - **Frontend**: Gradio with custom CSS animations
+
+ ## Latency Target
+
+ | Component | Target |
+ |-----------|--------|
+ | STT | <500ms |
+ | LLM | <400ms |
+ | TTS | <300ms |
+ | **Total** | **<1.5s** |
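
The three stage targets in the table sum to 1200 ms, which leaves 300 ms under the 1.5 s total, presumably for overhead outside the models. The sketch below does that arithmetic and times a single stage against its target. The `timed` helper and the constant names are hypothetical, and `str.upper` is only a stand-in for a real stage.

```python
import time

# Per-stage targets from the table, in milliseconds.
TARGETS_MS = {"STT": 500, "LLM": 400, "TTS": 300}
TOTAL_TARGET_MS = 1500

# Slack left for everything outside the three stages (audio I/O, network, UI).
HEADROOM_MS = TOTAL_TARGET_MS - sum(TARGETS_MS.values())


def timed(stage, fn, *args):
    """Call fn(*args) and return (result, elapsed_ms, within_target)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms, elapsed_ms < TARGETS_MS[stage]


# Stand-in for a real STT call, used only to exercise the helper.
result, elapsed_ms, ok = timed("STT", str.upper, "hello")
```

Wrapping each stage this way gives per-stage numbers that can be compared directly against the table.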
+
+ ---
+
+ Built for engaging 1-2 minute voice interactions with AI personalities.