okoliechykwuka committed on
Commit 9f031f6 · 1 Parent(s): 6ef4e14

Deploy 9jaLingo bot Docker Space

Files changed (13)
  1. .dockerignore +14 -0
  2. .gitignore +51 -0
  3. Dockerfile +27 -0
  4. README.md +174 -11
  5. data/faq.json +646 -0
  6. main.py +118 -0
  7. pyproject.toml +25 -0
  8. requirements.txt +7 -0
  9. src/__init__.py +0 -0
  10. src/chat_service.py +106 -0
  11. src/chatbot.py +58 -0
  12. src/ingest.py +87 -0
  13. uv.lock +0 -0
.dockerignore ADDED
@@ -0,0 +1,14 @@
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ env
+ venv
+ .venv
+ .env
+ .git
+ .gitignore
+ .pytest_cache
+ .coverage
+ __pycache__
+ data/chroma_db/
.gitignore ADDED
@@ -0,0 +1,51 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Virtual Environment
+ venv/
+ env/
+ ENV/
+
+ # Environment Variables
+ .env
+ .env.local
+ .env.*.local
+
+ # IDE
+ .idea/
+ .vscode/
+ *.swp
+ *.swo
+
+ # Logs
+ *.log
+ logs/
+ chroma_operations.log
+
+ # Operating System
+ .DS_Store
+ Thumbs.db
+
+ # Project-specific
+ test.ipynb
+ .ipynb_checkpoints/
+ data/chroma_db/
Dockerfile ADDED
@@ -0,0 +1,27 @@
+ FROM python:3.12-slim
+
+ # Hugging Face Spaces runs nicely with uid 1000
+ RUN useradd -m -u 1000 user
+
+ ENV PYTHONDONTWRITEBYTECODE=1 \
+ PYTHONUNBUFFERED=1 \
+ PIP_NO_CACHE_DIR=1 \
+ PATH="/home/user/.local/bin:$PATH"
+
+ WORKDIR /app
+
+ RUN apt-get update \
+ && apt-get install -y --no-install-recommends build-essential \
+ && rm -rf /var/lib/apt/lists/*
+
+ COPY --chown=user requirements.txt /app/requirements.txt
+
+ USER user
+ RUN pip install --user --upgrade pip \
+ && pip install --user -r /app/requirements.txt
+
+ COPY --chown=user . /app
+
+ # Docker Spaces must listen on 7860
+ EXPOSE 7860
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,11 +1,174 @@
- ---
- title: Chatbot
- emoji: 🌍
- colorFrom: purple
- colorTo: pink
- sdk: docker
- pinned: false
- short_description: A RAG-based customer support assistant for the 9jaLingoAI
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 9jaLingo Bot
+
+ A minimal RAG-based customer support assistant for the 9jaLingo Voice AI platform, built with FastAPI, Chroma, and Ollama.
+
+ The bot is designed to answer user questions about 9jaLingo products and workflows, including Text-to-Speech (TTS), Speech-to-Text (STT), Voice Cloning, Voice Over production, API usage, and support operations.
+
+ ## Features
+
+ - Intelligent support chat for 9jaLingo platform questions
+ - Retrieval-augmented responses using Chroma vector database
+ - Local embeddings with Ollama
+ - Conversation memory by `thread_id`
+ - FastAPI backend with `/chat` and `/stream` endpoints
+
+ ## Core Platform Coverage
+
+ The support bot FAQ and retrieval context includes answers for:
+
+ - Product overview and account onboarding
+ - TTS voices, languages, and usage patterns
+ - STT transcription workflows and output formats
+ - Voice cloning requirements and best practices
+ - Voice over workflows for creators and agencies
+ - API authentication, request patterns, and integration guidance
+ - Billing, quotas, and usage troubleshooting
+ - Support and escalation guidance
+
+ ## Prerequisites
+
+ - Python 3.12+
+ - uv (recommended package manager)
+ - Ollama installed locally
+ - Ollama model pulled locally:
+ - `embeddinggemma`
+ - API keys:
+ - Optional API keys only if your chosen Ollama setup requires them
+
+ ## Installation
+
+ 1. Clone repo and enter bot folder:
+ ```bash
+ cd 9jalingo_bot
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ uv sync
+ ```
+
+ 3. Configure environment variables in `.env`:
+ ```env
+ GOOGLE_API_KEY=your_google_api_key
+ TAVILY_API_KEY=your_tavily_api_key
+ OLLAMA_BASE_URL=http://localhost:11434
+ OLLAMA_EMBEDDING_MODEL=embeddinggemma
+ ```
+
+ 4. Confirm Ollama models are available:
+ ```bash
+ ollama list
+ ```
+
+ ## Build Vector Database
+
+ From the project root (`9jalingo_bot`), run your vector DB bootstrap flow (if needed) so `data/faq.json` is indexed into Chroma.
+
+ Chroma persists locally under:
+
+ - `data/chroma_db/`
+
+ ## Run API
+
+ ```bash
+ uv run uvicorn main:app --reload --host 0.0.0.0 --port 8000
+ ```
+
+ ## API Endpoints
+
+ ### Health
+
+ ```http
+ GET /health
+ ```
+
+ ### Chat
+
+ ```http
+ POST /chat
+ ```
+
+ Request body:
+
+ ```json
+ {
+ "message": "How do I start with voice cloning on 9jaLingo?",
+ "thread_id": "support-user-42"
+ }
+ ```
+
+ ### Stream
+
+ ```http
+ POST /stream
+ ```
+
+ ## Project Structure
+
+ ```text
+ 9jalingo_bot/
+ ├── data/
+ │ └── faq.json
+ ├── src/
+ │ ├── chat_service.py
+ │ ├── chatbot.py
+ │ └── ingest.py
+ ├── rag/
+ │ ├── data/
+ │ └── chroma_db/
+ ├── main.py
+ ├── pyproject.toml
+ └── Readme.md
+ ```
+
+ ## Notes
+
+ - Embeddings and chat generation are handled through Ollama-backed components.
+ - The API uses the FAQ file in `data/faq.json` as the RAG knowledge source.
+ - Memory is kept in-process and keyed by `thread_id`.
+
+ ## Deploy to Hugging Face (Docker Space)
+
+ This project is Docker-based and currently installs Python dependencies from `requirements.txt` in the Dockerfile.
+
+ 1. Clone your Space repo:
+
+ ```bash
+ git clone https://huggingface.co/spaces/9jaLingo/chatbot
+ cd chatbot
+ ```
+
+ When prompted for password, use a Hugging Face access token with write permission.
+
+ 2. Install Hugging Face CLI with uv:
+
+ ```bash
+ uv tool install hf
+ ```
+
+ 3. (Optional) Verify/download Space files:
+
+ ```bash
+ hf download 9jaLingo/chatbot --repo-type=space
+ ```
+
+ 4. Copy this app into the Space repo root (important files):
+
+ - `Dockerfile`
+ - `requirements.txt`
+ - `main.py`
+ - `src/`
+ - `data/`
+ - `.dockerignore`
+
+ 5. Commit and push:
+
+ ```bash
+ git add Dockerfile requirements.txt main.py src data .dockerignore Readme.md
+ git commit -m "Deploy 9jaLingo bot Docker Space"
+ git push
+ ```
+
+ 6. Hugging Face Docker Space requirement:
+
+ - The app must listen on port `7860` (already set in `Dockerfile`).
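The README in this commit documents `POST /chat` with a JSON body containing `message` and `thread_id`. A minimal client sketch for that request, using only the Python standard library, might look like the following. The base URL, the helper names `build_chat_request` and `send_chat_request`, and the assumption that the endpoint replies with JSON are all illustrative and not taken from the commit; the README's local command uses port 8000, while the Docker Space listens on 7860.

```python
import json
import urllib.request

# Assumption: local run on port 8000 as in the README's "Run API" command.
# On the Docker Space the app listens on 7860 instead.
BASE_URL = "http://localhost:8000"


def build_chat_request(
    message: str, thread_id: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request with the documented body."""
    payload = json.dumps({"message": message, "thread_id": thread_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the reply.

    Assumption: the endpoint returns a JSON object; the response schema is
    not shown in the README, so callers should inspect the result.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires a running server):
#   req = build_chat_request(
#       "How do I start with voice cloning on 9jaLingo?", "support-user-42"
#   )
#   print(send_chat_request(req))
```

Reusing the same `thread_id` across calls is what lets the service attach conversation memory to a user, since the README notes that memory is kept in-process and keyed by `thread_id`.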
data/faq.json ADDED
@@ -0,0 +1,646 @@
+ [
+ {
+ "question": "What is 9jaLingo platform?",
+ "answer": "9jaLingo is a voice AI platform for African language speech products, including Text-to-Speech (TTS), Speech-to-Text (STT), voice cloning, and voice-over workflows."
+ },
+ {
+ "question": "Who is 9jaLingo built for?",
+ "answer": "9jaLingo is built for developers, content creators, businesses, and teams that need realistic, localized voice experiences."
+ },
+ {
+ "question": "Which languages does 9jaLingo support?",
+ "answer": "9jaLingo supports voices across Hausa, Igbo, Yoruba, Nigerian Pidgin, and Nigerian English, with continued expansion."
+ },
+ {
+ "question": "Can I use 9jaLingo for production workloads?",
+ "answer": "Yes. 9jaLingo is designed for production use cases with API access, scalable workflows, and support options."
+ },
+ {
+ "question": "What core services are available on 9jaLingo?",
+ "answer": "Core services include Text-to-Speech, Speech-to-Text, voice cloning, and voice-over generation for media and product teams."
+ },
+ {
+ "question": "What is Text-to-Speech on 9jaLingo?",
+ "answer": "Text-to-Speech converts input text into natural-sounding speech using selected speakers, accents, and style settings."
+ },
+ {
+ "question": "Can I choose different speakers for TTS?",
+ "answer": "Yes. You can choose from many speakers and test which voice fits your brand story, app tone, or campaign style."
+ },
+ {
+ "question": "Can I generate Yoruba TTS audio?",
+ "answer": "Yes. You can generate Yoruba speech where supported by choosing a matching voice and submitting your text."
+ },
+ {
+ "question": "What audio formats does TTS support?",
+ "answer": "TTS output can be exported in WAV, PCM, MP3, FLAC, AAC, ALAC, and OGG, depending on endpoint and configuration."
+ },
+ {
+ "question": "Is TTS good for IVR and call prompts?",
+ "answer": "Yes. Many teams use 9jaLingo TTS for IVR prompts, virtual agents, onboarding flows, and customer help lines."
+ },
+ {
+ "question": "How do I optimize TTS output quality?",
+ "answer": "Use clean punctuation, shorter sentence blocks, language-appropriate text, and test multiple voices before final export."
+ },
+ {
+ "question": "Is there a TTS API for developers?",
+ "answer": "Yes. 9jaLingo provides API endpoints so developers can generate speech from text in automated pipelines."
+ },
+ {
+ "question": "Can I stream TTS audio in real time?",
+ "answer": "Streaming support depends on endpoint configuration and your plan, but low-latency responses are supported in many flows."
+ },
+ {
+ "question": "How do I authenticate TTS API requests?",
+ "answer": "Authenticate requests with your API key or token according to the platform docs, and keep credentials secure."
+ },
+ {
+ "question": "Can I batch-generate many TTS files?",
+ "answer": "Yes. You can queue batch text inputs and generate multiple files for campaigns, lessons, narration, and customer messaging."
+ },
+ {
+ "question": "Can TTS be integrated with microservices?",
+ "answer": "Yes. TTS can be integrated with microservices through standard HTTP API calls and background job handling."
+ },
+ {
+ "question": "What is best practice for TTS retries?",
+ "answer": "Use idempotent request design, retries with backoff, and request IDs so your app can recover without duplicate side effects."
+ },
+ {
+ "question": "What is Speech-to-Text on 9jaLingo?",
+ "answer": "Speech-to-Text converts spoken audio into written text for search, analytics, support, and workflow automation."
+ },
+ {
+ "question": "Can STT transcribe Igbo?",
+ "answer": "STT supports multiple local language scenarios. Select the closest language profile for best performance."
+ },
+ {
+ "question": "What file types are accepted for STT?",
+ "answer": "Common audio formats such as WAV, MP3, and M4A are generally supported, depending on endpoint settings."
+ },
+ {
+ "question": "Can STT handle long recordings?",
+ "answer": "Yes. Long recordings can be processed with chunking, async jobs, and post-merge strategies for reliability."
+ },
+ {
+ "question": "Can I use STT for call center recordings?",
+ "answer": "Yes. STT is commonly used to transcribe support calls and power QA, search, and customer insight analysis."
+ },
+ {
+ "question": "How do I improve STT transcription quality?",
+ "answer": "Use clear audio, reduce background noise, choose the correct language setting, and normalize volume before upload."
+ },
+ {
+ "question": "Can STT output timestamps?",
+ "answer": "Timestamp support may be available by endpoint and is useful for subtitles, editing, and media indexing workflows."
+ },
+ {
+ "question": "Can I build subtitles with 9jaLingo STT?",
+ "answer": "Yes. STT transcripts can be transformed into subtitle files for video publishing and accessibility."
+ },
+ {
+ "question": "Can STT detect multiple speakers?",
+ "answer": "Speaker handling capabilities vary by model flow and may require post-processing for best diarization output."
+ },
+ {
+ "question": "Can I process STT in near real time?",
+ "answer": "Yes. Near-real-time patterns are possible with streaming ingestion and incremental transcript handling."
+ },
+ {
+ "question": "Can STT be used for meeting notes?",
+ "answer": "Yes. Teams use STT to capture meetings, generate summaries, and improve documentation speed."
+ },
+ {
+ "question": "How should I store STT transcripts?",
+ "answer": "Store transcripts with metadata such as language, source ID, and timestamp, then index them for search and audit needs."
+ },
+ {
+ "question": "What is voice cloning on 9jaLingo?",
+ "answer": "Voice cloning creates a synthetic voice that matches a target speaker's tone, style, and cadence from approved sample audio."
+ },
+ {
+ "question": "Who can use voice cloning?",
+ "answer": "Creators, studios, brands, and product teams can use voice cloning after meeting consent and policy requirements."
+ },
+ {
+ "question": "Do I need permission to clone a voice?",
+ "answer": "Yes. Explicit rights and consent are required before cloning any voice, and unauthorized use is not allowed."
+ },
+ {
+ "question": "How much audio is needed for cloning?",
+ "answer": "Higher-quality and longer clean recordings usually improve clone quality, while short noisy samples reduce accuracy."
+ },
+ {
+ "question": "Can cloned voices be used for ads?",
+ "answer": "Yes. When licensing and consent terms are satisfied, cloned voices can be used in campaigns and branded content."
+ },
+ {
+ "question": "How do I get better clone quality?",
+ "answer": "Use studio-quality recordings, stable microphone distance, low noise, and consistent speaking pace in source material."
+ },
+ {
+ "question": "What is 9jaLingo voice-over service?",
+ "answer": "Voice-over service helps creators and teams produce polished narration for videos, ads, podcasts, explainers, and product content."
+ },
+ {
+ "question": "Can content creators use 9jaLingo for social videos?",
+ "answer": "Yes. Creators can generate consistent voice-overs quickly for shorts, reels, tutorials, and educational content."
+ },
+ {
+ "question": "Can agencies produce multilingual voice campaigns?",
+ "answer": "Yes. Agencies can produce voice tracks across supported languages and test voice options for each market."
+ },
+ {
+ "question": "Can I update script lines without re-recording everything?",
+ "answer": "Yes. You can regenerate only changed lines and keep production workflows fast and cost-efficient."
+ },
+ {
+ "question": "Can I build brand voice identity with 9jaLingo?",
+ "answer": "Yes. Teams can select or clone voices to maintain a consistent audio brand across channels."
+ },
+ {
+ "question": "Does 9jaLingo provide developer API access?",
+ "answer": "Yes. 9jaLingo offers developer APIs for TTS, STT, voice cloning, and related voice workflows."
+ },
+ {
+ "question": "Can I call 9jaLingo from Node.js?",
+ "answer": "Yes. You can call endpoints from Node.js environments using standard authentication and JSON payloads."
+ },
+ {
+ "question": "How do I protect my API key?",
+ "answer": "Store keys in secure environment variables, rotate regularly, and never expose secrets in client-side code."
+ },
+ {
+ "question": "Are webhooks available for async jobs?",
+ "answer": "Webhook-style completion patterns can be implemented for long-running tasks to improve app responsiveness."
+ },
+ {
+ "question": "Can I test endpoints before production?",
+ "answer": "Yes. Validate in staging with representative payloads and monitor response times before full rollout."
+ },
+ {
+ "question": "What API reliability practice is recommended?",
+ "answer": "Use retries, timeout controls, request IDs, monitoring, and fallback responses to keep user experience stable."
+ },
+ {
+ "question": "How does 9jaLingo pricing generally work?",
+ "answer": "Pricing depends on workload type, such as character count, audio duration, model usage, and selected service tier."
+ },
+ {
+ "question": "Can I track usage for cost control?",
+ "answer": "Yes. Usage tracking and reporting help teams monitor spend and optimize request patterns."
+ },
+ {
+ "question": "Are there limits on API requests?",
+ "answer": "Rate limits may apply by plan and help protect service reliability for all platform users."
+ },
+ {
+ "question": "Can I upgrade plans when usage grows?",
+ "answer": "Yes. Plans can be adjusted as product adoption increases or enterprise requirements expand."
+ },
+ {
+ "question": "Does 9jaLingo support enterprise security expectations?",
+ "answer": "Enterprise security practices are supported through secure access controls, operational monitoring, and policy-based workflows."
+ },
+ {
+ "question": "What governance checks should teams do before launch?",
+ "answer": "Teams should verify consent rights, data handling, retention policy, access control, and audit requirements before go-live."
+ },
+ {
+ "question": "How do I contact 9jaLingo support?",
+ "answer": "Use the official support channels on the product website or contact email to open a ticket."
+ },
+ {
+ "question": "What information should I send in a support ticket?",
+ "answer": "Include request ID, timestamp, endpoint, payload summary, expected output, and the exact error message."
+ },
+ {
+ "question": "Why is my audio generation taking longer than expected?",
+ "answer": "Long text, high concurrency, or temporary load can increase latency, so use async flow and retries."
+ },
+ {
+ "question": "What should I do when STT output is inaccurate?",
+ "answer": "Check audio quality, language setting, and speaking clarity, then retry with improved input audio."
+ },
+ {
+ "question": "What should I do when voice clone quality is weak?",
+ "answer": "Upload cleaner source audio with less noise and more consistent speech, then retrain or regenerate."
+ },
+ {
+ "question": "Can support help with integration debugging?",
+ "answer": "Yes. Support can guide endpoint usage, auth setup, and request validation for faster integration fixes."
+ },
+ {
+ "question": "How do I report a bug on the platform?",
+ "answer": "Submit reproducible steps, environment details, and logs so the team can investigate quickly."
+ },
+ {
+ "question": "Can I request a new language or voice?",
+ "answer": "Yes. You can submit feature requests and the team can review demand and roadmap fit."
+ },
+ {
+ "question": "Does 9jaLingo provide onboarding help for teams?",
+ "answer": "Yes. Onboarding support can include setup guidance, API walkthroughs, and best-practice recommendations."
+ },
+ {
+ "question": "Can I get help choosing between TTS, STT, and voice cloning?",
+ "answer": "Yes. Support can map your use case to the best workflow and propose phased implementation."
+ },
+ {
+ "question": "What is the credit system in 9jaLingo?",
+ "answer": "9jaLingo uses a credit-based system where $1 USD equals 1,000 credits, and 1 credit is worth $0.001."
+ },
+ {
+ "question": "How much does standard TTS cost per character?",
+ "answer": "Standard TTS costs 0.05 credits per character, which equals $0.05 per 1,000 characters."
+ },
+ {
+ "question": "What are the available subscription tiers?",
+ "answer": "There are three tiers: Starter (free), PAYG Lite at $10/month, and PAYG Pro at $50/month."
+ },
+ {
+ "question": "How many credits does the free Starter plan include?",
+ "answer": "The Starter plan includes 2,000 credits per month at no cost."
+ },
+ {
+ "question": "How many credits does the Lite plan include?",
+ "answer": "The PAYG Lite plan at $10/month includes 10,000 credits."
+ },
+ {
+ "question": "How many credits does the Pro plan include?",
+ "answer": "The PAYG Pro plan at $50/month includes 60,000 credits, which is 6 times more than Lite for only 5 times the price."
+ },
+ {
+ "question": "What is the credit-to-USD conversion rate?",
+ "answer": "1 credit = $0.001 USD, and $1 USD = 1,000 credits."
+ },
+ {
+ "question": "How are credits deducted for TTS generation?",
+ "answer": "Credits are deducted using the formula: credits_charged = character_count \u00d7 rate_per_char. Deduction happens before synthesis begins."
+ },
+ {
+ "question": "What happens if a user runs out of credits mid-request?",
+ "answer": "The API returns an HTTP 402 error if the user's balance is insufficient before synthesis starts."
+ },
+ {
+ "question": "Are top-up packages available and at what rate?",
+ "answer": "Yes, top-up packages are available at a uniform rate of $1 = 1,000 credits across all tiers, from $2 up to $100."
+ },
+ {
+ "question": "Do Pro plan users get a better top-up rate?",
+ "answer": "No, top-up packages use the same uniform rate for all plan tiers. Pro users benefit from their larger monthly credit allocation instead."
+ },
+ {
+ "question": "What is the credit rollover policy per plan?",
+ "answer": "Starter credits roll over for 30 days, Lite for 60 days, and Pro for 90 days. Top-up credits inherit the active plan's expiry window."
+ },
+ {
+ "question": "What are the API rate limits per plan?",
+ "answer": "Starter is limited to 5 requests per hour, Lite to 60 per hour, and Pro to 300 per hour."
+ },
+ {
+ "question": "How is TTS pricing exposed in the API versus the dashboard?",
+ "answer": "TTS is billed per character internally but exposed as tokens to API developers, where 1 NLP token equals approximately 4 characters. Both metrics appear in the dashboard."
+ },
+ {
+ "question": "What is the token rate for standard TTS in API terms?",
+ "answer": "Standard TTS costs 0.20 credits per token, which equals $0.20 per 1,000 tokens."
+ },
+ {
+ "question": "How much does it cost to generate a 2-minute news bulletin?",
+ "answer": "A 2-minute news bulletin is approximately 1,800 characters, costing 90 credits or $0.09 using standard TTS."
+ },
+ {
+ "question": "What does generating a full 10-hour audiobook cost?",
+ "answer": "A 10-hour audiobook is approximately 540,000 characters, costing around 27,000 credits or $27.00 using standard TTS."
+ },
+ {
+ "question": "Is the free Starter plan suitable for commercial use?",
+ "answer": "No. Free Starter plan audio output is watermarked and not licensed for commercial use. Upgrading to Lite or Pro removes the watermark and enables commercial rights."
+ },
+ {
+ "question": "What does voice cloning cost?",
+ "answer": "Voice cloning has a one-time training fee and an ongoing per-character generation rate charged each time the cloned voice is used for TTS."
+ },
+ {
+ "question": "Is voice cloning available on the Starter plan?",
+ "answer": "No, voice cloning is not available on the free Starter plan. It is available from the Lite plan upward."
+ },
+ {
+ "question": "How is audio processing for voice cloning charged?",
+ "answer": "A one-time non-refundable audio processing fee is charged at upload, ranging from free for audio under 30 seconds up to 80 credits for batches over 30 minutes."
+ },
+ {
+ "question": "What audio duration is free to process for voice cloning?",
+ "answer": "Audio up to 30 seconds is processed for free. Any audio longer than 30 seconds incurs a processing fee."
+ },
+ {
+ "question": "Are voice cloning processing fees refundable?",
+ "answer": "No, audio processing fees are non-refundable once the upload begins. Users must be informed before uploading."
+ },
+ {
+ "question": "How should the dashboard display credit usage?",
+ "answer": "The dashboard must always display both credits and the USD equivalent side by side, for example showing '50 cr ($0.05 USD)'."
+ },
+ {
+ "question": "What is the per-request character limit to prevent abuse?",
+ "answer": "A hard cap of 50,000 characters per API request is recommended to prevent runaway credit drain from malformed calls."
+ },
+ {
+ "question": "How does 9jaLingo pricing compare to competitors for standard TTS?",
+ "answer": "At $50 per million characters, 9jaLingo is positioned as a competitive mid-tier option, above Google Basic at $4/M but in line with other quality providers."
+ },
+ {
+ "question": "What is the infrastructure cost basis for 9jaLingo pricing?",
+ "answer": "Infrastructure runs on RunPod RTX 6000 Ada at approximately $2.60\u2013$3.00 per million characters, with pricing set to achieve a 230\u2013400% margin over that cost."
+ },
+ {
+ "question": "How should credit top-ups for the $100 package be recorded?",
+ "answer": "A $100 top-up delivers 100,000 credits at $0.001 per credit. Credits expire according to the user's active plan rollover policy."
+ },
+ {
+ "question": "What response code should be returned when a rate limit is exceeded?",
+ "answer": "The API should return HTTP 429 with a Retry-After header indicating when the user can make their next request."
+ },
366
+ {
367
+ "question": "What are your support response hours?",
368
+ "answer": "Support response times depend on your plan and request priority. For urgent issues, include clear impact details and request IDs to speed up handling."
369
+ },
370
+ {
371
+ "question": "What is the fastest way to get technical help?",
372
+ "answer": "Open a support ticket with reproducible steps, request IDs, timestamps, and logs. This gives the team enough detail to diagnose quickly."
373
+ },
374
+ {
375
+ "question": "Can I contact support for billing questions?",
376
+ "answer": "Yes. Support can help with billing, credits, usage questions, and plan guidance."
377
+ },
378
+ {
379
+ "question": "How do I escalate a production incident?",
380
+ "answer": "Create a ticket marked as production-impacting and include service impact, error rate, affected endpoints, and recent deployment context."
381
+ },
382
+ {
383
+ "question": "Can support help with API integration reviews?",
384
+ "answer": "Yes. The team can review common integration issues such as authentication, retries, payload structure, and endpoint usage patterns."
385
+ },
386
+ {
387
+ "question": "What details should I include when reporting latency issues?",
388
+ "answer": "Include endpoint name, average and peak latency, timestamp range, region, payload size, and any retry behavior observed."
389
+ },
390
+ {
391
+ "question": "Can I contact support for feature requests?",
392
+ "answer": "Yes. You can submit feature requests for voices, languages, APIs, or dashboard improvements, and the team will evaluate roadmap fit."
393
+ },
394
+ {
395
+ "question": "Where should I contact 9jaLingo support?",
396
+ "answer": "Use the official contact and support channels listed on the 9jaLingo website and product dashboard."
397
+ },
398
+ {
399
+ "question": "What is the 9jaLingo Python SDK?",
400
+ "answer": "The 9jaLingo Python SDK is the official client library that simplifies API usage for TTS, streaming, speaker operations, and related workflows."
401
+ },
402
+ {
403
+ "question": "How do I install the 9jaLingo SDK?",
404
+ "answer": "Install the SDK with pip from PyPI, then configure your API key in environment variables or client initialization."
405
+ },
406
+ {
407
+ "question": "Does the SDK support streaming audio output?",
408
+ "answer": "Yes. The SDK supports streaming audio chunks so you can start playback before full generation completes."
409
+ },
410
+ {
411
+ "question": "Can I choose response formats through the SDK?",
412
+ "answer": "Yes. The SDK supports multiple output formats such as WAV, PCM, MP3, FLAC, AAC, ALAC, and OGG, depending on endpoint support."
413
+ },
414
+ {
415
+ "question": "Does the SDK support voice cloning workflows?",
416
+ "answer": "Yes. The SDK includes voice cloning operations where supported, including sending approved reference audio and generating cloned speech."
417
+ },
418
+ {
419
+ "question": "Which Python versions are recommended for the SDK?",
420
+ "answer": "Use a modern supported Python version as documented in the SDK README, and keep dependencies updated for best compatibility."
421
+ },
422
+ {
423
+ "question": "What is included in the voice-over service?",
424
+ "answer": "The voice-over service supports script-to-audio generation, speaker selection, language control, and export-ready narration for media workflows."
425
+ },
426
+ {
427
+ "question": "Can I generate voice-overs for ads and explainer videos?",
428
+ "answer": "Yes. Voice-over workflows are suitable for ads, explainers, social media clips, podcasts, and educational content."
429
+ },
430
+ {
431
+ "question": "Can I keep a consistent narrator voice across episodes?",
432
+ "answer": "Yes. You can reuse the same speaker profile and generation settings to maintain a consistent narration style across episodes or campaigns."
433
+ },
434
+ {
435
+ "question": "How do I improve voice-over script quality before generation?",
436
+ "answer": "Use clear punctuation, natural sentence breaks, and pronunciation-friendly wording, then test short samples before full export."
437
+ },
438
+ {
439
+ "question": "Can teams automate voice-over generation in pipelines?",
440
+ "answer": "Yes. Teams can automate voice-over production using API calls, batch jobs, and post-processing in their content pipelines."
441
+ },
442
+ {
443
+ "question": "Which export formats are best for voice-over delivery?",
444
+ "answer": "WAV is best for mastering and post-production, while MP3 or AAC are common for lightweight distribution and web playback."
445
+ },
446
+ {
447
+ "question": "Where can I read the 9jaLingo Terms of Service?",
448
+ "answer": "You can read the Terms of Service on the platform website at the /terms-of-service page."
449
+ },
450
+ {
451
+ "question": "What does accepting the Terms mean?",
452
+ "answer": "By using 9jaLingo, you agree to follow the platform rules and usage guidelines."
453
+ },
454
+ {
455
+ "question": "Can my account be terminated for misuse?",
456
+ "answer": "Yes. Accounts that violate the Terms or misuse platform services may be suspended or terminated."
457
+ },
458
+ {
459
+ "question": "Can 9jaLingo update its Terms of Service?",
460
+ "answer": "Yes. Terms may be updated from time to time, and continued use indicates acceptance of revised terms."
461
+ },
462
+ {
463
+ "question": "Does 9jaLingo require lawful platform use?",
464
+ "answer": "Yes. Users are expected to use all platform features only for lawful and responsible purposes."
465
+ },
466
+ {
467
+ "question": "Where can I read the 9jaLingo Privacy Policy?",
468
+ "answer": "You can read the Privacy Policy on the platform website at the /privacy-policy page."
469
+ },
470
+ {
471
+ "question": "What personal data does 9jaLingo collect?",
472
+ "answer": "The platform may collect account and technical data such as name, email, device information, IP address, and usage activity."
473
+ },
474
+ {
475
+ "question": "Does 9jaLingo sell personal data?",
476
+ "answer": "No. The privacy policy states that 9jaLingo does not sell personal information to third parties."
477
+ },
478
+ {
479
+ "question": "Does 9jaLingo use cookies?",
480
+ "answer": "Yes. Cookies are used for login persistence, user preferences, and performance analytics."
481
+ },
482
+ {
483
+ "question": "Can I request account or data deletion?",
484
+ "answer": "Yes. Users can request account deletion or data removal according to platform policy."
485
+ },
486
+ {
487
+ "question": "Where is the API documentation page on the website?",
488
+ "answer": "The API documentation is available on the /api-documentation page."
489
+ },
490
+ {
491
+ "question": "Is the API OpenAI-compatible?",
492
+ "answer": "Yes. The documentation describes OpenAI-compatible endpoint patterns for speech generation workflows."
493
+ },
494
+ {
495
+ "question": "What is the main TTS endpoint in the API docs?",
496
+ "answer": "The primary endpoint for speech generation is /v1/audio/speech."
497
+ },
498
+ {
499
+ "question": "Is there a dedicated streaming endpoint in the API docs?",
500
+ "answer": "Yes. The streaming endpoint is /v1/audio/speech/stream for progressive audio output."
501
+ },
502
+ {
503
+ "question": "Which language codes are shown in the API docs?",
504
+ "answer": "The docs reference language codes such as ha, ig, yo, and pcm."
505
+ },
506
+ {
507
+ "question": "How do I authenticate API calls from the docs examples?",
508
+ "answer": "Examples use Bearer token authentication with your API key in the Authorization header."
509
+ },
510
+ {
511
+ "question": "Does the API docs section show code examples in multiple languages?",
512
+ "answer": "Yes. The docs include examples in Python, JavaScript, and cURL."
513
+ },
514
+ {
515
+ "question": "Is there an official JavaScript SDK listed in the docs?",
516
+ "answer": "The API docs indicate there is no official JavaScript SDK yet and recommend direct REST usage."
517
+ },
518
+ {
519
+ "question": "Can I set generation controls like temperature and top_p through the API?",
520
+ "answer": "Yes. The examples show optional generation controls such as temperature, top_p, and repetition_penalty."
521
+ },
522
+ {
523
+ "question": "Can I pass a specific speaker ID in API requests?",
524
+ "answer": "Yes. Speaker IDs can be supplied in supported requests to control voice identity."
525
+ },
526
+ {
527
+ "question": "Does the frontend support email verification during auth?",
528
+ "answer": "Yes. The auth flow includes account/code verification pages."
529
+ },
530
+ {
531
+ "question": "Does 9jaLingo support 2FA verification in login flow?",
532
+ "answer": "Yes. The frontend includes a verify-2fa flow where users submit a one-time code."
533
+ },
534
+ {
535
+ "question": "Can users reset forgotten passwords from the frontend?",
536
+ "answer": "Yes. Forgot-password and reset-password routes are available for account recovery."
537
+ },
538
+ {
539
+ "question": "Is social authentication supported in the frontend?",
540
+ "answer": "Yes. The frontend includes Google and GitHub auth routes."
541
+ },
542
+ {
543
+ "question": "Is there a login error route in the auth system?",
544
+ "answer": "Yes. The frontend has a dedicated login-error route to handle authentication failures."
545
+ },
546
+ {
547
+ "question": "Can existing accounts be linked after social login?",
548
+ "answer": "Yes. The frontend includes a link-account flow for account linking scenarios."
549
+ },
550
+ {
551
+ "question": "Does the dashboard include API key management?",
552
+ "answer": "Yes. The dashboard includes a dedicated API Keys section."
553
+ },
554
+ {
555
+ "question": "Does the dashboard include usage analytics?",
556
+ "answer": "Yes. Usage analytics is available as a dashboard section for monitoring activity."
557
+ },
558
+ {
559
+ "question": "Can users manage subscriptions from the dashboard?",
560
+ "answer": "Yes. Subscription management is available in the dashboard modules."
561
+ },
562
+ {
563
+ "question": "Is there a support section inside the dashboard?",
564
+ "answer": "Yes. The dashboard includes a Support section for help-related tasks."
565
+ },
566
+ {
567
+ "question": "Is there a Voice Library section in the frontend dashboard?",
568
+ "answer": "Yes. A Voice Library section is present in the dashboard feature modules."
569
+ },
570
+ {
571
+ "question": "Can users access Speech-to-Text from the dashboard UI?",
572
+ "answer": "Yes. The frontend includes a dedicated Speech-to-Text section in the dashboard."
573
+ },
574
+ {
575
+ "question": "Can users access Text-to-Speech from the dashboard UI?",
576
+ "answer": "Yes. The dashboard provides a Text-to-Speech feature area."
577
+ },
578
+ {
579
+ "question": "Can users access Voice Cloning from the dashboard UI?",
580
+ "answer": "Yes. The dashboard includes a Voice Cloning section."
581
+ },
582
+ {
583
+ "question": "Does the frontend include a contact page?",
584
+ "answer": "Yes. The website includes a public contact page for inquiries."
585
+ },
586
+ {
587
+ "question": "Does the frontend include dedicated Terms and Privacy pages?",
588
+ "answer": "Yes. The site has separate /terms-of-service and /privacy-policy pages."
589
+ },
590
+ {
591
+ "question": "Can users add payment cards from the frontend flow?",
592
+ "answer": "Yes. The payment callback flow verifies card transactions and saves a payment method."
593
+ },
594
+ {
595
+ "question": "Which payment provider appears in the frontend card callback flow?",
596
+ "answer": "The callback flow references Paystack for card verification and setup."
597
+ },
598
+ {
599
+ "question": "Does the payment callback redirect users back to a subscription section?",
600
+ "answer": "Yes. After card processing, users are redirected to the dashboard subscription section with status parameters."
601
+ },
602
+ {
603
+ "question": "Can the frontend show payment setup success or failure status?",
604
+ "answer": "Yes. The callback flow sets payment status messages for success or error outcomes."
605
+ },
606
+ {
607
+ "question": "Does 9jaLingo support a LiveKit voice agent integration?",
608
+ "answer": "Yes. 9jaLingo supports a LiveKit-based real-time voice AI agent integration using a custom 9jaLingo TTS plugin."
609
+ },
610
+ {
611
+ "question": "Which command starts the LiveKit agent in development mode?",
612
+ "answer": "Run 'uv run agent.py dev' to start the LiveKit agent in development mode."
613
+ },
614
+ {
615
+ "question": "Can I test the LiveKit agent locally from terminal?",
616
+ "answer": "Yes. You can run 'uv run agent.py console' to test the agent in console mode without a full LiveKit room workflow."
617
+ },
618
+ {
619
+ "question": "What command is used for production agent startup?",
620
+ "answer": "Use 'uv run agent.py start' for production mode startup."
621
+ },
622
+ {
623
+ "question": "Does the LiveKit agent support Nigerian-language TTS through 9jaLingo server?",
624
+ "answer": "Yes. The agent uses the custom NaijaLingo TTS plugin and calls the 9jaLingo server endpoint for synthesis."
625
+ },
626
+ {
627
+ "question": "Which environment variables are required for 9jaLingo TTS in the LiveKit agent?",
628
+ "answer": "Set NAIJALINGO_BASE_URL, NAIJALINGO_SPEAKER, and NAIJALINGO_LANGUAGE, plus your API key and LiveKit/OpenAI credentials."
629
+ },
630
+ {
631
+ "question": "Can the LiveKit agent use telephony-optimized noise cancellation?",
632
+ "answer": "Yes. For telephony scenarios, you can switch to BVCTelephony noise cancellation for improved call audio handling."
633
+ },
634
+ {
635
+ "question": "Does the LiveKit integration support multilingual turn detection?",
636
+ "answer": "Yes. The integration includes multilingual turn-detection support for more natural conversational turn-taking."
637
+ },
638
+ {
639
+ "question": "Can I deploy the LiveKit agent to LiveKit Cloud?",
640
+ "answer": "Yes. You can deploy with the LiveKit CLI, which helps generate required deployment files and register the agent in LiveKit Cloud."
641
+ },
642
+ {
643
+ "question": "How do I verify the 9jaLingo TTS server health for LiveKit integration?",
644
+ "answer": "You can call the health endpoint on your configured base URL, for example '/v1/health', to confirm the TTS server is available."
645
+ }
646
+ ]
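The FAQ file above is a flat JSON array of `question`/`answer` objects, and the ingestion code later in this commit reads both fields with an empty-string default. A minimal sketch of a pre-ingestion check under that assumption follows; `validate_faq` and the sample entries are hypothetical and not part of this commit.

```python
import json


def validate_faq(items: list[dict]) -> list[str]:
    """Return human-readable problems found in a list of FAQ entries."""
    problems: list[str] = []
    seen: set[str] = set()
    for i, item in enumerate(items):
        question = str(item.get("question", "")).strip()
        answer = str(item.get("answer", "")).strip()
        if not question:
            problems.append(f"entry {i}: missing question")
        if not answer:
            problems.append(f"entry {i}: missing answer")
        # Compare questions case-insensitively to catch near-identical repeats.
        key = question.lower()
        if key and key in seen:
            problems.append(f"entry {i}: duplicate question")
        seen.add(key)
    return problems


# Hypothetical two-entry sample in the same shape as data/faq.json.
sample = json.loads(
    '[{"question": "Is the API OpenAI-compatible?", "answer": "Yes."},'
    ' {"question": "Is the API OpenAI-compatible?", "answer": ""}]'
)
problems = validate_faq(sample)
print(problems)
```

Running a check like this before embedding would surface empty or repeated entries that would otherwise be indexed silently.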
main.py ADDED
@@ -0,0 +1,118 @@
+from contextlib import asynccontextmanager
+from datetime import datetime
+from typing import Optional
+
+import uvicorn
+from fastapi import FastAPI, HTTPException, status
+from pydantic import BaseModel, Field
+from fastapi.middleware.cors import CORSMiddleware
+from fastapi.responses import StreamingResponse
+
+from src.chat_service import chat_service
+
+class ChatRequest(BaseModel):
+    message: str = Field(..., min_length=1, max_length=4096, description="User input message")
+    thread_id: Optional[str] = Field(default="default", description="Conversation ID for memory tracking")
+
+class ChatResponse(BaseModel):
+    response: str = Field(..., description="Assistant's response")
+    thread_id: str = Field(..., description="Conversation ID used for memory tracking")
+    timestamp: datetime = Field(default_factory=datetime.now)
+
+class HealthResponse(BaseModel):
+    status: str = Field(..., description="Service status")
+    timestamp: datetime = Field(default_factory=datetime.now)
+
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    # Startup
+    print("Starting up the application...")
+    yield
+    # Shutdown
+    print("Shutting down the application...")
+
+app = FastAPI(
+    title="9jaLingo RAG Chat API",
+    description="RAG API for interacting with the 9jaLingo support chatbot",
+    version="1.0.0",
+    lifespan=lifespan
+)
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+@app.get("/health",
+         response_model=HealthResponse,
+         status_code=status.HTTP_200_OK,
+         tags=["Health"])
+async def health_check():
+    """
+    Endpoint to check if the service is running.
+    Returns a 200 OK response if the service is healthy.
+    """
+    try:
+        return HealthResponse(
+            status="healthy",
+            timestamp=datetime.now()
+        )
+    except Exception as e:
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail=f"Service health check failed: {str(e)}"
+        )
+
+@app.post("/chat",
+          response_model=ChatResponse,
+          status_code=status.HTTP_200_OK,
+          tags=["Chat"])
+async def chat_endpoint(request: ChatRequest):
+    try:
+        thread_id = request.thread_id or "default"
+        response = chat_service.chat(request.message, thread_id)
+
+        return ChatResponse(
+            response=response,
+            thread_id=thread_id,
+            timestamp=datetime.now()
+        )
+
+    except ValueError as ve:
+        raise HTTPException(
+            status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
+            detail=str(ve)
+        )
+    except Exception as e:
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail=f"Error processing chat request: {str(e)}"
+        )
+
+
+@app.post("/stream", tags=["Chat"])
+async def stream_endpoint(request: ChatRequest):
+    try:
+        thread_id = request.thread_id or "default"
+
+        def generate():
+            yield from chat_service.stream(request.message, thread_id)
+
+        return StreamingResponse(generate(), media_type="text/plain")
+    except Exception as e:
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail=f"Error streaming chat response: {str(e)}"
+        )
+
+if __name__ == "__main__":
+    uvicorn.run(
+        "main:app",
+        host="0.0.0.0",
+        port=8000,
+        reload=True,
+        workers=1
+    )
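One detail of `stream_endpoint` may be worth illustrating: `generate()` is a lazy generator, so its body runs only while the response is being sent, and the surrounding `try/except` cannot see errors raised later by `chat_service.stream`. The sketch below demonstrates this with plain generators and no FastAPI dependency; `failing_stream`, `safe_stream`, and the error text are hypothetical stand-ins, not code from this commit.

```python
from collections.abc import Iterator


def failing_stream() -> Iterator[str]:
    """Stand-in for a model stream that fails part-way through."""
    yield "Hello"
    raise RuntimeError("model backend unavailable")


def safe_stream(chunks: Iterator[str]) -> Iterator[str]:
    """Relay chunks; on failure, emit a final marker instead of raising."""
    try:
        yield from chunks
    except Exception as exc:
        yield f"\n[stream error: {exc}]"


# Creating the generator runs none of its body, so this except never fires.
try:
    gen = failing_stream()
    created_ok = True
except Exception:
    created_ok = False

# The failure only surfaces during iteration, where safe_stream handles it.
received = list(safe_stream(gen))
print(created_ok, received)
```

Handling the error inside the generator is one option; by the time iteration fails, the status line and headers have typically already been sent, so a 500 response can no longer be returned.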
pyproject.toml ADDED
@@ -0,0 +1,25 @@
+[project]
+name = "9jalingo-bot"
+version = "0.1.0"
+description = "RAG/FastAPI support assistant for 9jaLingo"
+readme = "README.md"
+requires-python = ">=3.12"
+dependencies = [
+    "langchain",
+    "langchain-core",
+    "python-dotenv",
+    "langchain-chroma",
+    "chromadb",
+    "fastapi[standard]",
+    "langchain-ollama",
+]
+
+[build-system]
+requires = ["hatchling>=1.25.0"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src"]
+
+[tool.uv]
+package = true
requirements.txt ADDED
@@ -0,0 +1,7 @@
+fastapi[standard]
+langchain
+langchain-core
+langchain-chroma
+chromadb
+langchain-ollama
+python-dotenv
src/__init__.py ADDED
File without changes
src/chat_service.py ADDED
@@ -0,0 +1,106 @@
+from __future__ import annotations
+
+import os
+from collections import defaultdict, deque
+from collections.abc import Generator
+from dataclasses import dataclass
+
+from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage
+from langchain_ollama import ChatOllama
+
+from src.ingest import get_or_build_vectorstore
+
+
+MAX_MEMORY_TURNS = int(os.getenv("RAG_MEMORY_TURNS", "6"))
+LLM_MODEL = os.getenv("LLM_MODEL", "hf.co/LiquidAI/LFM2-1.2B-RAG-GGUF:Q5_K_M")
+
+
+@dataclass
+class MemoryTurn:
+    user_message: str
+    assistant_message: str
+
+
+class ConversationMemory:
+    def __init__(self, max_turns: int = MAX_MEMORY_TURNS) -> None:
+        self._max_turns = max_turns
+        self._store: dict[str, deque[MemoryTurn]] = defaultdict(lambda: deque(maxlen=self._max_turns))
+
+    def append(self, conversation_id: str, user_message: str, assistant_message: str) -> None:
+        self._store[conversation_id].append(
+            MemoryTurn(user_message=user_message, assistant_message=assistant_message)
+        )
+
+    def format_history(self, conversation_id: str) -> str:
+        history = self._store.get(conversation_id)
+        if not history:
+            return "No previous conversation."
+
+        lines: list[str] = []
+        for turn in history:
+            lines.append(f"User: {turn.user_message}")
+            lines.append(f"Assistant: {turn.assistant_message}")
+        return "\n".join(lines)
+
+
+class RagChatService:
+    def __init__(self, k: int = 4) -> None:
+        self._k = k
+        self._vectorstore = None
+        self._retriever = None
+        self._llm = None
+        self._memory = ConversationMemory()
+
+    def _get_retriever(self):
+        if self._retriever is None:
+            self._vectorstore = get_or_build_vectorstore()
+            self._retriever = self._vectorstore.as_retriever(search_kwargs={"k": self._k})
+        return self._retriever
+
+    def _get_llm(self) -> ChatOllama:
+        if self._llm is None:
+            self._llm = ChatOllama(model=LLM_MODEL, temperature=0.2)
+        return self._llm
+
+    def _format_context(self, question: str) -> str:
+        docs = self._get_retriever().invoke(question)
+        if not docs:
+            return "No relevant FAQ context found."
+        return "\n\n".join(doc.page_content for doc in docs)
+
+    def _build_messages(self, question: str, conversation_id: str) -> list[BaseMessage]:
+        history = self._memory.format_history(conversation_id)
+        context = self._format_context(question)
+        system_prompt = (
+            "You are a concise and helpful support assistant for 9jaLingo, a voice AI platform. "
+            "Use only the provided FAQ context and recent conversation history. "
+            "If the answer is not in the context, say that clearly and direct the user to official support.\n\n"
+            f"Conversation history:\n{history}\n\n"
+            f"FAQ context:\n{context}"
+        )
+        return [
+            SystemMessage(content=system_prompt),
+            HumanMessage(content=question),
+        ]
+
+    def chat(self, question: str, conversation_id: str) -> str:
+        messages = self._build_messages(question, conversation_id)
+        response = self._get_llm().invoke(messages)
+        answer = response.content if isinstance(response.content, str) else str(response.content)
+        self._memory.append(conversation_id, question, answer)
+        return answer
+
+    def stream(self, question: str, conversation_id: str) -> Generator[str, None, None]:
+        messages = self._build_messages(question, conversation_id)
+        parts: list[str] = []
+        for chunk in self._get_llm().stream(messages):
+            content = chunk.content if isinstance(chunk.content, str) else str(chunk.content)
+            if not content:
+                continue
+            parts.append(content)
+            yield content
+
+        self._memory.append(conversation_id, question, "".join(parts))
+
+
+chat_service = RagChatService()
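`ConversationMemory` bounds each conversation with `deque(maxlen=...)` inside a `defaultdict`. A small standalone sketch of that behaviour follows; the window of 2 and the thread names are chosen only for the demo (the service defaults to 6 turns via `RAG_MEMORY_TURNS`).

```python
from collections import defaultdict, deque

MAX_TURNS = 2  # demo value; hypothetical, smaller than the service default

# One bounded deque per conversation ID, created on first write.
store: dict[str, deque[tuple[str, str]]] = defaultdict(
    lambda: deque(maxlen=MAX_TURNS)
)

for n in range(1, 4):
    store["thread-a"].append((f"question {n}", f"answer {n}"))

# The oldest turn was dropped once the window filled up.
kept = [user for user, _ in store["thread-a"]]
print(kept)

# .get() does not trigger the default factory, so unknown threads stay absent.
unknown = store.get("thread-b")
print(unknown, "thread-b" in store)
```

Because the store is an in-process dictionary, history is lost on restart and is not shared between worker processes; the per-thread window is bounded, but the number of threads is not.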
src/chatbot.py ADDED
@@ -0,0 +1,58 @@
+"""
+chatbot.py — standalone RAG chain helpers for 9jaLingo FAQ chatbot.
+"""
+
+from __future__ import annotations
+
+import os
+from operator import itemgetter
+
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.prompts import ChatPromptTemplate
+from langchain_core.runnables import RunnableParallel
+from langchain_ollama import ChatOllama
+
+from src.ingest import get_or_build_vectorstore
+
+
+LLM_MODEL = os.getenv("LLM_MODEL", "hf.co/LiquidAI/LFM2-1.2B-RAG-GGUF:Q5_K_M")
+
+SYSTEM_PROMPT = """You are a friendly and knowledgeable support assistant for 9jaLingo,
+a voice AI platform for African language speech products.
+
+Answer the user's question using ONLY the context provided below.
+If the context does not contain enough information to answer, say so politely
+and suggest the user visit https://www.9jalingo.org or contact support.
+
+Context:
+{context}
+"""
+
+
+def _format_docs(docs) -> str:  # type: ignore[type-arg]
+    return "\n\n".join(doc.page_content for doc in docs)
+
+
+def build_rag_chain(k: int = 4):
+    vectorstore = get_or_build_vectorstore()
+    retriever = vectorstore.as_retriever(search_kwargs={"k": k})
+
+    llm = ChatOllama(model=LLM_MODEL, temperature=0.2)
+    prompt = ChatPromptTemplate.from_messages(
+        [
+            ("system", SYSTEM_PROMPT),
+            ("human", "{question}"),
+        ]
+    )
+
+    setup = RunnableParallel(
+        context=itemgetter("question") | retriever | _format_docs,
+        question=itemgetter("question"),
+    )
+
+    return setup | prompt | llm | StrOutputParser()
+
+
+def stream_rag_chain(question: str, k: int = 4):
+    chain = build_rag_chain(k=k)
+    yield from chain.stream({"question": question})
src/ingest.py ADDED
@@ -0,0 +1,87 @@
+"""
+ingest.py — Load FAQ JSON, create LangChain Documents, and store
+embeddings in a local ChromaDB collection.
+
+Run directly to (re)build the vector store:
+    python -m src.ingest
+"""
+
+from __future__ import annotations
+
+import json
+import os
+from pathlib import Path
+
+from langchain_core.documents import Document
+from langchain_chroma import Chroma
+from langchain_ollama import OllamaEmbeddings
+
+# Paths: keep knowledge data and vector store under data/
+_HERE = Path(__file__).parent
+_RAG_DIR = _HERE.parent / "data"
+FAQ_PATH = _RAG_DIR / "faq.json"
+CHROMA_DIR = _RAG_DIR / "chroma_db"
+
+EMBED_MODEL = os.getenv("EMBED_MODEL", "embeddinggemma:latest")
+COLLECTION_NAME = "naijalingo_faq"
+
+
+def load_faq_documents(faq_path: Path = FAQ_PATH) -> list[Document]:
+    with open(faq_path, encoding="utf-8") as f:
+        items = json.load(f)
+
+    docs: list[Document] = []
+    for i, item in enumerate(items):
+        question = item.get("question", "").strip()
+        answer = item.get("answer", "").strip()
+        content = f"Question: {question}\nAnswer: {answer}"
+        docs.append(
+            Document(
+                page_content=content,
+                metadata={"source": "faq.json", "index": i, "question": question},
+            )
+        )
+    return docs
+
+
+def build_vectorstore(
+    faq_path: Path = FAQ_PATH,
+    chroma_dir: Path = CHROMA_DIR,
+    embed_model: str = EMBED_MODEL,
+) -> Chroma:
+    docs = load_faq_documents(faq_path)
+    embeddings = OllamaEmbeddings(model=embed_model)
+
+    chroma_dir.mkdir(parents=True, exist_ok=True)
+    vectorstore = Chroma.from_documents(
+        documents=docs,
+        embedding=embeddings,
+        collection_name=COLLECTION_NAME,
+        persist_directory=str(chroma_dir),
+    )
+    print(f"[ingest] Indexed {len(docs)} FAQ entries -> {chroma_dir}")
+    return vectorstore
+
+
+def load_vectorstore(
+    chroma_dir: Path = CHROMA_DIR,
+    embed_model: str = EMBED_MODEL,
+) -> Chroma:
+    embeddings = OllamaEmbeddings(model=embed_model)
+    return Chroma(
+        collection_name=COLLECTION_NAME,
+        embedding_function=embeddings,
+        persist_directory=str(chroma_dir),
+    )
+
+
+def get_or_build_vectorstore() -> Chroma:
+    if CHROMA_DIR.exists() and any(CHROMA_DIR.iterdir()):
+        print("[ingest] Loading existing vector store from disk...")
+        return load_vectorstore()
+    print("[ingest] Building vector store for the first time...")
+    return build_vectorstore()
+
+
+if __name__ == "__main__":
+    build_vectorstore()
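`build_vectorstore` passes no explicit document IDs, so re-running `python -m src.ingest` against an existing `chroma_db` directory may add a second copy of every FAQ entry instead of replacing the first. That is an assumption about default ID generation and should be checked against the installed `langchain-chroma` version. A minimal sketch of deriving deterministic IDs from the question text follows; `stable_id` is a hypothetical helper, not part of this commit.

```python
import hashlib


def stable_id(question: str) -> str:
    """Derive a deterministic ID from the normalized question text."""
    # Lowercase and collapse whitespace so trivial edits keep the same ID.
    normalized = " ".join(question.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


questions = [
    "Is the API OpenAI-compatible?",
    "Does 9jaLingo use cookies?",
]
ids = [stable_id(q) for q in questions]
print(ids)
```

If the installed `Chroma.from_documents` accepts an `ids` argument, a list like this could be passed alongside the documents so that a rebuild overwrites existing entries; treat that as something to confirm rather than a guaranteed behaviour.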
uv.lock ADDED
The diff for this file is too large to render. See raw diff