nexusbert committed on
Commit
e03d777
·
1 Parent(s): f72bb28
SYSTEM_OVERVIEW.md ADDED
@@ -0,0 +1,568 @@
1
+ # TerraSyncra AI – Product & System Overview
2
+
3
+ ## 1. Product Introduction
4
+
5
+ **TerraSyncra** is a multilingual agricultural intelligence agent designed for Nigerian (and other African) farmers. It answers farming questions, analyzes soil reports, flags likely plant and animal diseases, and delivers weather and agricultural news through text, voice, and image inputs.
6
+
7
+ **Key Capabilities:**
8
+ - **Agricultural Q&A**: Answers questions about crops, livestock, soil, weather, pests, and diseases in multiple languages
9
+ - **Soil Analysis**: Provides expert soil health assessments from lab reports and field data using Gemini 3 Flash
10
+ - **Disease Detection**: Identifies plant and animal diseases from images, text descriptions, or voice input using Gemini 2.5 Flash
11
+ - **Live Agricultural Updates**: Delivers real-time weather information and agricultural news through RAG (Retrieval-Augmented Generation)
12
+ - **Live Voice Interaction**: Supports real-time voice conversations via WebSocket in local languages (Igbo, Hausa, Yoruba, English)
13
+
14
+ **Developer**: Ifeanyi Amogu Shalom
15
+ **Target Users**: Farmers, agronomists, agricultural extension officers, and agricultural support workers in Nigeria and similar contexts
16
+
17
+ ---
18
+
19
+ ## 2. Problem Statement
20
+
21
+ Nigerian smallholder farmers face significant challenges:
22
+
23
+ ### 2.1 Limited Access to Agricultural Experts
24
+ - **Scarcity of agronomists and veterinarians** relative to the large farming population
25
+ - **Geographic barriers** preventing farmers from accessing expert advice
26
+ - **High consultation costs** that many smallholder farmers cannot afford
27
+ - **Long waiting times** for professional consultations, especially during critical periods (disease outbreaks, planting seasons)
28
+
29
+ ### 2.2 Language Barriers
30
+ - Most agricultural information and resources are in **English**, while many farmers primarily speak **Hausa, Igbo, or Yoruba**
31
+ - **Technical terminology** is not easily accessible in local languages
32
+ - **Translation services** are often unavailable or unreliable
33
+
34
+ ### 2.3 Fragmented Information Sources
35
+ - Weather data, soil reports, disease information, and market prices are scattered across different platforms
36
+ - **No unified system** to integrate and interpret multiple data sources
37
+ - **Information overload** without proper context or prioritization
38
+
39
+ ### 2.4 Time-Sensitive Decision Making
40
+ - **Disease outbreaks** require immediate identification and treatment
41
+ - **Weather changes** affect planting, harvesting, and irrigation decisions
42
+ - **Pest attacks** can devastate crops if not addressed quickly
43
+ - **Delayed responses** lead to significant economic losses
44
+
45
+ ### 2.5 Solution Approach
46
+ TerraSyncra addresses these challenges by providing:
47
+ - **Fast, AI-powered responses** available 24/7
48
+ - **Multilingual support** (English, Igbo, Hausa, Yoruba)
49
+ - **Integrated intelligence** combining expert models, RAG, and live data
50
+ - **Accessible interface** via text, voice, and image inputs
51
+ - **Professional consultation reminders** to ensure farmers seek expert confirmation when needed
52
+
53
+ ---
54
+
55
+ ## 3. System Architecture & Request Flows
56
+
57
+ ### 3.1 General Agricultural Q&A – `POST /ask`
58
+
59
+ **Step-by-Step Process:**
60
+
61
+ 1. **Input Reception**
62
+ - User sends `query` (text) with optional `session_id` for conversation continuity
63
+
64
+ 2. **Language Detection**
65
+ - FastText model (`facebook/fasttext-language-identification`) detects input language
66
+ - Supports: English, Igbo, Hausa, Yoruba
67
+
68
+ 3. **Translation (if needed)**
69
+ - If language ≠ English, translates to English using NLLB (`drrobot9/nllb-ig-yo-ha-finetuned`)
70
+ - Preserves original language for back-translation
71
+
72
+ 4. **Intent Detection**
73
+ - Classifies query into categories:
74
+ - **Weather question**: Requests weather information (with/without Nigerian state)
75
+ - **Live update**: Requests current agricultural news or updates
76
+ - **Normal question**: General agricultural Q&A
77
+ - **Low confidence**: Falls back to RAG when intent is unclear
78
+
79
+ 5. **Context Building**
80
+ - **Weather intent**: Calls WeatherAPI for state-specific weather data, embeds summary into context
81
+ - **Live update intent**: Queries live FAISS vectorstore index for latest agricultural documents
82
+ - **Low confidence**: Falls back to static FAISS index for safer, more general responses
83
+
84
+ 6. **Conversation Memory**
85
+ - Loads per-session history from `MemoryStore` (TTL cache, 1-hour expiration)
86
+ - Trims to `MAX_HISTORY_MESSAGES` (default: 30) to prevent context overflow
87
+
88
+ 7. **Expert Model Generation**
89
+ - Uses **Qwen/Qwen1.5-1.8B** (finetuned for Nigerian agriculture)
90
+ - Loaded lazily via `model_manager` (CPU-optimized, first-use loading)
91
+ - Builds chat messages: system prompt + conversation history + current user message + context
92
+ - System prompt restricts responses to **agriculture/farming topics only**
93
+ - Generates bounded-length answer (reduced token limit: 400 tokens for general, 256 for weather)
94
+ - Cleans response to remove any "Human: / Assistant:" style example continuations
95
+
96
+ 8. **Back-Translation**
97
+ - If original language ≠ English, translates answer back to user's language using NLLB
98
+
99
+ 9. **Response**
100
+ - Returns JSON: `{ query, answer, session_id, detected_language }`
101
+
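The nine steps above can be sketched as a small orchestration function. This is a minimal sketch, not the project's actual code: `AskPipeline` and its four injected callables are hypothetical names, and intent detection and conversation memory are folded into `build_context` and `generate` for brevity. The language codes (`en`, `ha`, ...) follow the codes listed for the disease endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AskPipeline:
    """Hypothetical wiring of the /ask steps; every dependency is injected."""

    detect_language: Callable[[str], str]        # e.g. a FastText wrapper
    translate: Callable[[str, str, str], str]    # (text, source_lang, target_lang)
    build_context: Callable[[str], str]          # weather / live RAG / static RAG
    generate: Callable[[str, str], str]          # (english_query, context) -> answer

    def ask(self, query: str, session_id: str) -> Dict[str, str]:
        # Steps 2-3: detect the language and translate to English if needed.
        lang = self.detect_language(query)
        english_query = query if lang == "en" else self.translate(query, lang, "en")

        # Steps 4-7: build context and generate the answer in English.
        context = self.build_context(english_query)
        answer = self.generate(english_query, context)

        # Step 8: translate the answer back to the user's language.
        if lang != "en":
            answer = self.translate(answer, "en", lang)

        # Step 9: same response shape as documented above.
        return {
            "query": query,
            "answer": answer,
            "session_id": session_id,
            "detected_language": lang,
        }
```

Injecting the callables keeps the flow testable without loading any model: each step can be replaced by a stub.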
102
+ **Safety & Focus:**
103
+ - System prompt enforces agriculture-only topic handling
104
+ - Unrelated questions are redirected back to farming topics
105
+ - Response cleaning prevents off-topic example continuations
106
+
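Step 6 of the flow above (per-session history with a 1-hour TTL, trimmed to `MAX_HISTORY_MESSAGES`) could look roughly like the sketch below. The class name `MemoryStoreSketch`, the injectable `clock`, and the choice to refresh the TTL on every write are assumptions made for illustration; the real `MemoryStore` may behave differently.

```python
import time
from typing import Callable, Dict, List, Tuple

MAX_HISTORY_MESSAGES = 30    # default stated in the overview
SESSION_TTL_SECONDS = 3600   # 1-hour expiration stated in the overview


class MemoryStoreSketch:
    """Illustrative in-memory session store with expiry and history trimming."""

    def __init__(
        self,
        ttl: float = SESSION_TTL_SECONDS,
        max_messages: int = MAX_HISTORY_MESSAGES,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._max = max_messages
        self._clock = clock
        # session_id -> (expiry timestamp, message history)
        self._sessions: Dict[str, Tuple[float, List[dict]]] = {}

    def get(self, session_id: str) -> List[dict]:
        entry = self._sessions.get(session_id)
        if entry is None:
            return []
        expires_at, history = entry
        if self._clock() >= expires_at:
            # Expired sessions are dropped lazily on access.
            del self._sessions[session_id]
            return []
        return list(history)

    def append(self, session_id: str, role: str, content: str) -> None:
        history = self.get(session_id)
        history.append({"role": role, "content": content})
        # Keep only the most recent messages to avoid context overflow.
        history = history[-self._max:]
        # Assumption: each write refreshes the session's TTL.
        self._sessions[session_id] = (self._clock() + self._ttl, history)
```

Passing a fake `clock` makes expiry easy to exercise in tests without sleeping.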
107
+ ---
108
+
109
+ ### 3.2 Soil Analysis – `POST /analyze-soil`
110
+
111
+ **Step-by-Step Process:**
112
+
113
+ 1. **Input Reception**
114
+ - `report_data`: Text description of soil report or lab results (required)
115
+ - Optional fields: `location`, `crop_type`, `field_size`, `previous_crops`, `additional_notes`
116
+
117
+ 2. **Agent Processing**
118
+ - `soil_agent.analyze_soil()` builds comprehensive prompt with:
119
+ - Soil report data
120
+ - Field information (location, crop type, size, history)
121
+ - Regional context (Nigerian states, climate patterns)
122
+
123
+ 3. **Gemini API Call**
124
+ - Model: `GEMINI_SOIL_MODEL = "gemini-3-flash-preview"`
125
+ - Prompt style: Brief, direct, actionable
126
+ - Focuses on:
127
+ - Current soil condition (short summary)
128
+ - Key nutrient issues (deficiencies or excesses)
129
+ - 1–3 best crops for this soil type
130
+ - Clear fertilizer and amendment recommendations
131
+ - Simple soil improvement steps
132
+
133
+ 4. **Output**
134
+ - JSON response: `{ success, analysis, model_used }`
135
+
136
+ **Important Note:**
137
+ > Soil analysis is **advisory only** – not a formal agronomy diagnosis. The UI should encourage farmers to confirm with a local agronomist or extension officer for critical decisions.
138
+
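As an illustration of step 2 above, the prompt passed to Gemini might be assembled from the required `report_data` and whichever optional fields are present. This is a sketch under assumptions: `build_soil_prompt` and the section labels are hypothetical, and the real `soil_agent.analyze_soil()` may format its prompt differently.

```python
from typing import Optional


def build_soil_prompt(
    report_data: str,
    location: Optional[str] = None,
    crop_type: Optional[str] = None,
    field_size: Optional[str] = None,
    previous_crops: Optional[str] = None,
    additional_notes: Optional[str] = None,
) -> str:
    """Combine the soil report and any optional field details into one prompt."""
    if not report_data or not report_data.strip():
        raise ValueError("report_data is required")

    lines = ["Soil report:", report_data.strip()]

    optional_fields = {
        "Location": location,
        "Crop type": crop_type,
        "Field size": field_size,
        "Previous crops": previous_crops,
        "Additional notes": additional_notes,
    }
    # Only include the fields the caller actually supplied.
    field_lines = [f"- {label}: {value}" for label, value in optional_fields.items() if value]
    if field_lines:
        lines.append("Field information:")
        lines.extend(field_lines)

    return "\n".join(lines)
```

Skipping empty fields keeps the prompt short, which matches the "brief, direct" prompt style described above.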
139
+ ---
140
+
141
+ ### 3.3 Disease Detection
142
+
143
+ #### 3.3.1 Image-Based Detection – `POST /detect-disease-image`
144
+
145
+ **Step-by-Step Process:**
146
+
147
+ 1. **Input Reception**
148
+ - Image file (JPEG, PNG, etc.)
149
+ - Optional `query`: Text description or question
150
+
151
+ 2. **Agent Processing**
152
+ - `disease_agent.classify_disease_from_image()` processes:
153
+ - Image bytes + MIME type
154
+ - User query (if provided)
155
+ - Builds structured prompt for Gemini
156
+
157
+ 3. **Gemini API Call**
158
+ - Model: `GEMINI_DISEASE_MODEL = "gemini-2.5-flash"`
159
+ - Prompt instructs Gemini to provide:
160
+ - Disease name (scientific + common name) in 1 short line
161
+ - **Threat level: Low / Moderate / High / Uncertain** (MANDATORY)
162
+ - 2–3 key symptoms visible in image
163
+ - 2–3 clear treatment steps (bullets)
164
+ - 1–2 simple prevention tips
165
+ - Brief, direct language with short sentences
166
+
167
+ 4. **Backend Safety Enforcement**
168
+ - Backend **always appends** disclaimer:
169
+ > "IMPORTANT: This threat level is an estimate based only on the image and description. For an accurate diagnosis and treatment plan, please consult a qualified agronomist, veterinary doctor, or local agricultural extension officer."
170
+
171
+ 5. **Output**
172
+ - JSON response: `{ success, classification, model_used, input_type }`
173
+
174
+ #### 3.3.2 Text/Voice-Based Detection – `POST /detect-disease-text`
175
+
176
+ **Step-by-Step Process:**
177
+
178
+ 1. **Input Reception**
179
+ - `description`: Text description of disease symptoms or condition
180
+ - `language`: Language code (en, ig, ha, yo)
181
+
182
+ 2. **Agent Processing**
183
+ - `disease_agent.classify_disease_from_text()` processes:
184
+ - Text description
185
+ - Language context
186
+ - Builds structured prompt for Gemini
187
+
188
+ 3. **Gemini API Call**
189
+ - Same model and prompt structure as image-based detection
190
+ - Threat level assessment based on described symptoms
191
+
192
+ 4. **Backend Safety Enforcement**
193
+ - Equivalent disclaimer appended, worded for text input ("based only on your description")
194
+
195
+ 5. **Output**
196
+ - JSON response: `{ success, classification, model_used, input_type }`
197
+
198
+ **Threat Level Guidelines:**
199
+ - **Low**: Mild or early-stage issue, unlikely to cause major losses if addressed soon
200
+ - **Moderate**: Noticeable risk that can reduce yield/health if not treated
201
+ - **High**: Serious or fast-spreading issue that can cause major losses or death (use cautiously, only when clearly severe)
202
+ - **Uncertain**: Insufficient or ambiguous data; model cannot safely rate risk (encouraged when not confident)
203
+
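Because the threat level arrives inside free-form model text, a client or backend may want to pull it out as a structured value. The sketch below is one possible approach, not part of the documented API: `extract_threat_level` is a hypothetical helper, and defaulting to `Uncertain` when no level is found is an assumption that follows the cautious policy described above.

```python
import re

THREAT_LEVELS = ("Low", "Moderate", "High", "Uncertain")

# Matches e.g. "Threat level: High", "**Threat level:** moderate", "Threat Level - low".
_THREAT_PATTERN = re.compile(
    r"threat\s*level\W*(low|moderate|high|uncertain)\b",
    re.IGNORECASE,
)


def extract_threat_level(classification_text: str) -> str:
    """Return the first stated threat level, or "Uncertain" if none is found."""
    match = _THREAT_PATTERN.search(classification_text)
    if match is None:
        return "Uncertain"
    return match.group(1).capitalize()
```

Falling back to `Uncertain` means a malformed response is never silently displayed as low risk.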
204
+ ---
205
+
206
+ ### 3.4 Live Voice Interaction – `WS /live-voice` & `POST /live-voice-start`
207
+
208
+ **Step-by-Step Process:**
209
+
210
+ 1. **WebSocket Connection**
211
+ - Client connects to `/live-voice` endpoint
212
+ - Optional: Send image as JSON (base64 encoded) at session start
213
+ - Audio chunks streamed as raw PCM bytes (16 kHz, mono, 16-bit)
214
+
215
+ 2. **Agent Processing**
216
+ - `live_voice_agent.handle_live_voice_websocket()` manages:
217
+ - WebSocket connection lifecycle
218
+ - Image context (if provided)
219
+ - Audio streaming to Gemini Live API
220
+ - Audio response streaming back to client
221
+
222
+ 3. **Gemini Live API**
223
+ - Model: `gemini-2.5-flash` via Gemini Live API
224
+ - System prompt: Brief, clear, focused on "what to do next" (2–4 key steps)
225
+ - Supports: Disease detection, soil analysis, general farming, weather
226
+ - Prefers short sentences and bullet points
227
+
228
+ 4. **Response Streaming**
229
+ - Audio responses streamed back as PCM bytes
230
+ - Optional JSON messages for status/transcripts
231
+
232
+ 5. **Safety Expectations**
233
+ - Same professional advice principle applies
234
+ - Frontends should display clear "not a replacement for a professional" banner
235
+
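The audio format in step 1 (16 kHz, mono, 16-bit PCM) fixes the byte rate at 16,000 × 1 × 2 = 32,000 bytes per second, so chunk sizes follow directly from the chunk duration. The sketch below shows that arithmetic; the helper names and the 100 ms default are assumptions, since the overview does not state a chunk length.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # from the overview
CHANNELS = 1              # mono
BYTES_PER_SAMPLE = 2      # 16-bit samples


def chunk_size_bytes(duration_ms: int) -> int:
    """Number of PCM bytes needed to hold `duration_ms` of audio."""
    return SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * duration_ms // 1000


def iter_pcm_chunks(pcm: bytes, duration_ms: int = 100) -> Iterator[bytes]:
    """Split a PCM buffer into fixed-duration chunks; the last may be shorter."""
    size = chunk_size_bytes(duration_ms)
    if size <= 0:
        raise ValueError("duration_ms must be positive")
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
```

With these constants a 100 ms chunk is 3,200 bytes, and each chunk could be sent as one binary WebSocket message.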
236
+ ---
237
+
238
+ ## 4. Technologies Used
239
+
240
+ ### 4.1 Backend Framework & Infrastructure
241
+ - **FastAPI**: Modern Python web framework for building REST APIs and WebSocket endpoints
242
+ - **Uvicorn**: ASGI server for running FastAPI applications
243
+ - **Python 3.10**: Programming language
244
+ - **Docker**: Containerization for deployment
245
+ - **Hugging Face Spaces**: Deployment platform (Docker runtime, CPU-only environment)
246
+
247
+ ### 4.2 Core Language Models
248
+
249
+ #### 4.2.1 Expert Model: Qwen/Qwen1.5-1.8B
250
+ - **Model**: `Qwen/Qwen1.5-1.8B` (via Hugging Face Transformers)
251
+ - **Purpose**: Primary agricultural Q&A and conversation
252
+ - **Specialization**: **Finetuned/specialized** for Nigerian agricultural context through:
253
+ - Custom system prompts focused on Nigerian farming practices
254
+ - Domain-specific training data integration
255
+ - Response formatting optimized for agricultural advice
256
+ - **Optimization**:
257
+ - Lazy loading via `model_manager` (loads on first use)
258
+ - CPU-optimized inference (float32, device_map="cpu")
259
+ - Reduced token limits to prevent over-generation
260
+
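The lazy-loading behavior described above (load on first use, then reuse) can be sketched with a small thread-safe wrapper. `LazyResource` is a hypothetical name, and the loader is left abstract: in this project it would wrap whatever call loads the Qwen weights, which the overview does not show.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyResource(Generic[T]):
    """Defer an expensive load until first use, then cache the result."""

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader
        self._value: Optional[T] = None
        self._loaded = False
        self._lock = threading.Lock()

    def get(self) -> T:
        # Double-checked locking: concurrent first requests trigger one load only.
        if not self._loaded:
            with self._lock:
                if not self._loaded:
                    self._value = self._loader()
                    self._loaded = True
        return self._value  # type: ignore[return-value]
```

The trade-off is the one noted under limitations: startup stays fast and memory is only used when needed, but the first request pays the loading cost.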
261
+ #### 4.2.2 Gemini Models (Google AI)
262
+ - **google-genai**: Official Python client for Google's Gemini API
263
+ - **gemini-3-flash-preview**: Used for soil analysis
264
+ - **gemini-2.5-flash**: Used for disease detection and live voice interaction
265
+ - **API Version**: v1alpha for advanced features (disease detection, live voice)
266
+
267
+ ### 4.3 Retrieval-Augmented Generation (RAG)
268
+
269
+ - **LangChain**: Framework for building LLM applications
270
+ - **LangChain Community**: Community integrations and tools
271
+ - **SentenceTransformers**:
272
+ - Model: `paraphrase-multilingual-MiniLM-L12-v2`
273
+ - Purpose: Text embeddings for semantic search
274
+ - **FAISS (Facebook AI Similarity Search)**:
275
+ - Vector similarity-search library, used here as the vector store for semantic retrieval
276
+ - Two indices: Static (general knowledge) and Live (current updates)
277
+ - **APScheduler**: Background job scheduler for periodic RAG updates
278
+
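To illustrate how the two indices are used, the sketch below routes a query to the live or static index and returns the top-k most similar documents. It is deliberately a pure-Python, brute-force cosine search standing in for FAISS, with toy vectors standing in for SentenceTransformers embeddings; the function names and the `"live_update"` intent label are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Index = List[Tuple[str, Vector]]  # (document text, embedding)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Vector, index: Index, k: int = 3) -> List[str]:
    """Brute-force nearest neighbors (a stand-in for a FAISS search)."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def retrieve(query_vec: Vector, intent: str, static_index: Index,
             live_index: Index, k: int = 3) -> List[str]:
    """Use the live index for live-update intents, the static index otherwise."""
    index = live_index if intent == "live_update" else static_index
    return top_k(query_vec, index, k)
```

A real deployment would replace `top_k` with a search against the corresponding FAISS index; the routing logic stays the same.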
279
+ ### 4.4 Language Processing
280
+
281
+ - **FastText**:
282
+ - Model: `facebook/fasttext-language-identification`
283
+ - Purpose: Language detection (English, Igbo, Hausa, Yoruba)
284
+ - **NLLB (No Language Left Behind)**:
285
+ - Model: `drrobot9/nllb-ig-yo-ha-finetuned`
286
+ - Purpose: Translation between English and Nigerian languages (Hausa, Igbo, Yoruba)
287
+ - Bidirectional translation support
288
+
289
+ ### 4.5 External APIs & Data Sources
290
+
291
+ - **WeatherAPI**:
292
+ - Provides state-level weather data for Nigerian states
293
+ - Real-time weather information integration
294
+ - **AgroNigeria / HarvestPlus**:
295
+ - Agricultural news feeds for RAG updates
296
+ - News scraping and processing
297
+
298
+ ### 4.6 Additional Libraries
299
+
300
+ - **transformers**: Hugging Face library for loading and using transformer models
301
+ - **torch**: PyTorch (CPU-optimized version)
302
+ - **numpy**: Numerical computing
303
+ - **requests**: HTTP library for API calls
304
+ - **beautifulsoup4**: Web scraping for news aggregation
305
+ - **python-multipart**: File upload support for FastAPI
306
+ - **python-dotenv**: Environment variable management
307
+
308
+ ---
309
+
310
+ ## 5. Threat Level & Safety Policy
311
+
312
+ ### 5.1 Domain Scope
313
+ - **Plant and animal diseases only** – **NOT human health**
314
+ - Focuses on agricultural and veterinary contexts
315
+ - Does not provide medical advice for humans
316
+
317
+ ### 5.2 Threat Level Categories
318
+
319
+ #### Low
320
+ - **Definition**: Mild or early-stage issue, unlikely to cause major losses if addressed soon
321
+ - **Characteristics**:
322
+ - Localized symptoms
323
+ - Slow progression
324
+ - Easily manageable with standard treatments
325
+ - **Example**: Minor leaf spots, early nutrient deficiency
326
+
327
+ #### Moderate
328
+ - **Definition**: Noticeable risk that can reduce yield/health if not treated
329
+ - **Characteristics**:
330
+ - Moderate spread or impact
331
+ - Requires timely intervention
332
+ - Can cause economic losses if ignored
333
+ - **Example**: Moderate pest infestation, developing fungal infection
334
+
335
+ #### High
336
+ - **Definition**: Serious or fast-spreading issue that can cause major losses or death
337
+ - **Characteristics**:
338
+ - Rapid spread or severe symptoms
339
+ - High potential for significant economic impact
340
+ - May require immediate professional intervention
341
+ - **Example**: Severe bacterial blight, fast-spreading viral disease
342
+ - **Usage Caution**: Only assigned when signs are **clearly severe** or fast-spreading
343
+
344
+ #### Uncertain
345
+ - **Definition**: Insufficient or ambiguous data; model cannot safely rate risk
346
+ - **Characteristics**:
347
+ - Unclear symptoms
348
+ - Multiple possible diagnoses
349
+ - Poor image quality or vague description
350
+ - **Usage**: Encouraged when model is not confident – **better to be uncertain than wrong**
351
+
352
+ ### 5.3 Accuracy & Caution Approach
353
+
354
+ **Threat Level Assessment:**
355
+ - Based **only** on image + description – **no lab tests or physical examination**
356
+ - Prompts instruct Gemini to be **conservative and cautious**
357
+ - Model encouraged to use `Uncertain` when not clearly sure
358
+ - Final responses always embed a strong "consult professionals" reminder
359
+
360
+ **Professional Consultation Reminder:**
361
+ - Backend **always appends** disclaimer to disease detection responses
362
+ - Frontends should visually emphasize: "This is not a medical/veterinary/agronomic diagnosis"
363
+ - System is a **decision-support tool**, not a definitive diagnostic engine
364
+
365
+ **Important Note:**
366
+ > **This system is a decision-support tool, not a definitive diagnosis engine.**
367
+ > All disease/threat outputs must be treated as preliminary guidance only.
368
+ > Farmers should always consult qualified professionals for critical decisions.
369
+
370
+ ---
371
+
372
+ ## 6. Limitations & Issues Faced
373
+
374
+ ### 6.1 Diagnostic Limitations
375
+
376
+ #### Input Quality Dependencies
377
+ - **Image Quality**: Blurry, poorly lit, or low-resolution images reduce accuracy
378
+ - **Description Clarity**: Vague or incomplete symptom descriptions limit diagnostic precision
379
+ - **Context Missing**: Lack of field history, crop variety, or environmental conditions affects recommendations
380
+
381
+ #### Inherent Limitations
382
+ - **No Physical Examination**: Cannot inspect internal plant structures or perform lab tests
383
+ - **No Real-Time Monitoring**: Cannot track disease progression over time
384
+ - **Regional Variations**: Some regional diseases may be under-represented in training data
385
+ - **Seasonal Factors**: Disease presentation may vary by season, which may not always be captured
386
+
387
+ ### 6.2 Language & Translation Challenges
388
+
389
+ #### Translation Accuracy
390
+ - **NLLB Limitations**: Can misread slang, mixed-language (e.g., Pidgin + Hausa), or regional dialects
391
+ - **Technical Terminology**: Agricultural terms may not have direct translations, leading to approximations
392
+ - **Context Loss**: Subtle meaning can be lost across translation steps (user language → English → user language)
393
+
394
+ #### Language Detection
395
+ - **FastText Edge Cases**: May misclassify mixed-language inputs or code-switching
396
+ - **Dialect Variations**: Regional variations within languages may not be fully captured
397
+
398
+ ### 6.3 Model Behavior Issues
399
+
400
+ #### Hallucination Risk
401
+ - **Qwen/Gemini Limitations**: Can generate confident but incorrect answers
402
+ - **Mitigations Applied**:
403
+ - Stricter system prompts with domain restrictions
404
+ - Shorter output limits (400 tokens for general, 256 for weather)
405
+ - Response cleaning to remove example continuations
406
+ - Topic redirection for unrelated questions
407
+ - **Not Bulletproof**: Hallucination can still occur, especially for edge cases
408
+
409
+ #### Response Drift
410
+ - **Off-Topic Continuations**: Models may continue with example conversations or unrelated content
411
+ - **Mitigation**: Response cleaning logic removes "Human: / Assistant:" patterns and unrelated content
412
+
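A minimal sketch of the response-cleaning mitigation, assuming the goal is to keep only the model's first answer and cut everything from the next dialogue-style turn marker onward. `clean_response` is a hypothetical helper; the project's actual cleaning logic is not shown in this overview and may handle more patterns.

```python
import re

# A line that starts a new dialogue turn, e.g. "Human: ..." or "Assistant: ...".
_TURN_MARKER = re.compile(r"^\s*(Human|Assistant)\s*:", re.IGNORECASE | re.MULTILINE)


def clean_response(raw: str) -> str:
    """Strip a leading "Assistant:" label and drop any continued example dialogue."""
    text = raw.strip()
    # Remove a role label at the very start of the answer, if present.
    text = re.sub(r"^\s*Assistant\s*:\s*", "", text, flags=re.IGNORECASE)
    # Truncate at the first remaining turn marker.
    match = _TURN_MARKER.search(text)
    if match:
        text = text[:match.start()]
    return text.strip()
```

One known limitation of this approach: a legitimate answer line that begins with "Assistant:" or "Human:" would also be cut.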
413
+ ### 6.4 Latency & Compute Constraints
414
+
415
+ #### First-Request Latency
416
+ - **Model Loading**: First Qwen/NLLB call is slower due to model + weights loading on CPU
417
+ - **Cold Start**: ~5-10 seconds for first request after deployment
418
+ - **Subsequent Requests**: Faster due to cached models in memory
419
+
420
+ #### CPU-Only Environment
421
+ - **Inference Speed**: CPU inference is slower than GPU (acceptable for Hugging Face Spaces CPU tier)
422
+ - **Memory Constraints**: Limited RAM requires careful model management (lazy loading, model caching)
423
+
424
+ ### 6.5 External Dependencies
425
+
426
+ #### WeatherAPI Issues
427
+ - **Outages**: WeatherAPI downtime affects weather-related responses
428
+ - **Rate Limits**: API quota limits may restrict frequent requests
429
+ - **Data Accuracy**: Weather data quality depends on third-party provider
430
+
431
+ #### News Source Reliability
432
+ - **Scraping Fragility**: News sources may change HTML structure, breaking scrapers
433
+ - **Update Frequency**: RAG updates are scheduled; failures can cause stale information
434
+ - **Content Quality**: News article quality and relevance vary
435
+
436
+ ### 6.6 RAG & Data Freshness
437
+
438
+ #### Update Scheduling
439
+ - **Periodic Updates**: RAG indices updated on schedule (not real-time)
440
+ - **Job Failures**: If update job fails, index can lag behind real-world events
441
+ - **Index Rebuilding**: Full index rebuilds can be time-consuming
442
+
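Since the live index is refreshed on a schedule and a failed job leaves it stale, one possible safeguard is to check the index age before using it. The sketch below is a suggestion rather than existing behavior: `is_index_stale`, `pick_index_name`, and the idea of falling back to the static index when the live one is too old are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_index_stale(last_updated: datetime, max_age: timedelta,
                   now: Optional[datetime] = None) -> bool:
    """True if the index was last refreshed more than `max_age` ago."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated > max_age


def pick_index_name(wants_live: bool, last_updated: datetime, max_age: timedelta,
                    now: Optional[datetime] = None) -> str:
    """Choose "live" only when it was requested and is still fresh."""
    if wants_live and not is_index_stale(last_updated, max_age, now):
        return "live"
    return "static"
```

The same check could instead drive a "this information may be out of date" notice in the response.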
443
+ #### Vectorstore Limitations
444
+ - **Embedding Quality**: Semantic search quality depends on embedding model performance
445
+ - **Retrieval Accuracy**: Retrieved documents may not always be most relevant
446
+ - **Context Window**: Limited context window may truncate important information
447
+
448
+ ### 6.7 Deployment & Infrastructure
449
+
450
+ #### Hugging Face Spaces Constraints
451
+ - **CPU-Only**: No GPU acceleration available
452
+ - **Memory Limits**: Limited RAM requires optimization (lazy loading, model size reduction)
453
+ - **Build Time**: Docker builds can be slow, especially with large dependencies
454
+ - **Cold Starts**: Spaces may spin down after inactivity, causing cold start delays
455
+
456
+ #### Docker Build Issues
457
+ - **Dependency Conflicts**: Some Python packages may conflict (e.g., pyaudio requiring system libraries)
458
+ - **Build Timeouts**: Long build times may cause deployment failures
459
+ - **Cache Management**: Docker layer caching can be inconsistent
460
+
461
+ ---
462
+
463
+ ## 7. Recommended UX & Safety Reminders
464
+
465
+ ### 7.1 Visual Disclaimers
466
+
467
+ **Always display a clear banner near disease/soil results:**
468
+
469
+ > "⚠️ **This is AI-generated guidance. Always confirm with a local agronomist, veterinary doctor, or agricultural extension officer before taking major actions.**"
470
+
471
+ ### 7.2 Threat Level Display
472
+
473
+ - **Visual Highlighting**: Display threat level prominently with color coding:
474
+ - 🟢 **Low**: Green
475
+ - 🟡 **Moderate**: Yellow
476
+ - 🔴 **High**: Red
477
+ - ⚪ **Uncertain**: Gray
478
+ - **Tooltips**: Provide explanations for each threat level
479
+ - **Always Pair with Disclaimer**: Never show threat level without the professional consultation reminder
480
+
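The color coding above could be kept in one place and always bundled with the disclaimer, so a frontend cannot render a threat level on its own. This is an illustrative sketch: `threat_display` and the returned field names are hypothetical, and unknown levels are mapped to `Uncertain` as a cautious default.

```python
from typing import Dict

THREAT_BADGES = {
    "Low": ("🟢", "green"),
    "Moderate": ("🟡", "yellow"),
    "High": ("🔴", "red"),
    "Uncertain": ("⚪", "gray"),
}

DISCLAIMER = (
    "This is AI-generated guidance. Always confirm with a local agronomist, "
    "veterinary doctor, or agricultural extension officer before taking major actions."
)


def threat_display(level: str) -> Dict[str, str]:
    """Return display metadata for a threat level, always paired with the disclaimer."""
    normalized = level.strip().capitalize()
    if normalized not in THREAT_BADGES:
        normalized = "Uncertain"  # unknown values are shown as Uncertain
    emoji, color = THREAT_BADGES[normalized]
    return {
        "level": normalized,
        "emoji": emoji,
        "color": color,
        "disclaimer": DISCLAIMER,
    }
```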
481
+ ### 7.3 Call-to-Action Buttons
482
+
483
+ Provide quick access to professional help:
484
+ - **"Contact an Extension Officer"** button/link
485
+ - **"Find a Vet/Agronomist Near You"** button/link
486
+ - **"Schedule a Consultation"** option (if available)
487
+
488
+ ### 7.4 Response Quality Indicators
489
+
490
+ - Show **confidence indicators** when available (e.g., "High confidence" vs "Uncertain")
491
+ - Display **input quality warnings** (e.g., "Image quality may affect accuracy")
492
+ - Provide **feedback mechanisms** for users to report incorrect diagnoses
493
+
494
+ ### 7.5 Language Support
495
+
496
+ - Clearly indicate **detected language** in responses
497
+ - Provide **language switcher** for users to change language preference
498
+ - Show **translation quality warnings** if translation may be approximate
499
+
500
+ ---
501
+
502
+ ## 8. System Summary
503
+
504
+ ### 8.1 Problem Addressed
505
+
506
+ Nigerian smallholder farmers face critical challenges:
507
+ - **Limited access to agricultural experts** (agronomists, veterinarians)
508
+ - **Language barriers** (most resources in English, farmers speak Hausa/Igbo/Yoruba)
509
+ - **Fragmented information sources** (weather, soil, disease data scattered)
510
+ - **Time-sensitive decision making** (disease outbreaks, weather changes, pest attacks)
511
+
512
+ ### 8.2 Solution Provided
513
+
514
+ TerraSyncra combines multiple AI technologies to provide:
515
+ - **Fast, 24/7 AI-powered responses** in multiple languages
516
+ - **Integrated intelligence**:
517
+ - **Finetuned Qwen 1.8B** expert model for agricultural Q&A
518
+ - **Gemini 3/2.5 Flash** for soil analysis and disease detection
519
+ - **RAG + Weather + News** for live, contextual information
520
+ - **CPU-optimized, multilingual backend** (FastAPI on Hugging Face Spaces)
521
+ - **Multiple input modalities**: Text, voice, and image support
522
+
523
+ ### 8.3 Safety & Professional Consultation
524
+
525
+ **Every disease assessment includes:**
526
+ - Explicit **Threat level** (Low / Moderate / High / Uncertain)
527
+ - Clear **professional consultation reminder**
528
+ - Emphasis that threat levels are **estimates**, not definitive diagnoses
529
+
530
+ ### 8.4 Key Technologies
531
+
532
+ - **Expert Model**: Qwen/Qwen1.5-1.8B (finetuned for Nigerian agriculture)
533
+ - **Gemini Models**: gemini-3-flash-preview (soil), gemini-2.5-flash (disease, voice)
534
+ - **RAG**: LangChain + FAISS + SentenceTransformers
535
+ - **Language Processing**: FastText (detection) + NLLB (translation)
536
+ - **Backend**: FastAPI + Uvicorn + Docker
537
+ - **Deployment**: Hugging Face Spaces (CPU-optimized)
538
+
539
+ ### 8.5 Developer & Credits
540
+
541
+ **Developer**: Ifeanyi Amogu Shalom
542
+ **Intended Users**: Farmers, agronomists, agricultural extension officers, and agricultural support workers in Nigeria and similar contexts
543
+
544
+ ---
545
+
546
+ ## 9. Future Improvements & Roadmap
547
+
548
+ ### 9.1 Potential Enhancements
549
+
550
+ - **Model Fine-tuning**: Further fine-tune Qwen on Nigerian agricultural datasets
551
+ - **Multi-modal RAG**: Integrate images into RAG for visual similarity search
552
+ - **Offline Mode**: Support for offline operation in areas with poor connectivity
553
+ - **Mobile App**: Native mobile applications for better user experience
554
+ - **Expert Network Integration**: Direct connection to network of agronomists/veterinarians
555
+ - **Historical Tracking**: Track disease progression and treatment outcomes over time
556
+
557
+ ### 9.2 Technical Improvements
558
+
559
+ - **Response Caching**: Cache common queries to reduce latency
560
+ - **Model Quantization**: Further optimize models for CPU inference
561
+ - **Better Error Handling**: More robust error messages and fallback mechanisms
562
+ - **Monitoring & Analytics**: Track system performance and user feedback
563
+
564
+ ---
565
+
566
+ **Last Updated**: 2026
567
+ **Version**: 1.0
568
+ **Status**: Production (Hugging Face Spaces)
app/agents/disease_agent.py CHANGED
@@ -30,17 +30,22 @@ DISEASE_SYSTEM_PROMPT = """
30
  You are a multilingual agricultural disease expert fluent in Igbo, Hausa, Yoruba, and English.
31
  You specialize in identifying and diagnosing plant and animal diseases common in Nigerian and African agriculture.
32
 
33
- When analyzing images or voice descriptions:
34
- 1. Identify the disease or condition (if visible/described)
35
- 2. Provide the scientific and common name
36
- 3. Explain symptoms visible in the image or described
37
- 4. Assess severity if possible
38
- 5. Provide treatment recommendations
39
- 6. Suggest preventive measures
40
- 7. Consider local context (Nigerian climate, common crops/livestock)
41
 
42
- Respond naturally in the language the user uses, or provide translations in all four languages if asked.
43
- Be clear, practical, and provide actionable advice for farmers.
44
  """
45
 
46
 
@@ -88,6 +93,15 @@ def classify_disease_from_image(image_bytes: bytes, image_mime_type: str = "imag
88
 
89
  classification_text = response.text if hasattr(response, 'text') else str(response)
90
 
91
  logging.info("Disease classification from image completed successfully")
92
 
93
  return {
@@ -146,6 +160,15 @@ def classify_disease_from_text(text_description: str, language: str = "en") -> D
146
 
147
  classification_text = response.text if hasattr(response, 'text') else str(response)
148
 
149
  logging.info("Disease classification from text completed successfully")
150
 
151
  return {
 
30
  You are a multilingual agricultural disease expert fluent in Igbo, Hausa, Yoruba, and English.
31
  You specialize in identifying and diagnosing plant and animal diseases common in Nigerian and African agriculture.
32
 
33
+ When analyzing images or voice descriptions, your answer must be **brief and direct**:
34
+ 1. Name the most likely disease (scientific + common name) in 1 short line.
35
+ 2. State an overall **Threat level: Low / Moderate / High / Uncertain** (be cautious; only use High when signs are clearly severe or fast‑spreading).
36
+ 3. List 2–3 key symptoms you see or infer.
37
+ 4. Give 2–3 clear treatment steps (bullets).
38
+ 5. Give 1–2 simple prevention tips.
39
 
40
+ Safety:
41
+ - Always remind the farmer to **consult a local agronomist, veterinary doctor, or agricultural extension officer** for confirmation and a full treatment plan.
42
+ - If you are not confident, set threat level to **Uncertain** and say the farmer must see a professional.
43
+
44
+ Style:
45
+ - Respond naturally in the user's language (or all four languages if asked).
46
+ - Use **short sentences and bullet points**.
47
+ - Avoid long explanations, stories, or extra examples.
48
+ - Focus only on what the farmer needs to do next.
49
  """
50
 
51
 
 
93
 
94
  classification_text = response.text if hasattr(response, 'text') else str(response)
95
 
+ # Always append a professional advice reminder
+ disclaimer = (
+     "\n\nIMPORTANT: This threat level is an estimate based only on the image and description. "
+     "For an accurate diagnosis and treatment plan, please consult a qualified agronomist, "
+     "veterinary doctor, or local agricultural extension officer."
+ )
+ if "IMPORTANT: This threat level is an estimate" not in classification_text:
+     classification_text = classification_text.strip() + disclaimer
104
+
105
  logging.info("Disease classification from image completed successfully")
106
 
107
  return {
 
160
 
161
  classification_text = response.text if hasattr(response, 'text') else str(response)
162
 
+ # Always append a professional advice reminder
+ disclaimer = (
+     "\n\nIMPORTANT: This threat level is an estimate based only on your description. "
+     "For an accurate diagnosis and treatment plan, please consult a qualified agronomist, "
+     "veterinary doctor, or local agricultural extension officer."
+ )
+ if "IMPORTANT: This threat level is an estimate" not in classification_text:
+     classification_text = classification_text.strip() + disclaimer
171
+
172
  logging.info("Disease classification from text completed successfully")
173
 
174
  return {
app/agents/live_voice_agent.py CHANGED
@@ -37,11 +37,17 @@ You specialize in:
37
  3. General farming advice
38
  4. Weather-related agricultural guidance
39
 
40
- When the user speaks to you, respond naturally in the language they used, or provide translations in all four languages if asked.
41
- You can also see images; if an image is provided, classify it and describe it in the context of agricultural disease detection or farming advice.
42
 
43
- Be clear, practical, and provide actionable advice for farmers.
44
- Use simple language with occasional emojis to make responses friendly and accessible.
45
  """
46
 
47
 
 
37
  3. General farming advice
38
  4. Weather-related agricultural guidance
39
 
40
+ When the user speaks to you:
41
+ - Respond naturally in the language they used (or all four languages if they ask).
42
+ - Keep answers **brief, clear, and straight to the point**.
43
+ - Focus on **what the farmer should do next** (2–4 key steps).
44
 
45
+ If an image is provided, classify it and describe it in the context of agricultural disease detection or farming advice, again in a short, direct way.
46
+
47
+ Style:
48
+ - Use simple language farmers can easily understand.
49
+ - Prefer short sentences and bullet points over long paragraphs.
50
+ - Avoid long stories, unrelated examples, or extra small talk.
51
  """
52
 
53
 
app/agents/soil_agent.py CHANGED
@@ -26,19 +26,19 @@ except Exception as e:
26
 
27
  SOIL_SYSTEM_PROMPT = """
28
  You are an expert soil scientist and agronomist specializing in Nigerian and African agricultural soils.
29
- Your role is to analyze soil reports and field data to provide comprehensive, actionable soil analysis.
30
 
31
- When analyzing soil data, consider:
32
- 1. Soil composition (pH, nitrogen, phosphorus, potassium, organic matter, etc.)
33
- 2. Soil texture and structure
34
- 3. Nutrient deficiencies or excesses
35
- 4. Recommendations for crop suitability
36
- 5. Fertilizer recommendations
37
- 6. Soil improvement strategies
38
- 7. Regional context (Nigerian states, climate, typical crops)
39
 
40
- Provide clear, practical advice in simple language that farmers can understand.
41
- Include specific recommendations with quantities where applicable.
42
  """
43
 
44
 
 
26
 
27
  SOIL_SYSTEM_PROMPT = """
28
  You are an expert soil scientist and agronomist specializing in Nigerian and African agricultural soils.
29
+ Your role is to analyze soil reports and field data to provide **brief, direct, and actionable** soil advice.
30
 
31
+ When analyzing soil data, focus on the most important points:
32
+ 1. Current soil condition (very short summary)
33
+ 2. Key nutrient issues (deficiencies or excesses)
34
+ 3. 1–3 best crops for this soil
35
+ 4. Clear fertilizer and amendment recommendations
36
+ 5. Simple soil improvement steps farmers can act on immediately
37
 
38
+ Style:
39
+ - Keep answers **short and to the point** (no long essays).
40
+ - Use simple language farmers can easily understand.
41
+ - Prefer short paragraphs or bullet points over long text.
42
  """
43
 
44