Spaces:
Sleeping
TerraSyncra AI β Product & System Overview
1. Product Introduction
TerraSyncra is a multilingual agricultural intelligence agent designed specifically for Nigerian (and African) farmers. It provides comprehensive agricultural support through AI-powered assistance.
Key Capabilities:
- Agricultural Q&A: Answers questions about crops, livestock, soil, weather, pests, and diseases in multiple languages
- Soil Analysis: Provides expert soil health assessments from lab reports and field data using Gemini 3 Flash
- Disease Detection: Identifies plant and animal diseases from images, text descriptions, or voice input using Gemini 2.5 Flash
- Live Agricultural Updates: Delivers real-time weather information and agricultural news through RAG (Retrieval-Augmented Generation)
- Live Voice Interaction: Supports real-time voice conversations via WebSocket in local languages (Igbo, Hausa, Yoruba, English)
Developer: Ifeanyi Amogu Shalom
Target Users: Farmers, agronomists, agricultural extension officers, and agricultural support workers in Nigeria and similar contexts
2. Problem Statement
Nigerian smallholder farmers face significant challenges:
2.1 Limited Access to Agricultural Experts
- Scarcity of agronomists and veterinarians relative to the large farming population
- Geographic barriers preventing farmers from accessing expert advice
- High consultation costs that many smallholder farmers cannot afford
- Long waiting times for professional consultations, especially during critical periods (disease outbreaks, planting seasons)
2.2 Language Barriers
- Most agricultural information and resources are in English, while many farmers primarily speak Hausa, Igbo, or Yoruba
- Technical terminology is not easily accessible in local languages
- Translation services are often unavailable or unreliable
2.3 Fragmented Information Sources
- Weather data, soil reports, disease information, and market prices are scattered across different platforms
- No unified system to integrate and interpret multiple data sources
- Information overload without proper context or prioritization
2.4 Time-Sensitive Decision Making
- Disease outbreaks require immediate identification and treatment
- Weather changes affect planting, harvesting, and irrigation decisions
- Pest attacks can devastate crops if not addressed quickly
- Delayed responses lead to significant economic losses
2.5 Solution Approach
TerraSyncra addresses these challenges by providing:
- Fast, AI-powered responses available 24/7
- Multilingual support (English, Igbo, Hausa, Yoruba)
- Integrated intelligence combining expert models, RAG, and live data
- Accessible interface via text, voice, and image inputs
- Professional consultation reminders to ensure farmers seek expert confirmation when needed
3. System Architecture & Request Flows
3.1 General Agricultural Q&A β POST /ask
Step-by-Step Process:
Input Reception
- User sends
query(text) with optionalsession_idfor conversation continuity
- User sends
Language Detection
- FastText model (
facebook/fasttext-language-identification) detects input language - Supports: English, Igbo, Hausa, Yoruba
- FastText model (
Translation (if needed)
- If language β English, translates to English using NLLB (
drrobot9/nllb-ig-yo-ha-finetuned) - Preserves original language for back-translation
- If language β English, translates to English using NLLB (
Intent Detection
- Classifies query into categories:
- Weather question: Requests weather information (with/without Nigerian state)
- Live update: Requests current agricultural news or updates
- Normal question: General agricultural Q&A
- Low confidence: Falls back to RAG when intent is unclear
- Classifies query into categories:
Context Building
- Weather intent: Calls WeatherAPI for state-specific weather data, embeds summary into context
- Live update intent: Queries live FAISS vectorstore index for latest agricultural documents
- Low confidence: Falls back to static FAISS index for safer, more general responses
Conversation Memory
- Loads per-session history from
MemoryStore(TTL cache, 1-hour expiration) - Trims to
MAX_HISTORY_MESSAGES(default: 30) to prevent context overflow
- Loads per-session history from
Expert Model Generation
- Uses Qwen/Qwen1.5-1.8B (finetuned for Nigerian agriculture)
- Loaded lazily via
model_manager(CPU-optimized, first-use loading) - Builds chat messages: system prompt + conversation history + current user message + context
- System prompt restricts responses to agriculture/farming topics only
- Generates bounded-length answer (reduced token limit: 400 tokens for general, 256 for weather)
- Cleans response to remove any "Human: / Assistant:" style example continuations
Back-Translation
- If original language β English, translates answer back to user's language using NLLB
Response
- Returns JSON:
{ query, answer, session_id, detected_language }
- Returns JSON:
Safety & Focus:
- System prompt enforces agriculture-only topic handling
- Unrelated questions are redirected back to farming topics
- Response cleaning prevents off-topic example continuations
3.2 Soil Analysis β POST /analyze-soil
Step-by-Step Process:
Input Reception
report_data: Text description of soil report or lab results (required)- Optional fields:
location,crop_type,field_size,previous_crops,additional_notes
Agent Processing
soil_agent.analyze_soil()builds comprehensive prompt with:- Soil report data
- Field information (location, crop type, size, history)
- Regional context (Nigerian states, climate patterns)
Gemini API Call
- Model:
GEMINI_SOIL_MODEL = "gemini-3-flash-preview" - Prompt style: Brief, direct, actionable
- Focuses on:
- Current soil condition (short summary)
- Key nutrient issues (deficiencies or excesses)
- 1β3 best crops for this soil type
- Clear fertilizer and amendment recommendations
- Simple soil improvement steps
- Model:
Output
- JSON response:
{ success, analysis, model_used }
- JSON response:
Important Note:
Soil analysis is advisory only β not a formal agronomy diagnosis. The UI should encourage farmers to confirm with a local agronomist or extension officer for critical decisions.
3.3 Disease Detection
3.3.1 Image-Based Detection β POST /detect-disease-image
Step-by-Step Process:
Input Reception
- Image file (JPEG, PNG, etc.)
- Optional
query: Text description or question
Agent Processing
disease_agent.classify_disease_from_image()processes:- Image bytes + MIME type
- User query (if provided)
- Builds structured prompt for Gemini
Gemini API Call
- Model:
GEMINI_DISEASE_MODEL = "gemini-2.5-flash" - Prompt instructs Gemini to provide:
- Disease name (scientific + common name) in 1 short line
- Threat level: Low / Moderate / High / Uncertain (MANDATORY)
- 2β3 key symptoms visible in image
- 2β3 clear treatment steps (bullets)
- 1β2 simple prevention tips
- Brief, direct language with short sentences
- Model:
Backend Safety Enforcement
- Backend always appends disclaimer:
"IMPORTANT: This threat level is an estimate based only on the image/description. For an accurate diagnosis and treatment plan, please consult a qualified agronomist, veterinary doctor, or local agricultural extension officer."
- Backend always appends disclaimer:
Output
- JSON response:
{ success, classification, model_used, input_type }
- JSON response:
3.3.2 Text/Voice-Based Detection β POST /detect-disease-text
Step-by-Step Process:
Input Reception
description: Text description of disease symptoms or conditionlanguage: Language code (en, ig, ha, yo)
Agent Processing
disease_agent.classify_disease_from_text()processes:- Text description
- Language context
- Builds structured prompt for Gemini
Gemini API Call
- Same model and prompt structure as image-based detection
- Threat level assessment based on described symptoms
Backend Safety Enforcement
- Same disclaimer appended as image-based detection
Output
- JSON response:
{ success, classification, model_used, input_type }
- JSON response:
Threat Level Guidelines:
- Low: Mild or early-stage issue, unlikely to cause major losses if addressed soon
- Moderate: Noticeable risk that can reduce yield/health if not treated
- High: Serious or fast-spreading issue that can cause major losses or death (use cautiously, only when clearly severe)
- Uncertain: Insufficient or ambiguous data; model cannot safely rate risk (encouraged when not confident)
3.4 Live Voice Interaction β WS /live-voice & POST /live-voice-start
Step-by-Step Process:
WebSocket Connection
- Client connects to
/live-voiceendpoint - Optional: Send image as JSON (base64 encoded) at session start
- Audio chunks streamed as raw PCM bytes (16kHz, mono, 16-bit)
- Client connects to
Agent Processing
live_voice_agent.handle_live_voice_websocket()manages:- WebSocket connection lifecycle
- Image context (if provided)
- Audio streaming to Gemini Live API
- Audio response streaming back to client
Gemini Live API
- Model:
gemini-2.5-flashvia Gemini Live API - System prompt: Brief, clear, focused on "what to do next" (2β4 key steps)
- Supports: Disease detection, soil analysis, general farming, weather
- Prefers short sentences and bullet points
- Model:
Response Streaming
- Audio responses streamed back as PCM bytes
- Optional JSON messages for status/transcripts
Safety Expectations
- Same professional advice principle applies
- Frontends should display clear "not a replacement for a professional" banner
4. Technologies Used
4.1 Backend Framework & Infrastructure
- FastAPI: Modern Python web framework for building REST APIs and WebSocket endpoints
- Uvicorn: ASGI server for running FastAPI applications
- Python 3.10: Programming language
- Docker: Containerization for deployment
- Hugging Face Spaces: Deployment platform (Docker runtime, CPU-only environment)
4.2 Core Language Models
4.2.1 Expert Model: Qwen/Qwen1.5-1.8B
- Model:
Qwen/Qwen1.5-1.8B(via Hugging Face Transformers) - Purpose: Primary agricultural Q&A and conversation
- Specialization: Finetuned/specialized for Nigerian agricultural context through:
- Custom system prompts focused on Nigerian farming practices
- Domain-specific training data integration
- Response formatting optimized for agricultural advice
- Optimization:
- Lazy loading via
model_manager(loads on first use) - CPU-optimized inference (float32, device_map="cpu")
- Reduced token limits to prevent over-generation
- Lazy loading via
4.2.2 Gemini Models (Google AI)
- google-genai: Official Python client for Google's Gemini API
- gemini-3-flash-preview: Used for soil analysis
- gemini-2.5-flash: Used for disease detection and live voice interaction
- API Version: v1alpha for advanced features (disease detection, live voice)
4.3 Retrieval-Augmented Generation (RAG)
- LangChain: Framework for building LLM applications
- LangChain Community: Community integrations and tools
- SentenceTransformers:
- Model:
paraphrase-multilingual-MiniLM-L12-v2 - Purpose: Text embeddings for semantic search
- Model:
- FAISS (Facebook AI Similarity Search):
- Vector database for efficient similarity search
- Two indices: Static (general knowledge) and Live (current updates)
- APScheduler: Background job scheduler for periodic RAG updates
4.4 Language Processing
- FastText:
- Model:
facebook/fasttext-language-identification - Purpose: Language detection (English, Igbo, Hausa, Yoruba)
- Model:
- NLLB (No Language Left Behind):
- Model:
drrobot9/nllb-ig-yo-ha-finetuned - Purpose: Translation between English and Nigerian languages (Hausa, Igbo, Yoruba)
- Bidirectional translation support
- Model:
4.5 External APIs & Data Sources
- WeatherAPI:
- Provides state-level weather data for Nigerian states
- Real-time weather information integration
- AgroNigeria / HarvestPlus:
- Agricultural news feeds for RAG updates
- News scraping and processing
4.6 Additional Libraries
- transformers: Hugging Face library for loading and using transformer models
- torch: PyTorch (CPU-optimized version)
- numpy: Numerical computing
- requests: HTTP library for API calls
- beautifulsoup4: Web scraping for news aggregation
- python-multipart: File upload support for FastAPI
- python-dotenv: Environment variable management
5. Threat Level & Safety Policy
5.1 Domain Scope
- Plant and animal diseases only β NOT human health
- Focuses on agricultural and veterinary contexts
- Does not provide medical advice for humans
5.2 Threat Level Categories
Low
- Definition: Mild or early-stage issue, unlikely to cause major losses if addressed soon
- Characteristics:
- Localized symptoms
- Slow progression
- Easily manageable with standard treatments
- Example: Minor leaf spots, early nutrient deficiency
Moderate
- Definition: Noticeable risk that can reduce yield/health if not treated
- Characteristics:
- Moderate spread or impact
- Requires timely intervention
- Can cause economic losses if ignored
- Example: Moderate pest infestation, developing fungal infection
High
- Definition: Serious or fast-spreading issue that can cause major losses or death
- Characteristics:
- Rapid spread or severe symptoms
- High potential for significant economic impact
- May require immediate professional intervention
- Example: Severe bacterial blight, fast-spreading viral disease
- Usage Caution: Only assigned when signs are clearly severe or fast-spreading
Uncertain
- Definition: Insufficient or ambiguous data; model cannot safely rate risk
- Characteristics:
- Unclear symptoms
- Multiple possible diagnoses
- Poor image quality or vague description
- Usage: Encouraged when model is not confident β better to be uncertain than wrong
5.3 Accuracy & Caution Approach
Threat Level Assessment:
- Based only on image + description β no lab tests or physical examination
- Prompts instruct Gemini to be conservative and cautious
- Model encouraged to use
Uncertainwhen not clearly sure - Final responses always embed a strong "consult professionals" reminder
Professional Consultation Reminder:
- Backend always appends disclaimer to disease detection responses
- Frontends should visually emphasize: "This is not a medical/veterinary/agronomic diagnosis"
- System is a decision-support tool, not a definitive diagnostic engine
Important Note:
This system is a decision-support tool, not a definitive diagnosis engine.
All disease/threat outputs must be treated as preliminary guidance only.
Farmers should always consult qualified professionals for critical decisions.
6. Limitations & Issues Faced
6.1 Diagnostic Limitations
Input Quality Dependencies
- Image Quality: Blurry, poorly lit, or low-resolution images reduce accuracy
- Description Clarity: Vague or incomplete symptom descriptions limit diagnostic precision
- Context Missing: Lack of field history, crop variety, or environmental conditions affects recommendations
Inherent Limitations
- No Physical Examination: Cannot inspect internal plant structures or perform lab tests
- No Real-Time Monitoring: Cannot track disease progression over time
- Regional Variations: Some regional diseases may be under-represented in training data
- Seasonal Factors: Disease presentation may vary by season, which may not always be captured
6.2 Language & Translation Challenges
Translation Accuracy
- NLLB Limitations: Can misread slang, mixed-language (e.g., Pidgin + Hausa), or regional dialects
- Technical Terminology: Agricultural terms may not have direct translations, leading to approximations
- Context Loss: Subtle meaning can be lost across translation steps (user language β English β user language)
Language Detection
- FastText Edge Cases: May misclassify mixed-language inputs or code-switching
- Dialect Variations: Regional variations within languages may not be fully captured
6.3 Model Behavior Issues
Hallucination Risk
- Qwen/Gemini Limitations: Can generate confident but incorrect answers
- Mitigations Applied:
- Stricter system prompts with domain restrictions
- Shorter output limits (400 tokens for general, 256 for weather)
- Response cleaning to remove example continuations
- Topic redirection for unrelated questions
- Not Bulletproof: Hallucination can still occur, especially for edge cases
Response Drift
- Off-Topic Continuations: Models may continue with example conversations or unrelated content
- Mitigation: Response cleaning logic removes "Human: / Assistant:" patterns and unrelated content
6.4 Latency & Compute Constraints
First-Request Latency
- Model Loading: First Qwen/NLLB call is slower due to model + weights loading on CPU
- Cold Start: ~5-10 seconds for first request after deployment
- Subsequent Requests: Faster due to cached models in memory
CPU-Only Environment
- Inference Speed: CPU inference is slower than GPU (acceptable for Hugging Face Spaces CPU tier)
- Memory Constraints: Limited RAM requires careful model management (lazy loading, model caching)
6.5 External Dependencies
WeatherAPI Issues
- Outages: WeatherAPI downtime affects weather-related responses
- Rate Limits: API quota limits may restrict frequent requests
- Data Accuracy: Weather data quality depends on third-party provider
News Source Reliability
- Scraping Fragility: News sources may change HTML structure, breaking scrapers
- Update Frequency: RAG updates are scheduled; failures can cause stale information
- Content Quality: News article quality and relevance vary
6.6 RAG & Data Freshness
Update Scheduling
- Periodic Updates: RAG indices updated on schedule (not real-time)
- Job Failures: If update job fails, index can lag behind real-world events
- Index Rebuilding: Full index rebuilds can be time-consuming
Vectorstore Limitations
- Embedding Quality: Semantic search quality depends on embedding model performance
- Retrieval Accuracy: Retrieved documents may not always be most relevant
- Context Window: Limited context window may truncate important information
6.7 Deployment & Infrastructure
Hugging Face Spaces Constraints
- CPU-Only: No GPU acceleration available
- Memory Limits: Limited RAM requires optimization (lazy loading, model size reduction)
- Build Time: Docker builds can be slow, especially with large dependencies
- Cold Starts: Spaces may spin down after inactivity, causing cold start delays
Docker Build Issues
- Dependency Conflicts: Some Python packages may conflict (e.g., pyaudio requiring system libraries)
- Build Timeouts: Long build times may cause deployment failures
- Cache Management: Docker layer caching can be inconsistent
7. Recommended UX & Safety Reminders
7.1 Visual Disclaimers
Always display a clear banner near disease/soil results:
"β οΈ This is AI-generated guidance. Always confirm with a local agronomist, veterinary doctor, or agricultural extension officer before taking major actions."
7.2 Threat Level Display
- Visual Highlighting: Display threat level prominently with color coding:
- π’ Low: Green
- π‘ Moderate: Yellow
- π΄ High: Red
- βͺ Uncertain: Gray
- Tooltips: Provide explanations for each threat level
- Always Pair with Disclaimer: Never show threat level without the professional consultation reminder
7.3 Call-to-Action Buttons
Provide quick access to professional help:
- "Contact an Extension Officer" button/link
- "Find a Vet/Agronomist Near You" button/link
- "Schedule a Consultation" option (if available)
7.4 Response Quality Indicators
- Show confidence indicators when available (e.g., "High confidence" vs "Uncertain")
- Display input quality warnings (e.g., "Image quality may affect accuracy")
- Provide feedback mechanisms for users to report incorrect diagnoses
7.5 Language Support
- Clearly indicate detected language in responses
- Provide language switcher for users to change language preference
- Show translation quality warnings if translation may be approximate
8. System Summary
8.1 Problem Addressed
Nigerian smallholder farmers face critical challenges:
- Limited access to agricultural experts (agronomists, veterinarians)
- Language barriers (most resources in English, farmers speak Hausa/Igbo/Yoruba)
- Fragmented information sources (weather, soil, disease data scattered)
- Time-sensitive decision making (disease outbreaks, weather changes, pest attacks)
8.2 Solution Provided
TerraSyncra combines multiple AI technologies to provide:
- Fast, 24/7 AI-powered responses in multiple languages
- Integrated intelligence:
- Finetuned Qwen 1.8B expert model for agricultural Q&A
- Gemini 3/2.5 Flash for soil analysis and disease detection
- RAG + Weather + News for live, contextual information
- CPU-optimized, multilingual backend (FastAPI on Hugging Face Spaces)
- Multiple input modalities: Text, voice, and image support
8.3 Safety & Professional Consultation
Every disease assessment includes:
- Explicit Threat level (Low / Moderate / High / Uncertain)
- Clear professional consultation reminder
- Emphasis that threat levels are estimates, not definitive diagnoses
8.4 Key Technologies
- Expert Model: Qwen/Qwen1.5-1.8B (finetuned for Nigerian agriculture)
- Gemini Models: gemini-3-flash-preview (soil), gemini-2.5-flash (disease, voice)
- RAG: LangChain + FAISS + SentenceTransformers
- Language Processing: FastText (detection) + NLLB (translation)
- Backend: FastAPI + Uvicorn + Docker
- Deployment: Hugging Face Spaces (CPU-optimized)
8.5 Developer & Credits
Developer: Ifeanyi Amogu Shalom
Intended Users: Farmers, agronomists, agricultural extension officers, and agricultural support workers in Nigeria and similar contexts
9. Future Improvements & Roadmap
9.1 Potential Enhancements
- Model Fine-tuning: Further fine-tune Qwen on Nigerian agricultural datasets
- Multi-modal RAG: Integrate images into RAG for visual similarity search
- Offline Mode: Support for offline operation in areas with poor connectivity
- Mobile App: Native mobile applications for better user experience
- Expert Network Integration: Direct connection to network of agronomists/veterinarians
- Historical Tracking: Track disease progression and treatment outcomes over time
9.2 Technical Improvements
- Response Caching: Cache common queries to reduce latency
- Model Quantization: Further optimize models for CPU inference
- Better Error Handling: More robust error messages and fallback mechanisms
- Monitoring & Analytics: Track system performance and user feedback
Last Updated: 2026
Version: 1.0
Status: Production (Hugging Face Spaces)