title: Multilingual Twi Health Information System
emoji: π
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 4.3.0
python_version: '3.10'
app_file: app.py
Multilingual Twi Health Information System
A semantic search system that allows users to ask health-related questions in any language (Twi, Ga, Ewe, Hausa, English, etc.) and receive relevant answers in Twi.
Key Features
- Multilingual Input: Ask questions in any language
- Twi Answers: All responses are provided in Twi
- Semantic Search: Uses E5-Multilingual-Large embeddings for accurate matching
- Fast Retrieval: FAISS-powered search across millions of Q&A pairs
- Three-Paragraph Format: Presents the top 3 most relevant answers
How It Works
- User Input: Enter a question in any supported language
- Embedding: The question is encoded with the E5-Multilingual-Large model
- Search: FAISS finds the 3 stored English questions most similar to the query
- Response: The corresponding Twi answers are presented as 3 paragraphs
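The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the signature of search_answers (taking the model, index, and metadata as arguments), the load_resources helper, and the assumption that metadata is a list of {"question", "answer"} records are all hypothetical. It also assumes the stored embeddings were L2-normalized when the index was built.

```python
from typing import Any, Dict, List, Sequence


def search_answers(
    query: str,
    model: Any,
    index: Any,
    metadata: Sequence[Dict[str, str]],
    top_k: int = 3,
) -> List[Dict[str, Any]]:
    """Return the top_k stored Q&A pairs closest to the query."""
    # E5 models expect a "query: " prefix on search queries.
    vectors = model.encode(["query: " + query.strip()], normalize_embeddings=True)
    # FAISS returns one row of scores and one row of ids per query vector.
    scores, ids = index.search(vectors, top_k)
    results = []
    for score, idx in zip(scores[0], ids[0]):
        if idx < 0:  # FAISS pads with -1 when fewer than top_k hits exist
            continue
        row = metadata[int(idx)]  # assumed layout: metadata[i] matches vector i
        results.append(
            {"question": row["question"], "answer": row["answer"], "score": float(score)}
        )
    return results


def load_resources(index_path: str = "faiss_index.bin"):
    """Load the model and index (requires sentence-transformers and faiss)."""
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("intfloat/multilingual-e5-large")
    index = faiss.read_index(index_path)
    return model, index
```

Passing the model and index in as arguments keeps the search logic easy to test with stand-in objects, and the heavy imports are only triggered when load_resources is called.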
Quick Start
For Users
Simply visit the Space and:
- Type your question in any language
- Click "Hwehwɛ | Search"
- Receive 3 relevant answers in Twi
- Optional: Enable "Show matched questions" to see source questions
For Developers
Step 1: Create Embeddings (Google Colab)
# Upload your CSV with columns: question, answer
# Run the embedding creation script
# Download: faiss_index.bin, metadata.json, config.json
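The embedding creation script itself is not shown here, so the following is only a sketch of what such a script might look like. The function names (parse_pairs, to_passages, build_index) and the layout of metadata.json as a list of question/answer records are assumptions. The contents of config.json are not specified in this README, so the sketch does not write it.

```python
import csv
import json
from typing import Dict, Iterable, List


def parse_pairs(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse CSV text (header: question,answer) into a list of records."""
    return [
        {"question": row["question"], "answer": row["answer"]}
        for row in csv.DictReader(lines)
    ]


def to_passages(pairs: List[Dict[str, str]]) -> List[str]:
    """Add the E5 "passage: " prefix to each stored English question."""
    return ["passage: " + pair["question"] for pair in pairs]


def build_index(
    csv_path: str,
    index_path: str = "faiss_index.bin",
    metadata_path: str = "metadata.json",
) -> None:
    """Embed all questions and write the FAISS index plus metadata."""
    import faiss
    from sentence_transformers import SentenceTransformer

    with open(csv_path, newline="", encoding="utf-8") as handle:
        pairs = parse_pairs(handle)

    model = SentenceTransformer("intfloat/multilingual-e5-large")
    # Normalized vectors let an inner-product index return cosine similarity.
    embeddings = model.encode(
        to_passages(pairs), normalize_embeddings=True, batch_size=32
    )

    index = faiss.IndexFlatIP(embeddings.shape[1])  # flat inner-product index
    index.add(embeddings)
    faiss.write_index(index, index_path)

    # Row i of the metadata corresponds to vector i in the index.
    with open(metadata_path, "w", encoding="utf-8") as handle:
        json.dump(pairs, handle, ensure_ascii=False)
```

In a Colab notebook this would be run as build_index("your_file.csv"), after which the output files can be downloaded.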
Step 2: Deploy to Hugging Face Spaces
- Create a new Gradio Space
- Upload required files:
  - app.py
  - requirements.txt
  - faiss_index.bin
  - metadata.json
  - config.json
- Space will automatically build and deploy
File Structure
.
├── app.py # Main Gradio application
├── requirements.txt # Python dependencies
├── faiss_index.bin # FAISS index (English questions)
├── metadata.json # Question-answer pairs
├── config.json # Model configuration
└── README.md # This file
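A Space that is missing one of these files may fail at startup. As a small, optional safeguard, a check along these lines could run before the model and index are loaded. The missing_files helper is hypothetical and not part of app.py.

```python
from pathlib import Path
from typing import List, Union

# Files the Space is expected to contain, per the layout above.
REQUIRED_FILES = (
    "app.py",
    "requirements.txt",
    "faiss_index.bin",
    "metadata.json",
    "config.json",
)


def missing_files(root: Union[str, Path] = ".") -> List[str]:
    """Return the required files that are not present under root."""
    base = Path(root)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Calling missing_files() at the top of the app and raising a clear error when the list is non-empty makes a failed deployment easier to diagnose than a stack trace from deep inside the loading code.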
Technical Details
Model Architecture
- Embedding Model: intfloat/multilingual-e5-large
- Embedding Dimension: 1024
- Supported Languages: 100+ languages including:
  - Twi, Ga, Ewe, Hausa
  - English, French, Arabic
  - And many more
Search Configuration
- Similarity Metric: Cosine Similarity (Inner Product)
- Index Type: FAISS Flat Index
- Top-K Results: 3 (configurable)
- Encoding Strategy:
  - Questions: "passage:" prefix
  - User queries: "query:" prefix
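The pairing of "Cosine Similarity (Inner Product)" works because the inner product of two L2-normalized vectors equals their cosine similarity. The short, dependency-free sketch below illustrates that identity and the two E5 prefixes. The helper names are illustrative only, and it assumes embeddings are normalized both at indexing time and at query time.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


def e5_input(text: str, kind: str) -> str:
    """Prefix text the way E5 models expect: "query: " or "passage: "."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return f"{kind}: {text}"


a, b = [3.0, 4.0], [4.0, 3.0]
# After normalization, the plain inner product matches cosine similarity.
assert math.isclose(
    inner_product(l2_normalize(a), l2_normalize(b)), cosine_similarity(a, b)
)
```

If the stored vectors were not normalized, an inner-product index would rank by raw dot product instead, which favors longer vectors.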
Performance
- Search Speed: < 100ms for millions of records
- Accuracy: State-of-the-art multilingual semantic matching
- Scalability: Handles millions of Q&A pairs
Data Format
Input CSV
question,answer
"Alcohol while breastfeeding?","Nufu yɛ papa ma wo ba..."
"Ambulance service number?","Ɛfa awoɔ mu ahohiahia..."
Requirements
- Columns: question (English), answer (Twi)
- Format: UTF-8 encoded CSV
- Size: Tested with 3M+ rows
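Before spending time on embedding a large file, it can help to confirm that it meets these requirements. The following is a minimal validation sketch using only the standard library. The function names and the specific checks (required columns, no empty cells) are assumptions about what is worth verifying, not part of the project.

```python
import csv
from typing import Iterable, List

REQUIRED_COLUMNS = ("question", "answer")


def validate_rows(lines: Iterable[str]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]

    problems = []
    for row in reader:
        for column in REQUIRED_COLUMNS:
            # Short rows yield None for the missing cells.
            if not (row[column] or "").strip():
                problems.append(f"line {reader.line_num}: empty {column}")
    return problems


def validate_csv(path: str) -> List[str]:
    """Validate a CSV file; raises UnicodeDecodeError if it is not UTF-8."""
    with open(path, newline="", encoding="utf-8") as handle:
        return validate_rows(handle)
```

Because the file is read row by row, this check also works on inputs too large to load into memory at once.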
Example Queries
| Language | Question | Result |
|---|---|---|
| English | "What should I do about alcohol while breastfeeding?" | Finds relevant answer |
| Twi | "Dɛn na menyɛ fa alcohol ho wɔ nufunom bere mu?" | Finds relevant answer |
| Ga | "Mɛnya ambulance service frɛ nɔma no?" | Finds relevant answer |
Privacy & Security
- No user data is stored
- All processing happens in real-time
- No tracking or analytics
- Open-source and transparent
Customization
Adjust Number of Results
# In app.py, modify:
results = search_answers(query, top_k=5) # Change from 3 to 5
Change Response Format
# Modify format_response() function in app.py
# Customize how answers are presented
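The body of format_response() is not reproduced in this README, so the version below is only a guess at its shape, offered as a starting point for customization. It assumes each result is a dict with "question" and "answer" keys, and the show_questions flag is a hypothetical parameter mirroring the "Show matched questions" option.

```python
from typing import Dict, Sequence


def format_response(
    results: Sequence[Dict[str, str]], show_questions: bool = False
) -> str:
    """Join the retrieved Twi answers into blank-line-separated paragraphs."""
    if not results:
        return "No matching answer found."
    paragraphs = []
    for result in results:
        text = result["answer"].strip()
        if show_questions:
            # Put the matched source question on the line above its answer.
            text = f"[{result['question'].strip()}]\n{text}"
        paragraphs.append(text)
    return "\n\n".join(paragraphs)
```

Changing the separator or the per-answer template here is enough to switch to, say, a numbered list or a Markdown layout.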
Add New Languages
The system automatically supports 100+ languages through E5-Multilingual-Large. No configuration needed!
Performance Optimization
For large datasets:
- Use Git LFS for faiss_index.bin (>100MB)
- Consider quantized FAISS indices for faster search
- Enable caching for frequently asked questions
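One simple way to cache frequently asked questions is to wrap the search function with functools.lru_cache. The sketch below assumes a search function that takes (query, top_k) and returns a list. The make_cached_search wrapper and the whitespace-only normalization of the cache key are illustrative choices, not existing code in app.py.

```python
from functools import lru_cache
from typing import Callable


def make_cached_search(
    search_fn: Callable[[str, int], list], maxsize: int = 1024
) -> Callable[..., list]:
    """Wrap search_fn so repeated queries skip embedding and FAISS lookup."""

    @lru_cache(maxsize=maxsize)
    def _cached(normalized_query: str, top_k: int) -> tuple:
        # Store an immutable tuple so cached results cannot be resized by callers.
        return tuple(search_fn(normalized_query, top_k))

    def cached_search(query: str, top_k: int = 3) -> list:
        # Collapse whitespace so trivially different inputs share one entry.
        key = " ".join(query.split())
        return list(_cached(key, top_k))

    cached_search.cache_info = _cached.cache_info  # expose hit/miss statistics
    return cached_search
```

The cache lives in process memory, so it is cleared whenever the Space restarts, and the result items themselves are shared between calls and should be treated as read-only.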
Contributing
Contributions are welcome! Areas for improvement:
- Add more Q&A pairs
- Improve answer formatting
- Add audio input/output
- Implement feedback mechanism
License
[Add your license here]
Acknowledgments
- Model: E5-Multilingual-Large by Microsoft
- Framework: Sentence Transformers, FAISS, Gradio
- Data: [Add your data source]
Support
For issues or questions:
- Open an issue in the Space discussions
- Contact: [Your contact information]
Medaase! | Thank you!