| # Government Schemes RAG API Documentation (Multilingual) | |
| ## Overview | |
| FastAPI-based REST API for querying Indian Government Schemes using Retrieval-Augmented Generation (RAG) with **support for 13+ Indian languages**. | |
| ## Base URL | |
| ``` | |
| http://127.0.0.1:8000 | |
| ``` | |
| ## Key Features | |
| - ✅ Multilingual support (13+ Indian languages) | |
| - ✅ Automatic translation (Input & Output) | |
| - ✅ Text-to-Speech capability (optional) | |
| - ✅ RAG-powered intelligent search | |
| - ✅ 3400+ government schemes database | |
| ## API Endpoints | |
| ### 1. Root Endpoint | |
| **GET /** | |
| Returns API information, version, and supported languages. | |
| **Response:** | |
| ```json | |
| { | |
| "message": "Government Schemes RAG API with Multilingual Support", | |
| "version": "2.0.0", | |
| "supported_languages": { | |
| "en": "English", | |
| "hi": "Hindi", | |
| "te": "Telugu", | |
| "ta": "Tamil", | |
| "ml": "Malayalam", | |
| "kn": "Kannada", | |
| "bn": "Bengali", | |
| "mr": "Marathi", | |
| "gu": "Gujarati", | |
| "pa": "Punjabi", | |
| "ur": "Urdu", | |
| "or": "Odia", | |
| "as": "Assamese" | |
| }, | |
| "endpoints": { | |
| "POST /query": "Query government schemes with translation support", | |
| "GET /states": "Get list of Indian states", | |
| "GET /languages": "Get list of supported languages", | |
| "GET /health": "Health check" | |
| } | |
| } | |
| ``` | |
| --- | |
| ### 2. Health Check | |
| **GET /health** | |
| Checks whether the API and the RAG system are running properly. | |
| **Response:** | |
| ```json | |
| { | |
| "status": "healthy", | |
| "rag_system": "initialized" | |
| } | |
| ``` | |
| --- | |
| ### 3. Get Supported Languages | |
| **GET /languages** | |
| Returns the list of all supported languages for translation. | |
| **Response:** | |
| ```json | |
| { | |
| "languages": { | |
| "en": "English", | |
| "hi": "Hindi", | |
| "te": "Telugu", | |
| "ta": "Tamil", | |
| "ml": "Malayalam", | |
| "kn": "Kannada", | |
| "bn": "Bengali", | |
| "mr": "Marathi", | |
| "gu": "Gujarati", | |
| "pa": "Punjabi", | |
| "ur": "Urdu", | |
| "or": "Odia", | |
| "as": "Assamese" | |
| } | |
| } | |
| ``` | |
| --- | |
| ### 4. Get States | |
| **GET /states** | |
| Returns the list of all Indian states and union territories. | |
| **Response:** | |
| ```json | |
| { | |
| "states": [ | |
| "All States", | |
| "Andhra Pradesh", | |
| "Arunachal Pradesh", | |
| ... | |
| ] | |
| } | |
| ``` | |
| --- | |
| ### 5. Query Schemes (with Multilingual Support) | |
| **POST /query** | |
| Query government schemes in any supported language. The API automatically translates the input to English, processes it through the RAG system, and returns the answer in the requested language. | |
| **Request Body:** | |
| ```json | |
| { | |
| "question": "స్కాలర్షిప్ల గురించి చెప్పండి", | |
| "state": "Telangana", | |
| "language": "te" | |
| } | |
| ``` | |
| - `question`: the question text, in any supported language | |
| - `state`: optional state filter | |
| - `language`: language code (default: `"en"`) | |
| **Response:** | |
| ```json | |
| { | |
| "answer": "తెలంగాణలో అందుబాటులో ఉన్న స్కాలర్షిప్ల గురించి...", | |
| "sources": [ | |
| "Scheme Name: Pre-Matric Scholarship for Backward Class Students...", | |
| "Scheme Name: Post-Matric Scholarship Scheme...", | |
| "Scheme Name: Merit-cum-Means Scholarship..." | |
| ] | |
| } | |
| ``` | |
| **Note:** Audio is NOT automatically generated. Use the `/generate-audio` endpoint when the user clicks the speaker button. | |
| **Translation Flow:** | |
| ``` | |
| User Question (Telugu) → Translate to English → RAG Processing → | |
| English Answer → Translate to Telugu → Return to User | |
| ``` | |
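The flow above can be sketched in Python roughly as follows. This is a minimal illustration, not the code in `app.py`: `answer_query`, `translate`, and `run_rag` are hypothetical names, and the translator and RAG step are passed in as callables so the sketch runs offline with stubs.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: translate(text, source_language, target_language) -> str
Translator = Callable[[str, str, str], str]
# Hypothetical signature: run_rag(english_question, state) -> (english_answer, sources)
RagRunner = Callable[[str, str], Tuple[str, List[str]]]


def answer_query(question: str, language: str, state: str,
                 translate: Translator, run_rag: RagRunner) -> Dict[str, object]:
    """Translate in, run RAG in English, translate out."""
    # Step 1: translate the question to English unless it already is English.
    english_question = question if language == "en" else translate(question, language, "en")
    # Step 2: retrieval and generation operate on English text only.
    english_answer, sources = run_rag(english_question, state)
    # Step 3: translate the answer back into the requested language.
    answer = english_answer if language == "en" else translate(english_answer, "en", language)
    return {"answer": answer, "sources": sources}


# Stub implementations so the example runs without network access.
def fake_translate(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"


def fake_rag(question: str, state: str) -> Tuple[str, List[str]]:
    return f"Answer for: {question} ({state})", ["Scheme Name: Example"]


result = answer_query("స్కాలర్షిప్ల గురించి చెప్పండి", "te", "Telangana",
                      fake_translate, fake_rag)
print(result["answer"])
```

English requests skip both translation steps, which is why they avoid the translation overhead.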
| --- | |
| ### 6. Generate Audio (On-Demand) | |
| **POST /generate-audio** | |
| Generate audio from text. This endpoint should be called ONLY when the user clicks the "Play Audio" or speaker button on the UI. | |
| **Request Body:** | |
| ```json | |
| { | |
| "text": "తెలంగాణలో అందుబాటులో ఉన్న స్కాలర్షిప్ల గురించి...", | |
| "language": "te" | |
| } | |
| ``` | |
| - `text`: the text to convert to speech (must not be empty) | |
| - `language`: language code (default: `"en"`) | |
| **Response:** | |
| ```json | |
| { | |
| "audio": "base64_encoded_mp3_audio_data" | |
| } | |
| ``` | |
| **Usage Flow:** | |
| ``` | |
| 1. User submits question → Receive answer (fast, no audio) | |
| 2. User clicks speaker button → Call /generate-audio → Play audio | |
| ``` | |
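For non-browser clients, the base64 string in the `audio` field has to be decoded before it can be played or saved. A minimal sketch, assuming the response shape shown above; `save_audio` is a hypothetical helper and the payload here is a three-byte placeholder rather than real MP3 data:

```python
import base64
import tempfile
from pathlib import Path


def save_audio(response_json: dict, out_path: str) -> int:
    """Decode the base64 `audio` field and write it to disk; returns bytes written."""
    audio_bytes = base64.b64decode(response_json["audio"])
    return Path(out_path).write_bytes(audio_bytes)


# Stand-in payload: a real response carries MP3 data, this one carries placeholder bytes.
fake_response = {"audio": base64.b64encode(b"ID3").decode("ascii")}
out_file = Path(tempfile.gettempdir()) / "answer.mp3"
written = save_audio(fake_response, str(out_file))
print(f"wrote {written} bytes to {out_file}")
```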
| **Error Response (400 - Empty Text):** | |
| ```json | |
| { | |
| "detail": "Text cannot be empty" | |
| } | |
| ``` | |
| **Error Response (400 - Unsupported Language):** | |
| ```json | |
| { | |
| "detail": "Unsupported language. Supported: ['en', 'hi', 'te', 'ta', ...]" | |
| } | |
| ``` | |
| **Error Response for `/query` (400 - Empty Question):** | |
| ```json | |
| { | |
| "detail": "Question cannot be empty" | |
| } | |
| ``` | |
| **Error Response for `/query` (400 - Unsupported Language):** | |
| ```json | |
| { | |
| "detail": "Unsupported language. Supported: ['en', 'hi', 'te', 'ta', ...]" | |
| } | |
| ``` | |
| **Error Response for `/query` (500):** | |
| ```json | |
| { | |
| "detail": "Error processing query: [error message]" | |
| } | |
| ``` | |
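All of the error bodies above share a `detail` field, so a client can turn them into user-facing messages in one place. The helper below is a sketch; `describe_error` and its wording are hypothetical, not part of the API:

```python
def describe_error(status_code: int, payload: dict) -> str:
    """Map an error response (status code + JSON body) to a readable message."""
    detail = payload.get("detail", "Unknown error")
    if status_code == 400:
        # Empty question/text or unsupported language code.
        return f"Invalid request: {detail}"
    if status_code >= 500:
        # Processing failure on the server side.
        return f"Server error: {detail}"
    return f"Unexpected status {status_code}: {detail}"


print(describe_error(400, {"detail": "Question cannot be empty"}))

# With the `requests` library this would typically be used as:
#   resp = requests.post(url, json=body)
#   if not resp.ok:
#       print(describe_error(resp.status_code, resp.json()))
```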
| --- | |
| ## Interactive API Documentation | |
| FastAPI automatically generates interactive API documentation: | |
| - **Swagger UI**: http://127.0.0.1:8000/docs | |
| - **ReDoc**: http://127.0.0.1:8000/redoc | |
| These interfaces allow you to: | |
| - View all endpoints | |
| - See request/response schemas | |
| - Test API calls directly from the browser | |
| - Download OpenAPI specification | |
| --- | |
| ## Usage Examples | |
| ### Using cURL | |
| ```bash | |
| # Health check | |
| curl http://127.0.0.1:8000/health | |
| # Get supported languages | |
| curl http://127.0.0.1:8000/languages | |
| # Get states | |
| curl http://127.0.0.1:8000/states | |
| # Query in English | |
| curl -X POST http://127.0.0.1:8000/query \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "question": "What scholarships are available for SC students?", | |
| "state": "Karnataka", | |
| "language": "en" | |
| }' | |
| # Query in Hindi | |
| curl -X POST http://127.0.0.1:8000/query \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "question": "छात्रवृत्ति के बारे में बताएं", | |
| "language": "hi" | |
| }' | |
| # Query in Telugu | |
| curl -X POST http://127.0.0.1:8000/query \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "question": "స్కాలర్షిప్ల గురించి చెప్పండి", | |
| "state": "Telangana", | |
| "language": "te" | |
| }' | |
| # Generate audio (when user clicks speaker button) | |
| curl -X POST http://127.0.0.1:8000/generate-audio \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "text": "తెలంగాణలో అందుబాటులో ఉన్న స్కాలర్షిప్లు...", | |
| "language": "te" | |
| }' | |
| ``` | |
| ### Using Python requests | |
| ```python | |
| import requests | |
| # Query in English | |
| response = requests.post( | |
| "http://127.0.0.1:8000/query", | |
| json={ | |
| "question": "My daughter is studying in 9th standard. What schemes are applicable?", | |
| "state": "Maharashtra", | |
| "language": "en" | |
| } | |
| ) | |
| data = response.json() | |
| print(data["answer"]) | |
| # Query in Hindi | |
| response_hindi = requests.post( | |
| "http://127.0.0.1:8000/query", | |
| json={ | |
| "question": "मुझे छात्रवृत्ति चाहिए", | |
| "language": "hi" | |
| } | |
| ) | |
| hindi_data = response_hindi.json() | |
| print(hindi_data["answer"]) # Answer will be in Hindi | |
| # Generate audio on-demand (when user clicks speaker button) | |
| audio_response = requests.post( | |
| "http://127.0.0.1:8000/generate-audio", | |
| json={ | |
| "text": hindi_data["answer"], | |
| "language": "hi" | |
| } | |
| ) | |
| audio_data = audio_response.json() | |
| # audio_data["audio"] contains base64 encoded MP3 | |
| ``` | |
| ### Using JavaScript fetch | |
| ```javascript | |
| // Query in English | |
| const response = await fetch('http://127.0.0.1:8000/query', { | |
| method: 'POST', | |
| headers: { | |
| 'Content-Type': 'application/json', | |
| }, | |
| body: JSON.stringify({ | |
| question: 'What schemes are available for girl child education?', | |
| state: 'All States', | |
| language: 'en' | |
| }) | |
| }); | |
| const data = await response.json(); | |
| console.log(data.answer); | |
| // Query in Telugu | |
| const responseTelugu = await fetch('http://127.0.0.1:8000/query', { | |
| method: 'POST', | |
| headers: { | |
| 'Content-Type': 'application/json', | |
| }, | |
| body: JSON.stringify({ | |
| question: 'బాలికల విద్య కోసం ఏ పథకాలు ఉన్నాయి?', | |
| language: 'te' | |
| }) | |
| }); | |
| const teluguData = await responseTelugu.json(); | |
| console.log(teluguData.answer); // Answer in Telugu | |
| // Generate audio when user clicks speaker button | |
| const playAudio = async (text, language) => { | |
| const audioResponse = await fetch('http://127.0.0.1:8000/generate-audio', { | |
| method: 'POST', | |
| headers: { | |
| 'Content-Type': 'application/json', | |
| }, | |
| body: JSON.stringify({ | |
| text: text, | |
| language: language | |
| }) | |
| }); | |
| const audioData = await audioResponse.json(); | |
| const audio = new Audio(`data:audio/mpeg;base64,${audioData.audio}`); | |
| audio.play(); | |
| }; | |
| // Usage: Call when user clicks speaker button | |
| // playAudio(teluguData.answer, 'te'); | |
| ``` | |
| ### Using React (Frontend Integration) | |
| ```jsx | |
| import React, { useState } from 'react'; | |
| function SchemeQuery() { | |
| const [language, setLanguage] = useState('en'); | |
| const [question, setQuestion] = useState(''); | |
| const [answer, setAnswer] = useState(''); | |
| const [audioLoading, setAudioLoading] = useState(false); | |
| const handleSubmit = async (e) => { | |
| e.preventDefault(); | |
| const response = await fetch('http://127.0.0.1:8000/query', { | |
| method: 'POST', | |
| headers: { 'Content-Type': 'application/json' }, | |
| body: JSON.stringify({ | |
| question: question, | |
| language: language | |
| }) | |
| }); | |
| const data = await response.json(); | |
| setAnswer(data.answer); | |
| }; | |
| // Called only when user clicks speaker button | |
| const playAudio = async () => { | |
| if (!answer) return; | |
| setAudioLoading(true); | |
| try { | |
| const response = await fetch('http://127.0.0.1:8000/generate-audio', { | |
| method: 'POST', | |
| headers: { 'Content-Type': 'application/json' }, | |
| body: JSON.stringify({ | |
| text: answer, | |
| language: language | |
| }) | |
| }); | |
| const data = await response.json(); | |
| const audio = new Audio(`data:audio/mpeg;base64,${data.audio}`); | |
| audio.play(); | |
| } catch (error) { | |
| console.error('Audio generation failed:', error); | |
| } finally { | |
| setAudioLoading(false); | |
| } | |
| }; | |
| return ( | |
| <div> | |
| <select value={language} onChange={e => setLanguage(e.target.value)}> | |
| <option value="en">English</option> | |
| <option value="hi">Hindi</option> | |
| <option value="te">Telugu</option> | |
| <option value="ta">Tamil</option> | |
| </select> | |
| <form onSubmit={handleSubmit}> | |
| <input | |
| value={question} | |
| onChange={e => setQuestion(e.target.value)} | |
| placeholder="Ask your question..." | |
| /> | |
| <button type="submit">Ask</button> | |
| </form> | |
| {answer && ( | |
| <div> | |
| <p>{answer}</p> | |
| <button onClick={playAudio} disabled={audioLoading}> | |
| {audioLoading ? '⏳ Generating...' : '🔊 Play Audio'} | |
| </button> | |
| </div> | |
| )} | |
| </div> | |
| ); | |
| } | |
| ``` | |
| ### Using Postman | |
| 1. **Method**: POST | |
| 2. **URL**: `http://127.0.0.1:8000/query` | |
| 3. **Headers**: | |
| - `Content-Type: application/json` | |
| 4. **Body** (raw JSON): | |
| **English:** | |
| ```json | |
| { | |
| "question": "What are the schemes for construction workers?", | |
| "state": "Karnataka", | |
| "language": "en" | |
| } | |
| ``` | |
| **Hindi:** | |
| ```json | |
| { | |
| "question": "निर्माण श्रमिकों के लिए क्या योजनाएं हैं?", | |
| "language": "hi" | |
| } | |
| ``` | |
| **Telugu:** | |
| ```json | |
| { | |
| "question": "నిర్మాణ కార్మికులకు ఏ పథకాలు ఉన్నాయి?", | |
| "state": "Telangana", | |
| "language": "te" | |
| } | |
| ``` | |
| --- | |
| ## Running the API | |
| ### Start the Server | |
| ```bash | |
| # Activate virtual environment (Windows) | |
| .venv\Scripts\activate | |
| # On macOS/Linux: source .venv/bin/activate | |
| # Run the API | |
| python app.py | |
| ``` | |
| The server binds to `0.0.0.0:8000` (all network interfaces), so it is reachable locally at `http://127.0.0.1:8000`. | |
| ### Testing the API | |
| Run the test script: | |
| ```bash | |
| python test_api.py | |
| ``` | |
| --- | |
| ## CORS Configuration | |
| The API is configured to accept requests from any origin (`allow_origins=["*"]`). | |
| ⚠️ **For production**, update the CORS settings in `app.py`: | |
| ```python | |
| app.add_middleware( | |
| CORSMiddleware, | |
| allow_origins=["https://yourdomain.com"], # Specify allowed origins | |
| allow_credentials=True, | |
| allow_methods=["*"], | |
| allow_headers=["*"], | |
| ) | |
| ``` | |
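One way to avoid hard-coding origins is to read them from the environment. The sketch below assumes a comma-separated variable named `ALLOWED_ORIGINS`; that name and the `parse_allowed_origins` helper are hypothetical and not defined by the project:

```python
import os
from typing import List, Optional


def parse_allowed_origins(raw: Optional[str]) -> List[str]:
    """Split a comma-separated origin list, dropping blanks and surrounding spaces."""
    if not raw:
        return []
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


# ALLOWED_ORIGINS is an assumed variable name; the fallback mirrors the example above.
origins = parse_allowed_origins(os.environ.get("ALLOWED_ORIGINS", "https://yourdomain.com"))
print(origins)
```

The resulting list can then be passed as `allow_origins=origins` in the middleware call.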
| --- | |
| ## Data Source | |
| The API uses `updated_data.csv` containing **3400+ government schemes** across categories: | |
| - Education & Learning | |
| - Social Welfare & Empowerment | |
| - Health & Wellness | |
| - Business & Entrepreneurship | |
| - Women and Child | |
| - And more... | |
| --- | |
| ## Technology Stack | |
| - **Framework**: FastAPI 0.104.1 | |
| - **LLM**: Groq API (llama-3.3-70b-versatile) | |
| - **Embeddings**: HuggingFace sentence-transformers/all-MiniLM-L6-v2 | |
| - **Vector DB**: ChromaDB | |
| - **RAG Framework**: LangChain 0.1.0 | |
| - **Translation**: deep-translator 1.11.4 (Google Translate) | |
| - **Text-to-Speech**: gTTS 2.5.0 (Google Text-to-Speech) | |
| - **Server**: Uvicorn | |
| --- | |
| ## Multilingual Features | |
| ### Translation Process | |
| 1. **Input Translation**: User's question in any Indian language → English | |
| 2. **RAG Processing**: English query → Vector search → LLM inference → English answer | |
| 3. **Output Translation**: English answer → User's selected language | |
| ### Supported Language Codes | |
| | Code | Language | Code | Language | | |
| |------|----------|------|----------| | |
| | `en` | English | `ml` | Malayalam | | |
| | `hi` | Hindi | `kn` | Kannada | | |
| | `te` | Telugu | `bn` | Bengali | | |
| | `ta` | Tamil | `mr` | Marathi | | |
| | `gu` | Gujarati | `pa` | Punjabi | | |
| | `ur` | Urdu | `or` | Odia | | |
| | `as` | Assamese | | | | |
| ### Text-to-Speech (Optional) | |
| To enable audio responses, uncomment the following lines in `app.py`: | |
| ```python | |
| # Line ~280 in app.py | |
| audio_base64 = TranslationService.text_to_speech(final_answer, request.language) | |
| ``` | |
| When enabled, the API will return base64-encoded MP3 audio in the `audio` field. | |
| --- | |
| ## Testing the Multilingual API | |
| ### Using the Test Script | |
| ```bash | |
| # Make sure the server is running first | |
| python app.py | |
| # In another terminal | |
| python test_translation.py | |
| ``` | |
| The test script will: | |
| 1. Verify language endpoint | |
| 2. Test queries in English, Hindi, Telugu, Tamil, and Malayalam | |
| 3. Display translated responses | |
| ### Manual Testing Checklist | |
| - [ ] Test each supported language | |
| - [ ] Verify translations are accurate | |
| - [ ] Check source citations are included | |
| - [ ] Test with state filters | |
| - [ ] Test error handling (empty questions, invalid languages) | |
| - [ ] Verify CORS headers for frontend integration | |
| --- | |
| ## Rate Limits | |
| Currently, there are no rate limits implemented. The API uses Groq's free tier, which has its own rate limits. | |
| For production deployment, consider implementing: | |
| - Request rate limiting | |
| - Authentication/API keys | |
| - Caching for common queries | |
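As an illustration of request rate limiting, here is a small in-memory sliding-window limiter. It is a sketch under stated assumptions: `SlidingWindowLimiter` is a hypothetical class, state lives in one process only, and a multi-worker deployment would need a shared store instead.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        # `now` can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
print([limiter.allow("10.0.0.1", now=t) for t in (0.0, 1.0, 2.0, 61.0)])
# → [True, True, False, True]
```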
| --- | |
| ## Error Handling | |
| - **400 Bad Request**: Empty question, empty text, or unsupported language code | |
| - **500 Internal Server Error**: Processing error (check that `GROQ_API_KEY` is set and valid) | |
| --- | |
| ## Performance Notes | |
| - First query may take 3-5 seconds (vector search + LLM inference) | |
| - Subsequent queries are faster (~1-2 seconds) | |
| - ChromaDB is persisted to disk (./chroma_db/) for faster restarts | |
| - The 3400+ schemes are chunked into roughly 12,000-15,000 text segments | |
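To make the chunking note concrete, the sketch below splits text into overlapping fixed-size segments. It is a plain character-based stand-in, not the project's actual splitter, and the `chunk_size` and `overlap` values are illustrative assumptions:

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split `text` into chunks of up to `chunk_size` characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk reaches the end, so no fully redundant tail chunk is emitted.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=5, overlap=2))
# → ['abcde', 'defgh', 'ghij']
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of more segments per scheme.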
| --- | |
| ## Deployment | |
| ### Local Development | |
| ```bash | |
| uvicorn app:app --reload --host 0.0.0.0 --port 8000 | |
| ``` | |
| ### Production (with Gunicorn) | |
| ```bash | |
| pip install gunicorn | |
| gunicorn app:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 | |
| ``` | |
| ### Docker (Optional) | |
| Create a `Dockerfile`: | |
| ```dockerfile | |
| FROM python:3.11-slim | |
| WORKDIR /app | |
| COPY requirements.txt . | |
| RUN pip install --no-cache-dir -r requirements.txt | |
| COPY . . | |
| CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"] | |
| ``` | |
| --- | |
| ## Support | |
| For issues or questions: | |
| 1. Check API docs at `/docs` | |
| 2. Review logs in terminal | |
| 3. Verify `.env` file has valid `GROQ_API_KEY` | |
| 4. Ensure `updated_data.csv` is present | |
| --- | |
| ## Example Queries (Multilingual) | |
| ### English | |
| - "My daughter is studying in 9th standard. What schemes are applicable to her?" | |
| - "What scholarships are available for SC/ST students?" | |
| - "What are the schemes for construction workers?" | |
| - "Tell me about Beti Bachao Beti Padhao scheme" | |
| ### Hindi (हिंदी) | |
| - "मेरी बेटी 9वीं कक्षा में पढ़ती है। उसके लिए कौन सी योजनाएं हैं?" | |
| - "SC/ST छात्रों के लिए क्या छात्रवृत्ति उपलब्ध है?" | |
| - "निर्माण श्रमिकों के लिए क्या योजनाएं हैं?" | |
| - "बेटी बचाओ बेटी पढ़ाओ योजना के बारे में बताएं" | |
| ### Telugu (తెలుగు) | |
| - "నా కూతురు 9వ తరగతి చదువుతోంది. ఆమెకు ఏ పథకాలు వర్తిస్తాయి?" | |
| - "SC/ST విద్యార్థులకు ఏ స్కాలర్షిప్లు అందుబాటులో ఉన్నాయి?" | |
| - "నిర్మాణ కార్మికులకు ఏ పథకాలు ఉన్నాయి?" | |
| ### Tamil (தமிழ்) | |
| - "என் மகள் 9வது வகுப்பு படிக்கிறாள். அவளுக்கு என்ன திட்டங்கள் பொருந்தும்?" | |
| - "SC/ST மாணவர்களுக்கு என்ன உதவித்தொகை கிடைக்கும்?" | |
| ### Malayalam (മലയാളം) | |
| - "എന്റെ മകൾ 9-ാം ക്ലാസിൽ പഠിക്കുന്നു. അവൾക്ക് എന്തെല്ലാം പദ്ധതികൾ ബാധകമാണ്?" | |
| - "SC/ST വിദ്യാർത്ഥികൾക്ക് എന്ത് സ്കോളർഷിപ്പുകൾ ലഭ്യമാണ്?" | |
| --- | |
| ## Frontend Integration Guide | |
| For detailed React integration instructions, see: **MULTILINGUAL_INTEGRATION_GUIDE.md** | |
| Key points for frontend developers: | |
| 1. Always send the `language` parameter with queries | |
| 2. Backend handles ALL translation - no frontend translation needed | |
| 3. Use the Web Speech API for voice input (browser native) | |
| 4. Use the Speech Synthesis API for voice output (browser native), or call `/generate-audio` for server-generated MP3 audio | |
| 5. Display loading states during translation/query processing | |
| --- | |
| ## Performance & Optimization | |
| ### Response Times | |
| - **Translation**: ~0.5-1 second per translation | |
| - **RAG Query**: ~2-3 seconds | |
| - **Total**: ~3-5 seconds for multilingual queries (two translations plus one RAG query) | |
| - **English-only**: ~2-3 seconds (no translation overhead) | |
| ### Optimization Tips | |
| 1. **Cache translations** for common queries | |
| 2. **Lazy load audio** - only generate when user clicks "Play" | |
| 3. **Use connection pooling** for API calls | |
| 4. **Implement request debouncing** in frontend | |
| 5. **Add response caching** for identical queries | |
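Tips 1 and 5 can be combined in a small time-limited cache keyed on the normalised request. This is a sketch only: `QueryCache` is a hypothetical class, the TTL is an arbitrary choice, and the cache is per-process.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, str, str]


class QueryCache:
    """Cache query results for `ttl_seconds`, keyed on (question, state, language)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._store: Dict[CacheKey, Tuple[float, dict]] = {}

    @staticmethod
    def make_key(question: str, state: str, language: str) -> CacheKey:
        # Normalise whitespace and case so trivially different requests share an entry.
        return (" ".join(question.split()).lower(), state.strip().lower(),
                language.strip().lower())

    def get_or_compute(self, question: str, state: str, language: str,
                       compute: Callable[[], dict],
                       now: Optional[float] = None) -> dict:
        now = time.monotonic() if now is None else now
        key = self.make_key(question, state, language)
        cached = self._store.get(key)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]
        result = compute()  # e.g. the translate + RAG round trip
        self._store[key] = (now, result)
        return result


calls = {"count": 0}


def expensive_lookup() -> dict:
    calls["count"] += 1
    return {"answer": "...", "sources": []}


cache = QueryCache(ttl_seconds=300)
cache.get_or_compute("What scholarships exist?", "Karnataka", "en", expensive_lookup, now=0.0)
cache.get_or_compute("  what scholarships   exist? ", "karnataka", "EN", expensive_lookup, now=10.0)
print(calls["count"])  # → 1
```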
| ### Scaling Considerations | |
| - Translation uses the free Google Translate backend (via deep-translator) | |
| - The API applies no rate limits of its own to translation; the free upstream service may still throttle heavy traffic | |
| - Groq API has free tier limits (check console.groq.com) | |
| - Consider premium APIs for production (Azure Translator, Google Cloud Translation) | |
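When an upstream service starts throttling, retrying with exponential backoff is a common mitigation. The helper below is a generic sketch; `call_with_backoff` is a hypothetical name, and real code would catch only the specific exceptions that signal throttling rather than every `Exception`:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(func: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `func`, retrying on failure with delays of base_delay * 2**attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Give up after the final attempt and surface the original error.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("loop always returns or raises")


# Demo: a call that fails twice before succeeding; delays are recorded, not slept.
state = {"failures_left": 2}
delays = []


def flaky_translate() -> str:
    if state["failures_left"] > 0:
        state["failures_left"] -= 1
        raise RuntimeError("throttled")
    return "translated text"


print(call_with_backoff(flaky_translate, sleep=delays.append), delays)
```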