Setu
An AI-powered platform for legal assistance in Nepal - making legal documents accessible, generating official letters, and detecting bias in legal text.
Project Overview
Setu is a comprehensive legal assistance platform that leverages AI/ML to help Nepali citizens interact with legal documents and government processes. The system consists of three main modules integrated with a modern web interface.
Demo Video
Watch the platform in action: View Demo Video
Features
Module A: Law Explanation (RAG-Based Chatbot)
- Intelligent Q&A: Ask questions about Nepali laws in natural language (English/Nepali)
- Retrieval-Augmented Generation: Retrieves relevant legal text and generates accurate explanations
- Source References: Provides exact article/section references
- Vector Database: ChromaDB with semantic search capabilities
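The retrieve-then-generate flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: a toy word-count similarity stands in for the Sentence Transformers embeddings and ChromaDB search, the two-entry corpus is placeholder text rather than real law, and the function names are hypothetical, not taken from module_a.

```python
import math
import re
from collections import Counter

# Placeholder corpus (not real legal text); the real system indexes PDFs in ChromaDB.
DOCS = {
    "Sample Act, Section 1": "Every citizen is equal before the law.",
    "Sample Act, Section 2": "Every citizen has the right to basic education.",
}

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, k: int = 1) -> list[tuple[str, str]]:
    """Return the k most similar (reference, text) pairs for the question."""
    q = embed(question)
    ranked = sorted(DOCS.items(), key=lambda item: cosine(q, embed(item[1])), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Assemble the context-plus-question prompt that would be sent to the LLM."""
    context = "\n".join(f"[{ref}] {text}" for ref, text in retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Keeping the reference label next to each retrieved passage is what lets the answer cite its source section.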
Module B: Multi-Category Bias Detection
- 10+ Bias Categories: Detects gender, caste, religion, age, disability, appearance, social-status, political, and ambiguity bias, among others
- Fine-tuned DistilBERT: Custom model trained on Nepali legal texts
- Sentence Analysis: Analyzes individual sentences or batches of sentences
- Debiasing Suggestions: Provides bias-free alternatives for detected biases
- Confidence Scoring: Returns confidence scores for each detection
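A caller will typically want to act only on detections above some confidence level. The sketch below shows one way to do that; the Detection shape, its field names, and the 0.7 threshold are assumptions for illustration and are not taken from the Module B response schema.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """Hypothetical shape of one bias detection result."""
    sentence: str
    category: str      # e.g. "gender" or "caste"
    confidence: float  # assumed to lie in [0, 1]

def filter_detections(detections: list[Detection], threshold: float = 0.7) -> list[Detection]:
    """Keep detections at or above the threshold, highest confidence first."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)
```

A lower threshold surfaces more candidates for human review; a higher one reduces false alarms.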
Module C: Letter Generation
- Template-Based Generation: RAG-based intelligent template selection
- Natural Language Input: Describe your need, get the right letter
- Smart Field Extraction: Automatically extracts name, date, district, etc.
- Official Formats: Generates proper Nepali government letter formats
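Field extraction and template filling can be pictured roughly as follows. This is a simplified sketch: regular expressions stand in for whatever extraction Module C actually performs, and the template text, patterns, and function names are hypothetical.

```python
import re
from string import Template

# Hypothetical template; the real templates live in data/module-C/.
TEMPLATE = Template(
    "To: District Administration Office, $district\n"
    "Date: $date\n\n"
    "I, $name, respectfully request the document described above."
)

# Illustrative patterns only; real input would need far more robust handling.
PATTERNS = {
    "name": re.compile(r"[Mm]y name is ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"),
    "district": re.compile(r"([A-Z][a-z]+) [Dd]istrict"),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}

def extract_fields(text: str) -> dict[str, str]:
    """Return every field whose pattern matches the user's description."""
    found = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[field] = match.group(1)
    return found

def missing_fields(text: str) -> list[str]:
    """List the fields the user still has to supply."""
    found = extract_fields(text)
    return [field for field in PATTERNS if field not in found]

def fill_letter(text: str) -> str:
    """Fill the template, leaving unresolved $placeholders visible."""
    return TEMPLATE.safe_substitute(extract_fields(text))
```

Using safe_substitute keeps missing placeholders in the output instead of raising, which makes it easy to prompt the user for the remaining fields.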
Utility: PDF Processing
- Text Extraction: Extract text from legal PDFs (English & Nepali)
- Multi-method Support: PyMuPDF, pdfplumber with intelligent fallback
- OCR Ready: Handles scanned documents
- Integrated Pipeline: Direct integration with bias detection
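The multi-method fallback can be sketched as an ordered chain of extractors, where the first one to return non-empty text wins. The function below is an illustration, not the code in utility/pdf_processor.py; the extractors are passed in as plain callables, and the three stand-ins exist only to make the example runnable without PyMuPDF or pdfplumber installed.

```python
from typing import Callable, Iterable, Optional

Extractor = Callable[[str], str]

def extract_with_fallback(
    path: str, extractors: Iterable[tuple[str, Extractor]]
) -> tuple[Optional[str], str]:
    """Try each extractor in order; return (method_name, text) for the first non-empty result."""
    for name, extract in extractors:
        try:
            text = extract(path)
        except Exception:
            continue  # this method failed; fall through to the next one
        if text.strip():
            return name, text
    return None, ""  # every method failed or produced only whitespace

# Stand-in extractors for demonstration (hypothetical, not real library calls).
def broken_extractor(path: str) -> str:
    raise RuntimeError("cannot open file")

def empty_extractor(path: str) -> str:
    return "   "  # e.g. a scanned page with no text layer

def working_extractor(path: str) -> str:
    return "sample text"
```

In a real pipeline, an OCR step would be a natural last entry in the chain for scanned documents.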
Tech Stack
Backend:
- FastAPI (Python) - RESTful API
- ChromaDB - Vector database for embeddings
- Mistral AI - LLM for generation
- Sentence Transformers - Embeddings
- PyMuPDF, PDFPlumber - PDF processing
Frontend:
- Next.js 16 - React framework
- TypeScript - Type safety
- Tailwind CSS - Styling
- Radix UI - Component library
- shadcn/ui - UI components
ML/AI:
- Hugging Face Transformers
- Sentence Transformers
- Custom fine-tuned models (Module B)
Prerequisites
- Python: 3.9+ (recommended: 3.13)
- Node.js: 18+ with pnpm
- API Keys: Mistral AI API key
- System: Linux/macOS/Windows
Installation
1. Clone the Repository
git clone https://github.com/KhagendraN/Setu.git
cd Setu
2. Backend Setup
Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install dependencies:
pip install -r requirements.txt
Create .env file in the project root:
MISTRAL_API_KEY=your_mistral_api_key_here
3. Build Vector Databases
Module A (Law Explanation):
# Place your legal PDFs in data/module-A/law/
python -m module_a.process_documents
python -m module_a.build_vector_db
Module C (Letter Generation):
# Templates are already in data/module-C/
python -m module_c.indexer
4. Frontend Setup
cd Frontend
pnpm install
cd ..
Running the Application
You need TWO terminals to run the full application:
Terminal 1: Backend API
# Activate virtual environment
source venv/bin/activate
# Start the API server
uvicorn api.main:app --reload --port 8000
Backend will run at: http://localhost:8000
API docs available at: http://localhost:8000/docs
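To confirm the backend is reachable before starting the frontend, a small standard-library check against the GET /health endpoint is enough. This is a minimal sketch: it assumes the default address from the uvicorn command and that a healthy server answers with HTTP 200; the helper names are hypothetical.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the uvicorn command

def health_url(base_url: str = BASE_URL) -> str:
    """Build the health-check URL, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/health"

def is_backend_up(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("backend up" if is_backend_up() else "backend not reachable")
```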
Terminal 2: Frontend
cd Frontend
pnpm dev
Frontend will run at: http://localhost:3000
Docker Usage (Recommended)
The easiest way to run the entire platform is using Docker Compose.
1. Prerequisites
- Docker and Docker Compose installed
- .env file with MISTRAL_API_KEY in the root directory
2. Run with Docker Compose
docker-compose up --build
This will:
- Build and start the Backend API (port 8000)
- Build and start the Frontend (port 3000)
- Automatically run the vector database build scripts
The application will be available at http://localhost:3000.
Project Structure
Setu/
├── api/                         # Main API endpoints
│   ├── main.py                  # FastAPI application
│   ├── routes/
│   │   ├── law_explanation.py   # Module A endpoints
│   │   ├── letter_generation.py # Module C endpoints
│   │   ├── bias_detection.py    # Module B endpoints
│   │   └── pdf_processing.py    # PDF utility endpoints
│   └── schemas.py               # Pydantic models
│
├── module_a/                    # Law Explanation (RAG)
│   ├── rag_chain.py             # RAG pipeline
│   ├── vector_db.py             # ChromaDB interface
│   ├── process_documents.py     # Document processing
│   └── README.md
│
├── module_b/                    # Bias Detection
│   ├── inference.py             # Model inference
│   ├── fine_tuning/             # Training scripts
│   └── dataset/                 # Training data
│
├── module_c/                    # Letter Generation
│   ├── interface.py             # Main API
│   ├── retriever.py             # Template retrieval
│   ├── generator.py             # Letter generation
│   ├── indexer.py               # Vector DB indexing
│   └── README.md
│
├── utility/                     # PDF Processing
│   ├── pdf_processor.py         # PDF extraction
│   └── README.md
│
├── Frontend/                    # Next.js application
│   ├── app/
│   │   ├── chatbot/             # Module A UI
│   │   ├── letter-generator/    # Module C UI
│   │   ├── bias-checker/        # Module B UI
│   │   ├── dashboard/           # Main dashboard
│   │   └── login/               # Authentication pages
│   └── components/              # Reusable components
│
└── data/                        # Data storage
    ├── module-A/                # Law documents & vector DB
    ├── module-C/                # Letter templates & vector DB
    └── module-B/                # Bias detection datasets
API Endpoints
Authentication
- POST /api/v1/signup - Register a new user
- POST /api/v1/login - User login
- GET /api/v1/me - Get current user profile
- POST /api/v1/refresh - Refresh access token
Law Explanation (Module A)
- POST /api/v1/law-explanation/explain - Ask legal questions (basic)
- POST /api/v1/law-explanation/chat - Context-aware chat with conversation history
- GET /api/v1/law-explanation/sources - Get source documents only
Chat History
- POST /api/v1/chat-history/conversations - Create a new conversation
- GET /api/v1/chat-history/conversations - List all user conversations
- GET /api/v1/chat-history/conversations/{id} - Get specific conversation with messages
- DELETE /api/v1/chat-history/conversations/{id} - Delete a conversation
- POST /api/v1/chat-history/messages - Save a message to conversation
Letter Generation (Module C)
- POST /api/v1/search-template - Search for letter templates
- POST /api/v1/get-template-details - Get template requirements
- POST /api/v1/fill-template - Fill template with user data
- POST /api/v1/generate-letter - Generate complete letter (smart generation)
- POST /api/v1/analyze-requirements - Analyze missing fields in template
Bias Detection (Module B)
- POST /api/v1/detect-bias - Detect bias in text
- POST /api/v1/detect-bias/batch - Batch bias detection
- POST /api/v1/debias-sentence - Get debiased alternatives
- POST /api/v1/debias-sentence/batch - Batch debiasing
- GET /api/v1/health - Health check
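As an illustration of calling one of these endpoints from Python, the sketch below builds a JSON POST request for /api/v1/detect-bias using only the standard library. The request body shape (a single "text" field) is an assumption; the actual schema is defined in api/schemas.py and shown at http://localhost:8000/docs. The helper name is hypothetical.

```python
import json
import urllib.request

def build_detect_bias_request(
    text: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the bias detection endpoint."""
    # Assumption: the endpoint accepts {"text": "..."}; check /docs for the real schema.
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/detect-bias",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, the request can be sent with urllib.request.urlopen(req) and the response body decoded with json.loads.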
Bias Detection HITL (Human-in-the-Loop)
- POST /api/v1/bias-detection-hitl/detect - Detect bias with HITL workflow
- POST /api/v1/bias-detection-hitl/approve - Approve bias detection results
- POST /api/v1/bias-detection-hitl/regenerate - Regenerate debiased suggestions
- POST /api/v1/bias-detection-hitl/generate-pdf - Generate PDF report
PDF Processing (Utility)
- POST /api/v1/process-pdf - Extract text from PDF
- POST /api/v1/process-pdf-to-bias - Extract PDF and detect bias
- GET /api/v1/pdf-health - Health check
System
- GET / - API welcome message
- GET /health - System health check
Full API documentation: http://localhost:8000/docs (when server is running)
Frontend Features
- Dashboard: Overview of all modules
- Chatbot: Interactive law explanation interface
- Letter Generator: Step-by-step letter creation wizard
- Bias Checker: Upload documents or paste text for analysis
- User Profile: User account management
- Responsive Design: Works on desktop and mobile
Testing
Test Module A (Law Explanation)
python -m module_a.test_rag
Test Module C (Letter Generation)
python -m module_c.test_generation
python -m module_c.test_interactive
Test PDF Processing
python -m utility.test_pdf_processor
Test API Endpoints
python -m api.test_api
Configuration
Environment Variables (.env)
# Required
MISTRAL_API_KEY=your_api_key_here
# Optional - MongoDB (if using Auth Backend)
# MONGODB_URL=mongodb://localhost:27017
# SECRET_KEY=your_secret_key
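A startup check that fails fast on a missing key can save time when debugging API key errors. The sketch below reads the variables listed above from the process environment; it assumes the .env file has already been loaded into the environment (for example by the shell, Docker Compose, or a dotenv loader), and the function name and returned dictionary keys are hypothetical.

```python
import os
from typing import Mapping, Optional

def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, Optional[str]]:
    """Read settings from the environment, raising if the required key is absent."""
    api_key = env.get("MISTRAL_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("MISTRAL_API_KEY is not set; add it to .env or export it")
    return {
        "mistral_api_key": api_key,
        # Optional values: None when the Auth Backend is not in use.
        "mongodb_url": env.get("MONGODB_URL"),
        "secret_key": env.get("SECRET_KEY"),
    }
```

Passing the environment as a parameter keeps the function easy to test without touching real credentials.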
Module Configurations
- Module A: module_a/config.py
- Module C: module_c/config.py
Troubleshooting
Backend Issues
- Import errors: Make sure virtual environment is activated
- Vector DB empty: Run the build scripts for modules A & C
- API key errors: Check that the .env file has a valid MISTRAL_API_KEY
Frontend Issues
- Port 3000 in use: Change port with pnpm dev -- -p 3001
- Module not found: Run pnpm install in the Frontend directory
- API connection failed: Ensure the backend is running on port 8000
Common Errors
# Reinstall dependencies
pip install --upgrade -r requirements.txt
# Rebuild vector databases
python -m module_a.build_vector_db
python -m module_c.indexer
# Clear pnpm cache
cd Frontend
pnpm store prune
pnpm install
Documentation
- Module A Documentation - Law Explanation RAG Pipeline
- Module C Documentation - Letter Generation
- PDF Processing Guide - PDF text extraction
- Implementation Guides - Detailed implementation workflows
This project is under development as part of a hackathon.