---
title: Legal RAG Backend
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---

# Legal RAG Backend ⚖️

AI-powered legal verdict prediction and judgment generation system.

## Features

- **LegalBERT Classification**: Fine-tuned model for guilty/not guilty predictions
- **RAG Retrieval**: Semantic search across 6 legal databases (Constitution, IPC, IPC case law, general case law, statutes, Q&A)
- **Gemini 2.5 Pro**: Generates concise 400-600 word judicial judgments
- **Confidence-based Override**: The LLM can override predictions below 80% confidence
- **Similar Case References**: Mandatory precedent case citations
- **Google Gemini** (optional): Generates detailed natural-language explanations
- **HuggingFace Hub**: Model and dataset management
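The confidence-based override can be sketched as a small decision rule. This is an illustrative sketch only; `final_verdict` and its arguments are hypothetical names, not the project's actual API:

```python
def final_verdict(bert_verdict: str, confidence: float,
                  llm_verdict: str, threshold: float = 0.80) -> str:
    # Below the 80% confidence threshold, defer to the LLM's judgment;
    # otherwise keep the LegalBERT prediction as-is.
    return bert_verdict if confidence >= threshold else llm_verdict
```

The threshold value comes from the feature description above; the real service may apply additional checks before overriding.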

## Project Structure

```
legal-rag-backend/
│
├── main.py            # FastAPI application with REST endpoints
├── model_loader.py    # LegalBERT model loading and inference
├── rag_loader.py      # FAISS indices and chunk loading from HuggingFace
├── rag_service.py     # Core service orchestrating prediction and RAG
├── prompt_builder.py  # Constructs prompts for LLM with legal context
├── utils.py           # Helper utilities for chunk processing
├── requirements.txt   # Python dependencies
├── Dockerfile         # Container configuration
├── .gitignore         # Git ignore patterns
├── README.md          # This file
└── start.sh           # Launch script
```


## API Endpoints

### `GET /health`

Health check endpoint.

**Response:**

```json
{
  "status": "ok"
}
```

### `POST /predict`

Quick verdict prediction with confidence score.

**Request:**

```json
{
  "text": "Your legal case description..."
}
```

**Response:**

```json
{
  "verdict": "guilty",
  "confidence": 0.8734
}
```

### `POST /explain` ⭐ (Recommended)

Complete legal analysis: verdict, full judgment, and retrieved supporting documents.

**Request:**

```json
{
  "text": "Your legal case description..."
}
```

**Response:**

```json
{
  "verdict": "guilty",
  "legalBertVerdict": "guilty",
  "confidence": 0.8734,
  "explanation": "Full 400-600 word judgment...",
  "retrievedChunks": {
    "constitution": [...],
    "ipc": [...],
    "ipcCase": [...],
    "statute": [...],
    "qa": [...],
    "case": [...]
  },
  "prompt": "Full prompt..."
}
```
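The `prompt` field above is produced by `prompt_builder.py`, which weaves the case facts, the LegalBERT verdict, and the retrieved chunks into one instruction for the LLM. A minimal sketch of how such a prompt might be assembled (the function name, wording, and structure here are illustrative assumptions, not the actual implementation):

```python
def build_prompt(case_text: str, verdict: str, confidence: float,
                 chunks: dict[str, list[str]]) -> str:
    # Group retrieved chunks under a header per legal category,
    # skipping categories with no hits.
    context = "\n\n".join(
        f"[{category.upper()}]\n" + "\n".join(items)
        for category, items in chunks.items() if items
    )
    return (
        "You are a judge. Write a concise 400-600 word judgment.\n\n"
        f"Case facts:\n{case_text}\n\n"
        f"LegalBERT verdict: {verdict} (confidence {confidence:.2%})\n\n"
        f"Supporting legal context:\n{context}"
    )
```

Usage: `build_prompt("The accused...", "guilty", 0.8734, retrieved_chunks)` returns the single string sent to Gemini (or returned as-is when no API key is configured).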

## Installation & Setup

### Local Development

1. **Navigate to the project directory:**
   ```bash
   cd legal-rag-backend
   ```

2. **Create a virtual environment:**
   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

4. **Configure environment (optional):**
   Create a `.env` file for Gemini API integration:
   ```bash
   GEMINI_API_KEY=your_api_key_here
   ```

5. **Run the server:**
   ```bash
   chmod +x start.sh
   ./start.sh
   ```
   
   Or directly:
   ```bash
   uvicorn main:app --host 0.0.0.0 --port 7860
   ```

6. **Access the API:**
   - API Documentation: http://localhost:7860/docs
   - Health Check: http://localhost:7860/health

### Docker Deployment

1. **Build the image:**
   ```bash
   docker build -t legal-rag-backend .
   ```

2. **Run the container:**
   ```bash
   docker run -p 7860:7860 -e GEMINI_API_KEY=your_key legal-rag-backend
   ```

3. **Access at:**
   http://localhost:7860

## How It Works

1. **Model Loading**: On startup, the system loads:
   - LegalBERT model (`negi2725/LegalBertNew`)
   - 6 FAISS indices (Constitution, IPC, IPC cases, general cases, statutes, Q&A)
   - Corresponding text chunks
   - BGE-Large sentence transformer for embeddings

2. **Prediction Flow**:
   - Input text is tokenized and passed through LegalBERT
   - Softmax applied to get "guilty" or "not guilty" verdict
   - Confidence score extracted from probabilities

3. **RAG Retrieval**:
   - Query text embedded using BGE-Large
   - Top-K similar chunks retrieved from each FAISS index
   - Results organized by legal category

4. **Explanation Generation**:
   - Structured prompt built with case facts, verdict, and retrieved context
   - Optional Gemini API call for natural language explanation
   - Fallback to prompt template if API not configured
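Step 2 of the flow above (softmax over LegalBERT's logits) can be illustrated with a small, dependency-free sketch. The label order is an assumption; the real model's label mapping lives in its config:

```python
import math

def verdict_from_logits(logits, labels=("not guilty", "guilty")):
    # Numerically stable softmax over the classifier's two output logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The predicted class is the argmax; its probability is the confidence.
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

For example, `verdict_from_logits([-0.4, 1.1])` yields `"guilty"` with a confidence around 0.82.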

## Models & Datasets

- **LegalBERT Model**: `negi2725/LegalBertNew` (HuggingFace)
- **RAG Dataset**: `negi2725/dataRag` (HuggingFace; FAISS indices + legal documents)
- **Embedding Model**: `BAAI/bge-large-en-v1.5` (Sentence Transformers)
- **LLM**: `gemini-2.5-pro` (Google Gemini API)

## Performance Notes

- All models and indices are preloaded at import time for fast inference
- Async endpoints ensure non-blocking I/O operations
- FAISS uses normalized L2 search for efficient similarity matching
- Typical response time: 1-3 seconds for `/explain` endpoint
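The normalized-L2 search noted above can be emulated without FAISS. This NumPy sketch shows why unit-normalizing vectors makes inner product equal cosine similarity, which is the property the index relies on (function and variable names here are illustrative):

```python
import numpy as np

def top_k(query_vec, index_vecs, chunks, k=2):
    # Unit-normalize query and index rows so the dot product
    # equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    m = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    scores = m @ q
    # Highest-scoring k chunks, best first.
    order = np.argsort(-scores)[:k]
    return [(chunks[i], float(scores[i])) for i in order]
```

With FAISS itself, the same effect comes from normalizing embeddings before adding them to an inner-product index.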

## Requirements

- Python 3.10+
- 4GB+ RAM (8GB+ recommended for smooth operation)
- Internet connection for first-time model/dataset downloads

## Troubleshooting

**Models not downloading:**
- Ensure internet connectivity
- Check HuggingFace Hub access
- Models cache in `~/.cache/huggingface/`

**Out of memory:**
- Reduce batch size or top-K retrieval count
- Use CPU-only torch installation
- Consider using smaller embedding models

**Gemini API errors:**
- Verify API key in `.env` file
- System works without Gemini (returns structured prompt)
- Check API quota and rate limits
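The "works without Gemini" fallback mentioned above can be sketched as follows (hypothetical names; the real service's return shape may differ):

```python
def generate_explanation(prompt, gemini_client=None):
    # Without a configured Gemini client, fall back to returning
    # the structured prompt itself as the explanation.
    if gemini_client is None:
        return {"explanation": prompt, "source": "prompt-only"}
    return {"explanation": gemini_client.generate(prompt), "source": "gemini"}
```

This keeps `/explain` functional in deployments with no `GEMINI_API_KEY` set.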

## Development

The codebase follows these conventions:
- CamelCase for variable names
- Minimal inline comments (self-documenting code)
- Async/await for all FastAPI endpoints
- Type hints for function signatures

## License

This project is for educational and research purposes.

## Support

For issues or questions, please refer to the HuggingFace model and dataset pages:
- https://huggingface.co/negi2725/LegalBertNew
- https://huggingface.co/datasets/negi2725/dataRag