Context-Aware Chat Implementation Guide

Overview

The context-aware chat system intelligently handles conversation continuity by:

  1. Detecting non-legal queries (greetings, thanks, etc.) and responding appropriately without RAG
  2. Analyzing message independence to determine if a message requires previous context
  3. Summarizing conversations when messages are dependent on previous context
  4. Optimizing RAG calls by only sending relevant context to the retrieval pipeline

Architecture Flow

User Message
    ↓
Fetch Last 5 Messages (if conversation_id provided)
    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Context Analysis        β”‚
β”‚  (Mistral LLM)           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              β”‚                  β”‚                  β”‚
Non-Legal?     Independent?       Dependent?
β”‚              β”‚                  β”‚
↓              ↓                  ↓
Simple         Send current       Summarize +
Response       message to RAG     Send to RAG
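The routing above can be sketched as a single decision function. This is a minimal, self-contained illustration: the stub classifier uses keyword heuristics purely so the sketch runs standalone, whereas the real system delegates every decision to a Mistral LLM call (see module_a/context_analyzer.py).

```python
# Illustrative sketch of the flow diagram above. Keyword sets stand in
# for the LLM classifier; they are NOT the production logic.
NON_LEGAL_MARKERS = {"hi", "hello", "hey", "thanks", "thank", "bye", "goodbye"}
PRONOUNS = {"he", "she", "it", "they", "this", "that", "him", "her"}


def classify_query(query: str, history: list[str]) -> str:
    """Return 'NON_LEGAL', 'DEPENDENT', or 'INDEPENDENT' for a message."""
    words = {w.strip("!?.,").lower() for w in query.split()}
    if words & NON_LEGAL_MARKERS:
        return "NON_LEGAL"
    if history and words & PRONOUNS:
        return "DEPENDENT"
    return "INDEPENDENT"


def route_message(query: str, history: list[str]) -> dict:
    """Route a message per the architecture flow: simple response,
    RAG as-is, or summarize-then-RAG."""
    label = classify_query(query, history)
    if label == "NON_LEGAL":
        # Casual message: answer directly, skip retrieval entirely
        return {"is_non_legal": True, "context_used": False, "rag_query": None}
    if label == "DEPENDENT":
        # Real system: an LLM rewrites history + query into one standalone query
        merged = " ".join(history[-5:] + [query])
        return {"is_non_legal": False, "context_used": True, "rag_query": merged}
    return {"is_non_legal": False, "context_used": False, "rag_query": query}
```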

API Endpoints

1. Context-Aware Chat Endpoint

Endpoint: POST /law-explanation/chat

Request:

{
  "query": "He is making fake allegations",
  "conversation_id": "uuid-of-conversation"  // Optional
}

Response:

{
  "summary": "Brief answer",
  "key_point": "Key legal point",
  "explanation": "Detailed explanation",
  "next_steps": "Actionable advice",
  "sources": [...],
  "query": "Original or processed query",
  "context_used": true,              // New field
  "is_non_legal": false,             // New field
  "original_query": "He is making...", // Present if context used
  "summarized_query": "My brother is making fake allegations..." // Present if context used
}

2. Traditional Explain Endpoint (Unchanged)

Endpoint: POST /law-explanation/explain

This endpoint remains unchanged and does not use conversation context.

Usage Examples

Example 1: Dependent Conversation

Message 1:

{
  "query": "I had a fight with my brother over property",
  "conversation_id": "conv-123"
}

Response 1:

  • context_used: false (no previous messages)
  • Returns explanation about property disputes

Message 2:

{
  "query": "He is making fake allegations",
  "conversation_id": "conv-123"
}

Response 2:

  • context_used: true (dependent on previous message)
  • summarized_query: "My brother is making fake allegations against me in a property dispute"
  • RAG receives the summarized context instead of just "He is making..."

Example 2: Independent New Topic

Message 3:

{
  "query": "How do I apply for citizenship?",
  "conversation_id": "conv-123"
}

Response 3:

  • context_used: false (independent new topic)
  • Query sent to RAG as-is without previous context

Example 3: Non-Legal Query

Message 4:

{
  "query": "Thank you so much!",
  "conversation_id": "conv-123"
}

Response 4:

  • is_non_legal: true
  • explanation: "You're welcome! I'm glad I could help..."
  • No RAG call made (saves cost and time)

Configuration

Adjustable Parameters

In api/routes/chat_history.py:

# Change number of messages to fetch for context
context = await get_recent_context(
    conversation_id=conversation_id,
    user_id=user["id"],
    limit=5  # Adjust this (default: 5)
)

In module_a/context_analyzer.py:

# Adjust LLM model for context analysis
class ConversationContextAnalyzer:
    def __init__(self, model: str = "mistral-small-latest"):
        # Options: mistral-tiny, mistral-small-latest, mistral-medium
        self.model = model

Testing

Manual Testing via cURL

# 1. Login
curl -X POST http://localhost:8000/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "test@example.com", "password": "password"}'

# 2. Create conversation
curl -X POST http://localhost:8000/chat-history/conversations \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"title": "Test Chat"}'

# 3. Send messages
curl -X POST http://localhost:8000/law-explanation/chat \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "I had a fight with my brother", "conversation_id": "CONV_ID"}'

curl -X POST http://localhost:8000/law-explanation/chat \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "He is making allegations", "conversation_id": "CONV_ID"}'

Automated Testing

Run the provided test script:

cd api
python test_context_chat.py

How It Works Internally

1. Non-Legal Query Detection

The system uses Mistral LLM to classify messages:

System Prompt: "Determine if this is legal-related or casual conversation"
Input: "Thank you!"
Output: "NON_LEGAL"

Casual categories:

  • Greetings (hi, hello, hey)
  • Thanks/gratitude
  • Goodbyes
  • Small talk
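The classification step can be sketched as prompt construction plus a strict label parse. The system-prompt wording below is an assumption (the real prompt lives in module_a/context_analyzer.py); only the LEGAL/NON_LEGAL contract follows the document.

```python
def build_non_legal_prompt(query: str) -> list[dict]:
    """Chat messages for the casual-vs-legal classifier.
    Prompt wording is illustrative, not the production prompt."""
    return [
        {"role": "system",
         "content": ("Determine if the user's message is a legal question "
                     "or casual conversation (greeting, thanks, goodbye, "
                     "small talk). Reply with exactly LEGAL or NON_LEGAL.")},
        {"role": "user", "content": query},
    ]


def parse_label(raw: str) -> bool:
    """True if the model flagged the message as non-legal.
    Tolerates whitespace and case noise in the completion."""
    return raw.strip().upper().startswith("NON_LEGAL")
```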

2. Independence Analysis

System Prompt: "Is the current message independent or dependent?"
Input:
  Previous: "I had a fight with my brother"
  Current: "He is making allegations"
Output: "DEPENDENT"

Independent criteria:

  • New topic
  • Self-contained
  • No pronouns referencing previous context

Dependent criteria:

  • Uses pronouns (he, she, it, this, that)
  • Continues previous topic
  • Follow-up questions
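Packing the recent history into the independence check can be sketched as follows; the exact phrasing and message layout are assumptions, but the INDEPENDENT/DEPENDENT output contract matches the section above.

```python
def build_independence_prompt(history: list[str], current: str) -> list[dict]:
    """Chat messages for the dependent-vs-independent classifier.
    Uses only the last 5 messages, matching the default context window."""
    context = "\n".join(f"- {m}" for m in history[-5:])
    return [
        {"role": "system",
         "content": ("Given the previous messages, decide whether the current "
                     "message can be understood on its own. Reply with exactly "
                     "INDEPENDENT or DEPENDENT.")},
        {"role": "user",
         "content": f"Previous messages:\n{context}\n\nCurrent message: {current}"},
    ]
```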

3. Conversation Summarization

System Prompt: "Combine conversation into one clear legal query"
Input:
  History: "I had a fight with my brother over property"
  Current: "He is making fake allegations"
Output: "My brother is making fake allegations against me in a property dispute. What are my rights?"
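The summarization prompt can be sketched the same way. The instruction text is an assumption; what matters is that the LLM receives the recent history plus the current message and returns a single standalone query for RAG.

```python
def build_summary_prompt(history: list[str], current: str) -> list[dict]:
    """Chat messages asking the LLM to merge history + current message
    into one clear, standalone legal query (wording is illustrative)."""
    transcript = "\n".join(history[-5:])
    return [
        {"role": "system",
         "content": ("Combine the conversation below into one clear, "
                     "standalone legal query. Output only the rewritten "
                     "query, nothing else.")},
        {"role": "user",
         "content": f"Conversation so far:\n{transcript}\n\nLatest message: {current}"},
    ]
```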

Benefits

  1. Better Context Understanding: The chatbot resolves references like "he", "she", and "it" against earlier messages
  2. Efficient: Fetches only the 5 most recent messages (configurable)
  3. Cost-Effective: Skips RAG entirely for non-legal queries
  4. Accurate: Uses a lightweight LLM for classification before invoking the heavier RAG pipeline
  5. Flexible: Works with or without a conversation_id

Troubleshooting

Issue: Context not being recognized

Solution: Check if conversation_id is being passed correctly. Without it, no context is fetched.

Issue: Non-legal queries being sent to RAG

Solution: The LLM classifier might need adjustment. Check module_a/context_analyzer.py system prompts.

Issue: Independent queries marked as dependent

Solution: Lower the temperature in is_independent_query() to make classification more deterministic, or refine the system prompt.

Issue: Slow response times

Solution:

  • Reduce the context window size (default: 5 messages)
  • Use a smaller Mistral model (e.g. mistral-tiny instead of mistral-small-latest)

Future Enhancements

Potential improvements:

  1. Caching: Cache LLM classification results for similar queries
  2. Adaptive Context: Dynamically adjust context window based on conversation complexity
  3. Multi-turn Summarization: Better handling of very long conversations
  4. Language Detection: Handle queries in multiple languages
  5. Intent Recognition: Detect user intent (question, clarification, new topic, etc.)

API Response Fields Reference

Field             Type     Description
summary           string   Brief answer to the query
key_point         string   Key legal point from sources
explanation       string   Detailed explanation
next_steps        string   Actionable advice
sources           array    Source documents used
query             string   The processed query
context_used      boolean  Whether conversation context was used
is_non_legal      boolean  Whether this is a casual/non-legal query
original_query    string   Original user query (present only if context used)
summarized_query  string   Summarized query sent to RAG (present only if context used)
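For client code, the response fields above can be mirrored as a typed model. This is a sketch using a plain dataclass; the class name ChatResponse is hypothetical, but the field names and optionality follow the reference table.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ChatResponse:
    """Client-side mirror of the /law-explanation/chat response body."""
    summary: str
    key_point: str
    explanation: str
    next_steps: str
    sources: list[Any]
    query: str
    context_used: bool
    is_non_legal: bool
    original_query: Optional[str] = None      # only present if context used
    summarized_query: Optional[str] = None    # only present if context used

    @classmethod
    def from_json(cls, data: dict) -> "ChatResponse":
        # Assumes the server sends exactly the documented fields
        return cls(**data)
```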