---
title: LangGraph RAG Q&A Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.35.0
app_file: app.py
pinned: false
license: mit
---
# 🤖 LangGraph RAG Q&A Agent

**Next-Generation AI Assistant with Real-Time Analytics & Dynamic Dashboards**

A production-ready Retrieval-Augmented Generation (RAG) system built with LangGraph, featuring a 4-node workflow (Plan → Retrieve → Answer → Reflect) with comprehensive evaluation metrics and a premium Streamlit UI.
## Table of Contents
- Overview
- Features
- Architecture
- Installation
- Usage
- Project Structure
- Configuration
- Evaluation Metrics
- Bonus Features
- Challenges & Solutions
- Requirements
- Contributing
- License
## Overview

This project implements an AI agent using LangGraph that answers questions from a knowledge base via RAG (Retrieval-Augmented Generation). The system demonstrates AI agent workflows, RAG pipeline design, and LangGraph fundamentals through a multi-node architecture with self-reflection capabilities.
### Key Objectives

- ✅ Test understanding of AI agent workflows
- ✅ Demonstrate RAG pipeline design
- ✅ Implement LangGraph basics with 4+ nodes
- ✅ Build reflection/validation mechanisms
- ✅ Create production-ready code with comprehensive documentation
## Features

### Core Functionality

- **LangGraph Workflow** - 4-node agent architecture (Plan → Retrieve → Answer → Reflect)
- **RAG Pipeline** - Retrieval-Augmented Generation with ChromaDB vector store
- **Self-Reflection** - Automatic answer quality evaluation and regeneration
- **Multi-LLM Support** - OpenAI (GPT-3.5/4) and Hugging Face (Flan-T5, Mistral)
- **Vector Database** - ChromaDB for efficient semantic search
- **Premium UI** - Blue & Black themed Streamlit interface

### Advanced Features (Bonus Points)

- **Dynamic Dashboards** - Real-time analytics with Plotly visualizations
- **Evaluation Metrics** - ROUGE, BERTScore, Context Relevance
- **Interactive UI** - Streamlit-based question answering interface
- **Comprehensive Logging** - Step-by-step workflow visibility
- **Context Tracing** - Full transparency of retrieved documents
- **Export Reports** - Download evaluation results as JSON
## Architecture

### LangGraph Workflow

The agent implements a 4-node workflow:

```mermaid
graph LR
    A[User Query] --> B[Plan Node]
    B --> C{Needs Retrieval?}
    C -->|Yes| D[Retrieve Node]
    C -->|No| E[Answer Node]
    D --> E
    E --> F[Reflect Node]
    F --> G{Quality OK?}
    G -->|Accept| H[Final Answer]
    G -->|Reject| E
    G -->|Max Iterations| H
```
### Node Descriptions

**Plan Node**
- Analyzes the user query
- Determines if retrieval is needed
- Creates an execution strategy

**Retrieve Node**
- Performs RAG using ChromaDB
- Retrieves top-k relevant documents
- Uses semantic search with embeddings

**Answer Node**
- Generates a response using the LLM
- Incorporates retrieved context
- Handles regeneration with feedback

**Reflect Node**
- Evaluates answer quality
- Checks relevance and completeness
- Triggers regeneration if needed
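The control flow across these four nodes can be sketched as a plain Python state loop (an illustrative sketch only: the real project wires these as LangGraph nodes, and the `plan`, `retrieve`, `answer`, and `reflect` bodies here are stand-in heuristics, not the project's actual logic):

```python
# Sketch of the Plan -> Retrieve -> Answer -> Reflect loop.
# Node bodies are placeholders; only the control flow mirrors the diagram.

MAX_REFLECTION_ITERATIONS = 2

def plan(query):
    # Stand-in heuristic: treat questions as needing retrieval.
    return {"query": query, "needs_retrieval": "?" in query}

def retrieve(state):
    # In the real system this queries ChromaDB for top-k chunks.
    state["context"] = ["<top-k chunks from the vector store>"]
    return state

def answer(state, feedback=None):
    # In the real system this calls the LLM with the retrieved context.
    prefix = "Revised answer" if feedback else "Answer"
    state["answer"] = f"{prefix} to: {state['query']}"
    return state

def reflect(state):
    # Stand-in quality gate: accept any non-empty answer.
    return bool(state.get("answer"))

def run_agent(query):
    state = plan(query)
    if state["needs_retrieval"]:
        state = retrieve(state)
    state = answer(state)
    for _ in range(MAX_REFLECTION_ITERATIONS):
        if reflect(state):
            break  # quality OK -> final answer
        state = answer(state, feedback="regenerate with more context")
    return state

result = run_agent("What is machine learning?")
print(result["answer"])
```

The iteration cap mirrors `MAX_REFLECTION_ITERATIONS` from the configuration: rejected answers are regenerated at most that many times before the last attempt is accepted.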
### Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Agent Framework | LangGraph | Workflow orchestration |
| RAG Framework | LangChain | Retrieval + Generation |
| Vector Database | ChromaDB | Semantic search |
| Embeddings | Sentence Transformers | Vector creation |
| LLM | OpenAI / Hugging Face | Answer generation |
| UI | Streamlit | Interactive interface |
| Visualization | Plotly | Dynamic charts |
| Evaluation | ROUGE, BERTScore | Quality metrics |
## Installation

### Prerequisites
- Python 3.9 or higher
- pip package manager
- 4GB+ RAM recommended
- (Optional) NVIDIA GPU for faster inference
### Step 1: Clone Repository

```bash
git clone https://github.com/yourusername/langgraph-rag-agent.git
cd langgraph-rag-agent
```
### Step 2: Create Virtual Environment

```bash
# Windows
python -m venv venv
venv\Scripts\activate

# macOS/Linux
python3 -m venv venv
source venv/bin/activate
```
### Step 3: Install Dependencies

```bash
pip install -r requirements.txt
```
### Step 4: Configure Environment

Create a `.env` file in the project root:

```bash
# Copy template
cp .env.example .env

# Edit with your credentials
notepad .env   # Windows
nano .env      # macOS/Linux
```
Required environment variables:

```bash
# LLM Provider (choose one)
LLM_PROVIDER=huggingface
# or
LLM_PROVIDER=openai

# OpenAI Configuration (if using OpenAI)
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-3.5-turbo

# Hugging Face Configuration (if using Hugging Face)
HUGGINGFACE_API_TOKEN=your_hf_token_here
HUGGINGFACE_MODEL=google/flan-t5-large

# Embedding Model
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# Vector Database
CHROMA_PERSIST_DIR=./chroma_db
CHROMA_COLLECTION_NAME=rag_knowledge_base

# RAG Configuration
CHUNK_SIZE=500
CHUNK_OVERLAP=50
TOP_K_RETRIEVAL=3

# Reflection Settings
USE_LLM_REFLECTION=false
MAX_REFLECTION_ITERATIONS=2
```
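These variables are read at startup; a minimal loading sketch with `python-dotenv` (which is in `requirements.txt`) looks like the following. The `ImportError` fallback is an addition here so the snippet also runs without the package installed:

```python
import os

try:
    from dotenv import load_dotenv  # provided by python-dotenv
    load_dotenv()                   # reads .env from the working directory
except ImportError:
    pass                            # fall back to the process environment only

# Defaults mirror the values documented above.
provider = os.getenv("LLM_PROVIDER", "huggingface")
top_k = int(os.getenv("TOP_K_RETRIEVAL", "3"))
max_iters = int(os.getenv("MAX_REFLECTION_ITERATIONS", "2"))
print(provider, top_k, max_iters)
```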
### Step 5: Prepare Knowledge Base

Place your text files in the `data/` directory:

```
data/
├── artificial_intelligence.txt
├── machine_learning.txt
├── python_programming.txt
├── cloud_computing.txt
└── databases.txt
```
## Usage

### Option 1: Streamlit UI (Recommended)

```bash
cd src
streamlit run ui_app.py
```

Then open your browser to: http://localhost:8501

### Option 2: Command Line

```bash
cd src
python main.py
```

Interactive mode:

```bash
python main.py --mode interactive
```

Sample queries:

```bash
python main.py --mode sample
```

### Option 3: Jupyter Notebook

```bash
jupyter notebook notebooks/rag_demo.ipynb
```
### Example Usage

```python
from agent_workflow import create_rag_agent
from rag_pipeline import RAGPipeline
from llm_utils import create_llm_handler
from reflection import create_reflection_evaluator

# Initialize components
rag_pipeline = RAGPipeline(
    data_directory="./data",
    collection_name="rag_knowledge_base",
    persist_directory="./chroma_db"
)
rag_pipeline.build_index()

llm_handler = create_llm_handler(
    provider="huggingface",
    model_name="google/flan-t5-large"
)

reflection_evaluator = create_reflection_evaluator(
    llm_handler=llm_handler,
    use_llm_reflection=False
)

agent = create_rag_agent(
    rag_pipeline=rag_pipeline,
    llm_handler=llm_handler,
    reflection_evaluator=reflection_evaluator
)

# Ask a question
result = agent.query("What is machine learning?")
print(result['final_response'])
```
## Project Structure

```
langgraph-rag-agent/
│
├── data/                      # Knowledge base (text files)
│   ├── artificial_intelligence.txt
│   ├── machine_learning.txt
│   ├── python_programming.txt
│   ├── cloud_computing.txt
│   └── databases.txt
│
├── src/                       # Source code
│   ├── agent_workflow.py      # LangGraph agent implementation
│   ├── rag_pipeline.py        # RAG pipeline with ChromaDB
│   ├── llm_utils.py           # LLM handlers (OpenAI/HF)
│   ├── reflection.py          # Reflection evaluator
│   ├── evaluation.py          # Metrics (ROUGE, BERTScore)
│   ├── ui_app.py              # Streamlit UI (Premium)
│   └── main.py                # CLI interface
│
├── notebooks/                 # Jupyter notebooks
│   └── rag_demo.ipynb         # Interactive demo
│
├── chroma_db/                 # ChromaDB vector store (auto-created)
├── models_cache/              # Hugging Face model cache
│
├── .env                       # Environment variables
├── .env.example               # Environment template
├── requirements.txt           # Python dependencies
├── README.md                  # This file
└── LICENSE                    # MIT License
```
## Configuration

### LLM Provider Selection

**Option 1: Hugging Face (Free, Local)**

```bash
LLM_PROVIDER=huggingface
HUGGINGFACE_MODEL=google/flan-t5-large
```

Supported models:

- `google/flan-t5-small` (300MB, fast)
- `google/flan-t5-base` (850MB, balanced)
- `google/flan-t5-large` (3GB, best quality)
- `mistralai/Mistral-7B-Instruct-v0.2` (14GB, advanced)

**Option 2: OpenAI (Paid, Cloud)**

```bash
LLM_PROVIDER=openai
OPENAI_API_KEY=your_api_key
OPENAI_MODEL=gpt-3.5-turbo
```

Supported models:

- `gpt-3.5-turbo` (fast, affordable)
- `gpt-4` (best quality, expensive)
- `gpt-4-turbo` (balanced)
### Embedding Models

The system uses Sentence Transformers for embeddings:

```bash
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
```

Alternatives:

- `all-mpnet-base-v2` (higher quality, slower)
- `all-MiniLM-L12-v2` (balanced)
### RAG Parameters

```bash
CHUNK_SIZE=500        # Characters per chunk
CHUNK_OVERLAP=50      # Overlap between chunks
TOP_K_RETRIEVAL=3     # Number of chunks to retrieve
```
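Under these settings, a character-based splitter with overlap behaves roughly as follows (a toy sketch: the project's actual splitter comes from LangChain and prefers breaking on separators rather than fixed offsets):

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character chunks, each overlapping
    the previous one by chunk_overlap characters."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # last window already reached the end of the text
    return chunks

# A 1200-character document yields chunks [0:500], [450:950], [900:1200].
chunks = chunk_text("x" * 1200, chunk_size=500, chunk_overlap=50)
print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 3 500 300
```

The 50-character overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.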
### Reflection Settings

```bash
USE_LLM_REFLECTION=false      # Use LLM for reflection (slower, better)
MAX_REFLECTION_ITERATIONS=2   # Max regeneration attempts
```
## Evaluation Metrics

The system includes comprehensive evaluation as a bonus feature:

### ROUGE Scores
Measures n-gram overlap between generated and reference answers:
- ROUGE-1: Unigram overlap
- ROUGE-2: Bigram overlap
- ROUGE-L: Longest common subsequence
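For intuition, ROUGE-1 precision/recall can be computed by hand from unigram counts (a toy sketch; the project uses the `rouge-score` package, which additionally applies stemming):

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Toy ROUGE-1: clipped unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # matching unigrams, count-clipped
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

p, r, f1 = rouge_1(
    "machine learning is a subset of ai",
    "machine learning is a subset of artificial intelligence",
)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.86 0.75 0.8
```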
### BERTScore
Measures semantic similarity using contextual embeddings:
- Precision: How much of the generated text is relevant
- Recall: How much of the reference is covered
- F1: Harmonic mean of precision and recall
### Context Relevance
Measures how well the answer uses retrieved context:
- Term frequency overlap
- Semantic alignment
- Coverage score
### Reflection Scores
Internal quality assessment:
- Relevance: Relevant / Partially Relevant / Irrelevant
- Completeness: Answer completeness check
- Confidence: Model confidence estimation
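The fast, non-LLM reflection path can be as simple as keyword overlap plus a length floor. The sketch below is illustrative only (the thresholds, regex tokenizer, and verdict labels are assumptions, not the project's exact scoring):

```python
import re

def reflect_heuristic(query, answer, min_words=10):
    """Toy heuristic reflection: query-keyword overlap plus a length check."""
    tokens = lambda s: set(re.findall(r"[a-z]+", s.lower()))
    query_terms = {w for w in tokens(query) if len(w) > 3}  # skip short words
    answer_terms = tokens(answer)
    overlap = len(query_terms & answer_terms) / max(len(query_terms), 1)
    complete = len(answer.split()) >= min_words
    if overlap >= 0.5 and complete:
        verdict = "Relevant"
    elif overlap > 0:
        verdict = "Partially Relevant"
    else:
        verdict = "Irrelevant"
    return verdict, overlap, complete

verdict, overlap, complete = reflect_heuristic(
    "What is machine learning?",
    "Machine learning is a subset of AI that lets systems learn from data.",
)
print(verdict)  # Relevant
```

An "Irrelevant" or incomplete verdict is what would trigger the regeneration loop, bounded by `MAX_REFLECTION_ITERATIONS`.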
## Bonus Features

### ✅ Streamlit UI

- Interactive question answering
- Real-time analytics dashboards
- Dynamic visualizations (gauges, bar charts, radar charts)
- Premium Blue & Black theme

### ✅ Evaluation Logging

- ROUGE metrics for quality assessment
- BERTScore for semantic similarity
- Context relevance scoring
- JSON export of results

### ✅ Project Report

See `REPORT.md` for:

- How the agent works
- Challenges faced during development
- Design decisions and trade-offs
- Performance analysis

### ✅ Code Quality

- Type hints throughout the codebase
- Comprehensive docstrings
- Error handling and logging
- Modular architecture
- Clean code principles
## Challenges & Solutions

### Challenge 1: Model Selection

**Problem:** Balancing answer quality with inference speed.

**Solution:**

- Implemented multi-LLM support
- Defaulted to `flan-t5-large` (good balance)
- Allow users to switch models via config

### Challenge 2: Reflection Loop

**Problem:** Preventing infinite regeneration loops.

**Solution:**

- Implemented `MAX_REFLECTION_ITERATIONS`
- Added heuristic-based reflection (fast)
- Optional LLM-based reflection (accurate)

### Challenge 3: Vector Store Persistence

**Problem:** Rebuilding the index on every restart.

**Solution:**

- ChromaDB persistent storage
- Check for existing collections
- Optional force-rebuild flag

### Challenge 4: UI Responsiveness

**Problem:** Long wait times during inference.

**Solution:**

- Added loading spinners
- Terminal output visibility
- Progress indicators
- Caching with `@st.cache_resource`

### Challenge 5: Evaluation Metrics

**Problem:** BERTScore requires reference answers.

**Solution:**

- Made the reference answer optional
- Added heuristic metrics (length, coverage)
- Comprehensive reflection analysis
## Requirements

### Core Dependencies

```
# LangGraph & LangChain
langgraph==0.2.28
langchain==0.2.16
langchain-community==0.2.16
langchain-core==0.2.38

# Vector Database
chromadb==0.5.0
sentence-transformers==2.7.0

# LLM Providers
openai==1.35.0
huggingface-hub==0.23.4
transformers==4.41.2
torch>=2.0.0

# Utilities
python-dotenv==1.0.1
pydantic==2.7.4

# Evaluation
rouge-score==0.1.2
bert-score==0.3.13

# Visualization
plotly==5.18.0
numpy>=1.24.0

# UI
streamlit==1.35.0

# Development
jupyter==1.0.0
ipykernel==6.29.4
```
### System Requirements
- OS: Windows 10+, macOS 10.14+, Linux
- Python: 3.9 or higher
- RAM: 4GB minimum, 8GB recommended
- Storage: 5GB for models cache
- GPU: Optional (NVIDIA CUDA for faster inference)
## How It Works

### Step 1: Query Planning

The agent analyzes the query to determine whether retrieval is needed:

```
Query: "What is machine learning?"
Plan: This is a factual question requiring knowledge base retrieval.
```

### Step 2: Context Retrieval

ChromaDB performs semantic search:

```
Top 3 relevant chunks:
1. "Machine learning is a subset of AI..." (similarity: 0.85)
2. "ML algorithms learn from data..." (similarity: 0.78)
3. "Types of ML: supervised, unsupervised..." (similarity: 0.72)
```
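Those similarity scores come from comparing embedding vectors. A minimal top-k search over toy vectors (cosine similarity in pure Python, standing in for ChromaDB's index; the document names and 3-dimensional vectors are invented for illustration) looks like:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, docs, k=3):
    """Rank document vectors by cosine similarity to the query vector."""
    scored = [(cosine(query_vec, vec), name) for name, vec in docs.items()]
    return sorted(scored, reverse=True)[:k]

# Toy 3-dimensional "embeddings" (real MiniLM vectors are 384-dimensional).
docs = {
    "ml_intro": [0.9, 0.1, 0.0],
    "python_basics": [0.1, 0.9, 0.2],
    "cloud_computing": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.2, 0.1], docs, k=2)
for score, name in results:
    print(f"{name}: {round(score, 2)}")
```

In the real pipeline the query is embedded with the same Sentence Transformers model as the chunks, so distances are comparable.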
### Step 3: Answer Generation

The LLM generates an answer using the retrieved context:

```
Answer: "Machine learning is a subset of artificial intelligence
that enables systems to learn from data without explicit programming..."
```

### Step 4: Reflection & Validation

The agent evaluates answer quality:

```
Relevance: Relevant
Quality Score: 0.90/1.0
Recommendation: ACCEPT
```
## Performance
| Metric | Value |
|---|---|
| Knowledge Base | 301 chunks from 5 documents |
| Embedding Dimension | 384 (MiniLM-L6-v2) |
| Average Query Time | 15-25 seconds |
| Retrieval Accuracy | ~85% relevance |
| Answer Quality (ROUGE-L) | 0.65-0.85 |
| Context Usage | 3 chunks per query |
## Testing

Run the agent with sample queries:

```bash
python main.py --mode sample
```
Example queries:
- "What is machine learning?"
- "Explain Python programming"
- "What are NoSQL databases?"
- "Tell me about cloud computing"
- "What is deep learning?"
## Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- LangChain for the amazing RAG framework
- LangGraph for workflow orchestration
- ChromaDB for vector storage
- Hugging Face for open-source models
- Streamlit for the beautiful UI framework
## Contact

- **Author:** Harsh Mishra
- **Date:** 2025-11-06
- **Email:** harshmishra1132@gmail.com
- **GitHub:** @HarshMishra-Git
## Task Completion Checklist

### Core Requirements ✅

- Accept user questions
- Retrieve relevant information from a text dataset
- Use an LLM (OpenAI/Gemini/Claude/Groq/Hugging Face) to generate answers
- Show a reflection/validation step
- Four LangGraph nodes: plan, retrieve, answer, reflect

### Framework Requirements ✅

- LangGraph for the agent workflow
- LangChain for RAG (retrieval + generation)
- ChromaDB for vector storage
- Hugging Face embeddings

### Code Requirements ✅

- Runs locally (Python script ✅ / Jupyter notebook ✅)
- Includes requirements.txt
- Logging/print statements for each step
- Well-documented code

### Bonus Points ✅

- Streamlit UI for interactive questions
- Evaluation code (ROUGE/BERTScore)
- Short report (1-2 paragraphs) describing the agent and challenges

### Submission ✅

- Python script (.py files)
- Jupyter notebook (optional, included)
- README.md with setup steps and approach
⭐ **Star this repository if you found it helpful!**

Made with ❤️ using LangGraph, LangChain, and Streamlit