---
title: RAG Chatbot for Agentic AI eBook
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: streamlit_app/app.py
pinned: false
---
# RAG Chatbot for Agentic AI eBook

A Retrieval-Augmented Generation (RAG) chatbot that answers questions strictly from the supplied Agentic AI eBook PDF. Built as an AI Engineer Internship assignment.
## Table of Contents
- Features
- Quick Start
- Setup
- Running the Application
- Deploying to Hugging Face Spaces
- Sample Queries
- How I Solved This
- Project Structure
- API Keys Required
## Features

- **PDF Ingestion**: Extract, clean, chunk, and embed PDF content
- **Semantic Search**: Uses sentence-transformers for accurate retrieval
- **Grounded Answers**: Responses are strictly based on retrieved chunks (no hallucination)
- **Confidence Scores**: Shows similarity-based confidence (0.0-1.0)
- **Dual Mode**: LLM generation (with OpenAI) or extractive fallback (always works)
- **Web UI**: Clean Streamlit interface with chunk visualization
- **Deployable**: Ready for Hugging Face Spaces
## Quick Start

```bash
# 1. Clone the repository
git clone <your-repo-url>
cd rag-eAgenticAI

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set environment variables
export PINECONE_API_KEY="your-pinecone-key"
# Optional: export OPENAI_API_KEY="your-openai-key"

# 5. Add your PDF
mkdir data
# Place Ebook-Agentic-AI.pdf in the data/ folder

# 6. Run ingestion
python app/ingest.py --pdf ./data/Ebook-Agentic-AI.pdf --index agentic-ai-ebook

# 7. Start the app
streamlit run streamlit_app/app.py
```
## Setup

### Prerequisites

- Python 3.9 or higher
- pip (Python package manager)
- Pinecone account (free tier works)
- Optional: OpenAI API key for LLM-powered answers

### Installation

- Create and activate a virtual environment:

```bash
python -m venv venv

# Windows
venv\Scripts\activate

# macOS/Linux
source venv/bin/activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

**Note for CPU-only machines:** the default torch installation includes CUDA. For a smaller download:

```bash
pip install torch --index-url https://download.pytorch.org/whl/cpu
```

- Set environment variables:

Create a `.env` file in the project root:

```
PINECONE_API_KEY=your-pinecone-api-key-here
PINECONE_INDEX=agentic-ai-ebook
OPENAI_API_KEY=your-openai-key-here  # Optional
```

Or set them directly in your shell:

```bash
# Windows PowerShell
$env:PINECONE_API_KEY="your-key"
$env:OPENAI_API_KEY="your-key"

# macOS/Linux
export PINECONE_API_KEY="your-key"
export OPENAI_API_KEY="your-key"
```
## Running the Application

### Step 1: Ingest the PDF

Place your Ebook-Agentic-AI.pdf file in the data/ folder, then run:

```bash
# With Pinecone (recommended)
python app/ingest.py --pdf ./data/Ebook-Agentic-AI.pdf --index agentic-ai-ebook

# Local-only mode (no Pinecone needed)
python app/ingest.py --pdf ./data/Ebook-Agentic-AI.pdf --local-only
```

Ingestion options:

| Flag | Description | Default |
|---|---|---|
| `--pdf` | Path to PDF file | Required |
| `--index` | Pinecone index name | `agentic-ai-ebook` |
| `--namespace` | Pinecone namespace | `agentic-ai` |
| `--chunk-size` | Tokens per chunk | 500 |
| `--overlap` | Chunk overlap in tokens | 50 |
| `--local-only` | Skip Pinecone, save locally | False |
| `--output-dir` | Output directory | `./data` |

### Step 2: Run the Streamlit App

```bash
streamlit run streamlit_app/app.py
```

The app will open in your browser at http://localhost:8501.
### Step 3: Configure in the UI

- Enter your Pinecone API key in the sidebar (if not set via env var)
- Optionally add an OpenAI API key for LLM-powered answers
- Adjust retrieval settings (`top_k`, etc.)
- Click "Initialize Pipeline"
- Start asking questions!
## Deploying to Hugging Face Spaces

### Method 1: Git-based Deployment

1. Create a new Space on huggingface.co/spaces:
   - Select Streamlit as the SDK
   - Choose a name for your Space

2. Clone and push:

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   # Copy all files from this repo
   git add .
   git commit -m "Initial deployment"
   git push
   ```

3. Set secrets in Space Settings → Repository secrets:
   - `PINECONE_API_KEY`: Your Pinecone key
   - `PINECONE_INDEX`: Your index name
   - `OPENAI_API_KEY`: (Optional) Your OpenAI key
**Important:** Ensure your `README.md` has the HF Spaces header:

```yaml
---
title: Agentic AI eBook Chatbot
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: "1.28.0"
app_file: streamlit_app/app.py
pinned: false
---
```
### Method 2: Manual Upload
- Create a new Streamlit Space on Hugging Face
- Upload all files via the web interface
- Set secrets in Space Settings
Reference: Hugging Face Spaces - Streamlit Docs
## Sample Queries
Test the chatbot with these example questions:
| # | Query | Expected Retrieval |
|---|---|---|
| 1 | "What is the definition of 'agentic AI' described in the eBook?" | Pages discussing agentic AI definition |
| 2 | "List the three risks of agentic systems the eBook mentions." | Pages about risks/challenges |
| 3 | "What are the recommended safeguards for deploying agentic AI?" | Pages about safeguards/best practices |
| 4 | "How does the eBook distinguish between autonomous agents and traditional automation?" | Comparison sections |
| 5 | "What future research directions does the eBook propose?" | Conclusion/future work pages |
| 6 | "Summarize the eBook's conclusion in one paragraph." | Conclusion chapter |
### Expected Response Format

```json
{
  "final_answer": "According to the eBook, agentic AI is defined as...",
  "retrieved_chunks": [
    {
      "id": "pdfpage_12_chunk_0",
      "page": 12,
      "text": "Agentic AI represents a paradigm shift...",
      "score": 0.92
    }
  ],
  "confidence": 0.92
}
```
## How I Solved This

### Chunking Strategy
I chose a 500-token chunk size with 50-token overlap for several reasons:
- 500 tokens is large enough to capture meaningful context
- Overlap ensures information at chunk boundaries isn't lost
- Token-based chunking (via `tiktoken`) is more consistent than character-based splitting

The chunk ID format `pdfpage_{page}_chunk_{index}` makes it easy to trace answers back to source pages for verification.
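The sliding-window logic behind this strategy can be sketched as follows. This is a minimal illustration only: whitespace tokens stand in for `tiktoken`'s BPE tokens, so counts will differ from the real pipeline, but the overlap handling and ID scheme match the description above.

```python
def chunk_page(text: str, page: int, chunk_size: int = 500, overlap: int = 50) -> list[dict]:
    """Split one page's text into overlapping chunks with traceable IDs.

    Sketch only: whitespace tokens stand in for tiktoken's BPE tokens.
    """
    tokens = text.split()
    step = chunk_size - overlap  # slide forward by chunk_size - overlap tokens
    chunks = []
    for index, start in enumerate(range(0, max(len(tokens), 1), step)):
        window = tokens[start:start + chunk_size]
        if not window:
            break
        chunks.append({
            "id": f"pdfpage_{page}_chunk_{index}",  # traceable back to the source page
            "page": page,
            "text": " ".join(window),
        })
        if start + chunk_size >= len(tokens):
            break  # this window already covers the tail of the page
    return chunks
```

With the defaults, consecutive chunks share 50 tokens, so text cut at a boundary still appears whole in one of the two neighboring chunks.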
### Embedding Choice

I used `sentence-transformers/all-MiniLM-L6-v2` because:
- It's completely free (no API costs)
- Works offline on CPU
- 384-dimension vectors are efficient for storage
- Quality is good enough for document retrieval
Trade-off: OpenAI's ada-002 would give better quality, but MiniLM keeps the project accessible without paid APIs.
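Retrieval ranks chunks by cosine similarity between the query embedding and each chunk embedding, and the top similarity doubles as the confidence score. A minimal stdlib sketch of that scoring step (real vectors are 384-dimensional MiniLM outputs; the tiny vectors here are illustrative only):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (e.g. 384-dim MiniLM embeddings)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dim vectors standing in for real embeddings:
query_vec = [1.0, 0.0, 0.0]
chunk_vecs = {
    "pdfpage_12_chunk_0": [0.9, 0.1, 0.0],  # close to the query
    "pdfpage_3_chunk_1": [0.0, 1.0, 0.0],   # orthogonal to the query
}

# Rank chunks by similarity; the top score becomes the answer's confidence.
ranked = sorted(
    ((cosine_similarity(query_vec, v), cid) for cid, v in chunk_vecs.items()),
    reverse=True,
)
```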
### Extractive Fallback
The extractive mode exists because:
- Not everyone has OpenAI API access
- It ensures the app always works, even offline
- Graders can test the core RAG functionality without API costs
- It demonstrates that the retrieval pipeline works correctly
When no LLM key is provided, the system returns the most relevant chunks directly with minimal formatting. This is honest about what it's doing and still provides useful answers.
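The fallback can be sketched as a "quote the top chunks verbatim" step. The field names below follow the expected response format shown earlier; the exact formatting of the real implementation may differ.

```python
def extractive_answer(retrieved_chunks: list[dict], top_k: int = 3) -> dict:
    """Build an answer by quoting the best-scoring chunks directly (no LLM).

    Each chunk is a dict with "id", "page", "text", and "score", matching
    the expected response format.
    """
    best = sorted(retrieved_chunks, key=lambda c: c["score"], reverse=True)[:top_k]
    answer = "\n\n".join(f'[page {c["page"]}] {c["text"]}' for c in best)
    confidence = best[0]["score"] if best else 0.0  # top similarity as confidence
    return {
        "final_answer": answer,
        "retrieved_chunks": best,
        "confidence": confidence,
    }
```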
### Grounding Enforcement

To prevent hallucination, the LLM system prompt explicitly instructs:

> "Use only the text between markers. Do not invent facts. If the answer isn't in the excerpts, say 'I could not find a supported answer in the document.'"

This keeps the model honest about its knowledge boundaries.
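Assembling such a grounded prompt is straightforward. A hypothetical sketch follows; the marker strings and layout are illustrative, not the project's exact prompt:

```python
SYSTEM_PROMPT = (
    "Use only the text between markers. Do not invent facts. "
    "If the answer isn't in the excerpts, say "
    "'I could not find a supported answer in the document.'"
)

def build_prompt(question: str, chunks: list[dict]) -> str:
    """Wrap retrieved chunks in explicit markers so the model can only
    cite grounded text. Marker strings here are illustrative."""
    excerpts = "\n".join(
        f'<<<EXCERPT page={c["page"]}>>>\n{c["text"]}\n<<<END>>>' for c in chunks
    )
    return f"{SYSTEM_PROMPT}\n\n{excerpts}\n\nQuestion: {question}"
```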
## Project Structure

```
rag-eAgenticAI/
├── app/
│   ├── __init__.py        # Package initialization
│   ├── ingest.py          # PDF ingestion pipeline
│   ├── vectorstore.py     # Pinecone wrapper
│   ├── rag_pipeline.py    # LangGraph RAG pipeline
│   └── utils.py           # Helper functions
│
├── streamlit_app/
│   ├── app.py             # Streamlit UI
│   └── assets/            # Static files
│
├── samples/
│   ├── sample_queries.txt     # Test questions
│   └── expected_responses.md  # Expected output format
│
├── infra/
│   └── hf_space_readme_template.md
│
├── data/                  # PDF and chunks (gitignored)
│
├── README.md              # This file
├── architecture.md        # Architecture docs
├── requirements.txt       # Dependencies
├── quick_test.py          # Validation script
├── LICENSE                # MIT License
└── .gitignore
```
## API Keys Required

| Service | Required | How to Get | Purpose |
|---|---|---|---|
| Pinecone | Yes* | pinecone.io (free tier) | Vector storage & retrieval |
| OpenAI | No | platform.openai.com | LLM answer generation |

\*You can run in `--local-only` mode without Pinecone for testing.
### Getting a Pinecone API Key

- Create an account at pinecone.io
- Go to API Keys in the console
- Create a new key
- Copy it and set it as `PINECONE_API_KEY`
### Getting an OpenAI API Key (Optional)

- Create an account at platform.openai.com
- Go to API Keys
- Create a new secret key
- Copy it and set it as `OPENAI_API_KEY`
## Testing

Run the quick test script to verify everything works:

```bash
python quick_test.py
```
This will:
- Test utility functions (chunking, scoring)
- Test the RAG pipeline with a sample query
- Print the response in the expected JSON format
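The same shape check can be run by hand on any saved response. A sketch of such a validator (the function name is hypothetical; the expected keys follow the response format shown in Sample Queries):

```python
def validate_response(resp: dict) -> None:
    """Assert a pipeline response matches the expected JSON shape."""
    assert set(resp) == {"final_answer", "retrieved_chunks", "confidence"}
    assert isinstance(resp["final_answer"], str)
    assert 0.0 <= resp["confidence"] <= 1.0
    for chunk in resp["retrieved_chunks"]:
        assert set(chunk) == {"id", "page", "text", "score"}
        # IDs follow the pdfpage_{page}_chunk_{index} scheme
        assert chunk["id"].startswith(f'pdfpage_{chunk["page"]}_chunk_')
```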
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Acknowledgments
- LangGraph for RAG orchestration patterns
- Pinecone for vector database
- Sentence-Transformers for embeddings
- Streamlit for the web framework
- Hugging Face for hosting
*Built for the AI Engineer Intern Assignment. Answers strictly grounded in the Agentic AI eBook.*