Commit 3b1fd68 (parent 76a9db2): Remove presentation doc from Space. Deleted file: PRESENTATION.md (+0, -96). The removed document follows.

# RAG LangGraph Chatbot – Research Briefing

This document summarizes the implemented project, highlights differences versus the reference guide (“RAG-Based Chatbot with LangGraph Agent Integration.pdf”), and provides detailed notes for a technical presentation to a research audience.

## 1. Concept & Scope

- Goal: A Retrieval-Augmented Generation (RAG) chatbot that ingests a PDF, builds a vector store, retrieves context, and answers via a LangGraph-orchestrated flow exposed through a Gradio UI.
- Contrast with the PDF guide: The guide is CLI-first and modular (per-stage scripts); this project consolidates the flow into a single Gradio app while still following best practices (chunking, embeddings, FAISS, LangGraph orchestration).
- Design choice: Favor a minimal, reproducible web UI for demos and HF Spaces deployment rather than a set of separate CLI tools.

## 2. Key Differences vs. the Reference PDF

- **Interface**: The PDF describes CLI pipelines; this repo uses a Gradio UI with one-click ingest + chat.
- **Orchestration**: Minimal LangGraph (retrieve → generate) instead of a richer tool/agent graph.
- **Models**: Default router model `meta-llama/Meta-Llama-3-8B-Instruct` (HF router) with local fallback `distilgpt2`; the PDF mentions OpenAI/local LLM options (Ollama, vLLM) not included here.
- **Vector store**: FAISS only (auto-create/load on upload); the PDF discusses FAISS/Chroma and manual CLI scripts.
- **Ingestion**: PyPDFLoader + RecursiveCharacterTextSplitter (chunk_size=1000, overlap=200) baked into the app.
- **Deployment**: Ready for Hugging Face Spaces (Gradio SDK, space.yaml, runtime.txt); the PDF assumes local CLI execution.
- **Validation**: A simple test in `tests/test_pipeline.py`; the PDF suggests broader per-stage validation scripts.

## 3. Code Structure (files and roles)

- `app.py`: Gradio Blocks UI; handles upload, ingestion, vector store creation/loading, LangGraph agent invocation, and HF token/model inputs.
- `src/config.py`: Paths, chunking params, embedding model, default HF router/local model IDs, temperature.
- `src/ingestion.py`: PDF loading (PyPDFLoader) and chunking (RecursiveCharacterTextSplitter).
- `src/vectorstore.py`: Embeddings via sentence-transformers/all-MiniLM-L6-v2; FAISS create/load.
- `src/agent.py`: LangGraph graph (retrieve node → generate node); HF router call with local fallback; prompt construction; context truncation for small local models.
- `src/rag_tool.py`: Legacy retriever tool helper (not wired into the app).
- `tests/test_pipeline.py`: Basic ingestion + FAISS save/load sanity check.
- Deployment metadata: `.gitignore`, `requirements.txt`, `runtime.txt`, `space.yaml`, README, PRESENTATION.md.
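
The chunking behaviour behind `src/ingestion.py` (chunk_size=1000, overlap=200) can be approximated with a dependency-free sketch. Note this is an illustrative stand-in, not the actual RecursiveCharacterTextSplitter, which additionally splits on semantic separators (paragraphs, sentences) before falling back to a fixed window:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive sliding-window splitter approximating the 1000/200 settings.

    Illustrative only: the real splitter prefers paragraph/sentence
    boundaries; this sketch shows just the overlap mechanics.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # 800 chars of new text per chunk
    chunks = []
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + chunk_size])
    return chunks


chunks = split_text("a" * 2500)
```

With the defaults, a 2,500-character text yields three chunks (1000, 1000, and 900 characters), each sharing 200 characters with its predecessor.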

## 4. Libraries Used (requirements.txt: definitions & rationale)

- `langchain`, `langchain-community`, `langchain-text-splitters`, `langchain-huggingface`: Retrieval, prompt building, loaders, and embeddings helpers.
- `langgraph`: Graph-based orchestration (stateful retrieve → generate flow).
- `gradio`: Web UI for chat and upload.
- `python-dotenv`: Loads env vars for tokens/model IDs.
- `sentence-transformers`: Embedding model (all-MiniLM-L6-v2).
- `faiss-cpu`: Local vector index (fast similarity search).
- `pypdf`: PDF extraction for ingestion.
- `pydantic`: Settings/data validation utilities (a dependency of the LangChain ecosystem).
- `huggingface-hub`: Model hub interactions; router requests.
- `transformers`: HF router/local generation fallback pipeline.

## 5. End-to-End Flow (App)

1) Upload a PDF (or reuse existing `data/source.pdf`).
2) Ingest → chunk (1000/200) → embed (all-MiniLM-L6-v2) → save FAISS index (`data/faiss_index`).
3) Build the LangGraph agent (retriever node, generator node).
4) Chat: user message → retrieve top docs → prompt with context → generate via HF router (token + model), else local fallback → return answer.

## 6. CLI Notes (for parity with the PDF guide)

While the app is UI-first, equivalent stages can be executed manually:

```bash
# (Optional) venv, install
python -m venv .venv && .venv/Scripts/activate  # Windows; use .venv/bin/activate on *nix
pip install -r requirements.txt

# Ingest manually (scripted example)
python - <<'PY'
from src.ingestion import ingest_file
from src.vectorstore import create_vectorstore
from src.config import PDF_PATH

chunks = ingest_file(str(PDF_PATH))
_ = create_vectorstore(chunks)
print("Chunks:", len(chunks))
PY

# Run app
python app.py
```

## 7. Feasibility & Trade-offs

- **Router-first**: Best quality requires an HF token + router model; the local fallback is lightweight and will be weaker. Set `LOCAL_MODEL_ID` to a stronger local model if needed (ensure it fits available resources).
- **Minimal LangGraph**: Only two nodes (retrieve, generate). Extensible with tools and checkpoints, but kept lean for deployment stability.
- **Resource footprint**: The embedding model is small; FAISS is local and fast; the defaults avoid GPU requirements for embeddings, but generation quality depends on the chosen LLM.
- **HF Spaces**: Configured with `space.yaml`/`runtime.txt` (Python 3.10) and Gradio SDK 4.44.1; the build has been tested successfully.
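
What FAISS does for the app, conceptually, is top-k nearest-neighbour lookup over the MiniLM embedding vectors. A dependency-free toy version (hypothetical helper names, tiny 2-d vectors instead of real embeddings) makes the idea concrete:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k chunks most similar to the query vector."""
    ranked = sorted(index, key=lambda cid: cosine(query, index[cid]), reverse=True)
    return ranked[:k]


index = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.7, 0.7],
    "chunk-c": [0.0, 1.0],
}
hits = top_k([1.0, 0.1], index)  # nearest chunks to the query direction
```

FAISS performs the same ranking with optimized index structures, which is why the local, CPU-only default stays fast at this project's scale.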

## 8. Best Practices Applied

- Clear defaults with env overrides (`HF_MODEL_ID`, `LOCAL_MODEL_ID`, `HUGGINGFACEHUB_API_TOKEN`).
- Safe fallback: router errors degrade to the local model with a visible note.
- Prompt truncation for local models to avoid context-overrun errors.
- Persisted vector store in `data/faiss_index`; `data/` is git-ignored; `DATA_DIR` is created on startup.
- Deployment metadata for HF Spaces; `.gitignore` keeps the repo clean.
- Basic test coverage for ingestion/index creation.
- Minimal, readable LangGraph wiring for transparency.
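
The env-override pattern from the first bullet, as implemented in `src/config.py`, looks roughly like this (the variable names come from this document; the exact structure of the repo's config module is assumed):

```python
import os

# Defaults match the models named in this document; each can be
# overridden via the corresponding environment variable.
HF_MODEL_ID = os.getenv("HF_MODEL_ID", "meta-llama/Meta-Llama-3-8B-Instruct")
LOCAL_MODEL_ID = os.getenv("LOCAL_MODEL_ID", "distilgpt2")

# No token means the app skips the router and uses the local fallback.
HF_TOKEN = os.getenv("HUGGINGFACEHUB_API_TOKEN")
USE_ROUTER = HF_TOKEN is not None
```

Keeping every tunable behind `os.getenv` with a sensible default is what lets the same code run unmodified both locally and on HF Spaces, where configuration arrives as Space secrets/variables.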

## 9. Recommendations (Future Enhancements)

- Add richer LangGraph tools (citation, summarization, multi-hop retrieval).
- Plug in stronger local models (e.g., via Ollama/vLLM) with gated selection.
- Expose per-turn retrieval metadata in the UI (sources/attribution).
- Add per-stage CLI scripts (extract/chunk/embed/query) to mirror the PDF guide.
- Add an evaluation harness (retrieval precision, answer faithfulness).

## 10. Quick Presentation Outline

1) Motivation: RAG for grounded answers; LangGraph for explicit flow; Gradio for usability.
2) Architecture: Ingest → Embed → FAISS → LangGraph (retrieve → generate) → HF router/local.
3) Implementation highlights: `ingestion.py`, `vectorstore.py`, `agent.py`, `app.py`.
4) Deployment: HF Spaces config; env vars; defaults and fallbacks.
5) Gaps vs. the guide: UI-first vs. CLI-first; minimal graph; single vector store; limited local LLM support.
6) Next steps: stronger local models, richer tools, evaluation, attribution in the UI.