Anandharajan committed on
Commit 3b1fd68 · 1 Parent(s): 76a9db2

Remove presentation doc from Space

Files changed (1):
  1. PRESENTATION.md +0 -96
PRESENTATION.md DELETED
@@ -1,96 +0,0 @@
# RAG LangGraph Chatbot – Research Briefing

This document summarizes the implemented project, highlights differences versus the reference guide ("RAG-Based Chatbot with LangGraph Agent Integration.pdf"), and provides detailed notes for a technical presentation to a research audience.

## 1. Concept & Scope

- Goal: A Retrieval-Augmented Generation (RAG) chatbot that ingests a PDF, builds a vector store, retrieves context, and answers via a LangGraph-orchestrated flow exposed through a Gradio UI.
- Contrast with the PDF guide: the guide is CLI-first and modular (per-stage scripts); this project consolidates the flow into a single Gradio app while still following best practices (chunking, embeddings, FAISS, LangGraph orchestration).
- Design choice: favor a minimal, reproducible web UI for demos and Hugging Face Spaces deployment rather than a set of separate CLI tools.

## 2. Key Differences vs. the Reference PDF

- **Interface**: The PDF describes CLI pipelines; this repo uses a Gradio UI with one-click ingest + chat.
- **Orchestration**: Minimal LangGraph (retrieve → generate) instead of a richer tool/agent graph.
- **Models**: Default router model `meta-llama/Meta-Llama-3-8B-Instruct` (HF router) with local fallback `distilgpt2`; the PDF mentions OpenAI/local LLM options (Ollama, vLLM) not included here.
- **Vector store**: FAISS only (auto-create/load on upload); the PDF discusses FAISS/Chroma and manual CLI scripts.
- **Ingestion**: PyPDFLoader + RecursiveCharacterTextSplitter (chunk_size=1000, chunk_overlap=200) baked into the app.
- **Deployment**: Ready for Hugging Face Spaces (Gradio SDK, space.yaml, runtime.txt); the PDF assumes local CLI execution.
- **Validation**: A single test in `tests/test_pipeline.py`; the PDF suggests broader per-stage validation scripts.

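The chunking parameters above (chunk_size=1000, chunk_overlap=200) can be illustrated with a minimal sliding-window splitter. This is a simplified stand-in, not the actual `RecursiveCharacterTextSplitter`, which additionally prefers splitting at paragraph and sentence boundaries:

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Naive fixed-stride splitter: each new chunk starts chunk_size - chunk_overlap
    characters after the previous one, so adjacent chunks share chunk_overlap characters."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# 2500 characters with stride 800 -> chunks starting at offsets 0, 800, 1600
chunks = split_text("x" * 2500)
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in the neighbouring chunk, which improves retrieval recall.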
## 3. Code Structure (files and roles)

- `app.py`: Gradio Blocks UI; handles upload, ingestion, vector store creation/loading, LangGraph agent invocation, and HF token/model inputs.
- `src/config.py`: Paths, chunking params, embedding model, default HF router/local model IDs, temperature.
- `src/ingestion.py`: PDF loading (PyPDFLoader) and chunking (RecursiveCharacterTextSplitter).
- `src/vectorstore.py`: Embeddings via `sentence-transformers/all-MiniLM-L6-v2`; FAISS create/load.
- `src/agent.py`: LangGraph graph (retrieve node → generate node); HF router call with local fallback; prompt construction; context truncation for small local models.
- `src/rag_tool.py`: Legacy retriever-tool helper (not wired into the app).
- `tests/test_pipeline.py`: Basic ingestion + FAISS save/load sanity check.
- Deployment metadata: `.gitignore`, `requirements.txt`, `runtime.txt`, `space.yaml`, README, PRESENTATION.md.

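The context truncation mentioned for `src/agent.py` can be sketched as a simple character budget applied to the retrieved chunks before prompt assembly. The function name and budget below are illustrative, not the actual implementation:

```python
def truncate_context(docs: list[str], max_chars: int = 1500) -> str:
    """Concatenate retrieved chunks, stopping once the character budget is
    exhausted, so the prompt fits a small local model's context window."""
    parts, used = [], 0
    for doc in docs:
        remaining = max_chars - used
        if remaining <= 0:
            break
        parts.append(doc[:remaining])
        used += min(len(doc), remaining)
    return "\n\n".join(parts)

# Two 1000-char chunks against a 1500-char budget: the second is cut to 500 chars.
context = truncate_context(["a" * 1000, "b" * 1000], max_chars=1500)
```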
## 4. Libraries Used (requirements.txt, definitions & rationale)

- `langchain`, `langchain-community`, `langchain-text-splitters`, `langchain-huggingface`: Retrieval, prompt building, loaders, embedding helpers.
- `langgraph`: Graph-based orchestration (stateful retrieve → generate flow).
- `gradio`: Web UI for chat and upload.
- `python-dotenv`: Load env vars for tokens/model IDs.
- `sentence-transformers`: Embedding model (`all-MiniLM-L6-v2`).
- `faiss-cpu`: Local vector index (fast similarity search).
- `pypdf`: PDF text extraction for ingestion.
- `pydantic`: Settings/data validation utilities (a dependency of the LangChain ecosystem).
- `huggingface-hub`: Model Hub interactions; router requests.
- `transformers`: HF router / local generation fallback pipeline.

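What `faiss-cpu` provides can be illustrated with a brute-force nearest-neighbour search over embedding vectors; FAISS computes the same ranking with optimized index structures. The vectors below are toy data, not real embeddings:

```python
import math

def top_k(query: list[float], index: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k index vectors most similar to the query,
    ranked by cosine similarity (highest first)."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm
    scores = [(cosine(query, vec), i) for i, vec in enumerate(index)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]

# Vector 2 points almost exactly along the query direction, vector 0 slightly less so.
hits = top_k([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0], [1.0, 0.05]], k=2)
```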
## 5. End-to-End Flow (App)

1) Upload a PDF (or reuse the existing `data/source.pdf`).
2) Ingest → chunk (1000/200) → embed (all-MiniLM-L6-v2) → save FAISS index (`data/faiss_index`).
3) Build the LangGraph agent (retrieve node, generate node).
4) Chat: user message → retrieve top documents → build prompt with context → generate via the HF router (token + model), else the local fallback → return answer.

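The retrieve → generate flow can be sketched without LangGraph as two functions chained over a shared state dict; the actual app wires the same two nodes through LangGraph's `StateGraph`. The retriever and generator below are stubs standing in for the FAISS search and the LLM call:

```python
def retrieve(state: dict) -> dict:
    # Stub retriever: in the app this is a FAISS similarity search over the PDF chunks.
    corpus = {"rag": "RAG grounds answers in retrieved context."}
    docs = [text for key, text in corpus.items() if key in state["question"].lower()]
    return {**state, "context": "\n".join(docs)}

def generate(state: dict) -> dict:
    # Stub generator: in the app this is the HF router call (or the local fallback).
    answer = f"Based on the context: {state['context']}" if state["context"] else "No context found."
    return {**state, "answer": answer}

state = {"question": "What is RAG?"}
for node in (retrieve, generate):  # linear graph: retrieve -> generate
    state = node(state)
```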
## 6. CLI Notes (for parity with the PDF guide)

While the app is UI-first, the equivalent stages can be executed manually:

```bash
# (Optional) create a virtualenv and install dependencies
python -m venv .venv && source .venv/bin/activate   # on Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Ingest manually (scripted example)
python - <<'PY'
from src.ingestion import ingest_file
from src.vectorstore import create_vectorstore
from src.config import PDF_PATH

chunks = ingest_file(str(PDF_PATH))
_ = create_vectorstore(chunks)
print("Chunks:", len(chunks))
PY

# Run the app
python app.py
```

## 7. Feasibility & Trade-offs

- **Router-first**: Best quality requires an HF token + router model; the local fallback is lightweight and will be weaker. Set `LOCAL_MODEL_ID` to a stronger local model if needed (ensure it fits available resources).
- **Minimal LangGraph**: Only two nodes (retrieve, generate). Extensible with tools/checkpoints, but kept lean for deployment stability.
- **Resource footprint**: The embedding model is small; FAISS is local and fast; the defaults avoid GPU requirements for embeddings, but generation quality depends on the chosen LLM.
- **HF Spaces**: Configured with `space.yaml`/`runtime.txt` (Python 3.10) and Gradio SDK 4.44.1; build tested successfully.

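The env-var overrides can be read with a defaults-plus-override pattern; the default IDs below mirror the ones stated earlier, while the helper name and the override value are illustrative:

```python
import os

os.environ.pop("HF_MODEL_ID", None)  # clean slate for the example

def resolve_model_ids() -> tuple[str, str]:
    """Read model IDs from the environment, falling back to the documented defaults."""
    router_id = os.environ.get("HF_MODEL_ID", "meta-llama/Meta-Llama-3-8B-Instruct")
    local_id = os.environ.get("LOCAL_MODEL_ID", "distilgpt2")
    return router_id, local_id

os.environ["LOCAL_MODEL_ID"] = "my-org/stronger-local-model"  # hypothetical override
router_id, local_id = resolve_model_ids()
```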
## 8. Best Practices Applied

- Clear defaults with env overrides (`HF_MODEL_ID`, `LOCAL_MODEL_ID`, `HUGGINGFACEHUB_API_TOKEN`).
- Safe fallback: router errors degrade to the local model with a visible note.
- Prompt truncation for local models to avoid context-overrun errors.
- Persisted vector store in `data/faiss_index`; `data/` is git-ignored; `DATA_DIR` is created on startup.
- Deployment metadata for HF Spaces; `.gitignore` to keep the repo clean.
- Basic test coverage for ingestion/index creation.
- Minimal, readable LangGraph wiring for transparency.

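The safe-fallback behaviour can be sketched as a try/except around the router call, with the degradation noted in the returned text. Both generator functions below are stubs, not the app's actual API calls:

```python
def router_generate(prompt: str) -> str:
    # Stub for the HF router call; raise to simulate a router failure.
    raise ConnectionError("router unavailable")

def local_generate(prompt: str) -> str:
    # Stub for the local transformers pipeline.
    return f"[local] answer to: {prompt}"

def generate_with_fallback(prompt: str) -> str:
    """Try the router first; on any error, degrade to the local model with a visible note."""
    try:
        return router_generate(prompt)
    except Exception as exc:
        return f"(router failed: {exc}; using local fallback) {local_generate(prompt)}"

reply = generate_with_fallback("What is RAG?")
```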
## 9. Recommendations (Future Enhancements)

- Add richer LangGraph tools (citation, summarization, multi-hop retrieval).
- Plug in stronger local models (e.g., via Ollama/vLLM) with gated selection.
- Expose per-turn retrieval metadata in the UI (sources/attribution).
- Add per-stage CLI scripts (extract/chunk/embed/query) to mirror the PDF guide.
- Add an evaluation harness (retrieval precision, answer faithfulness).

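The retrieval-precision metric such a harness would compute can be sketched in a few lines; the document IDs below are toy data:

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved document IDs that are actually relevant."""
    top = retrieved[:k]
    return sum(1 for doc_id in top if doc_id in relevant) / k

# Two of the four retrieved IDs are relevant -> precision@4 = 0.5
p = precision_at_k(["d1", "d7", "d3", "d9"], relevant={"d1", "d3"}, k=4)
```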
## 10. Quick Presentation Outline

1) Motivation: RAG for grounded answers; LangGraph for an explicit flow; Gradio for usability.
2) Architecture: Ingest → Embed → FAISS → LangGraph (retrieve → generate) → HF router/local.
3) Implementation highlights: `ingestion.py`, `vectorstore.py`, `agent.py`, `app.py`.
4) Deployment: HF Spaces config; env vars; defaults and fallbacks.
5) Gaps vs. the guide: UI-first vs. CLI-first; minimal graph; single vector store; limited local LLM support.
6) Next steps: stronger local models, richer tools, evaluation, attribution in the UI.