---
title: Agentic RAG (Agentic DP + AIMA + MCP)
author: O.O
sdk: streamlit
app_file: app.py
---
# Agentic RAG (FAISS + SentenceTransformers + Hugging Face LLM)

A Streamlit UI that answers questions over a local RAG corpus with a retrieval-only baseline. It indexes chunk files with FAISS and retrieves across:

- Agentic Design Patterns (doc_id: `agentic_design_patterns`)
- AIMA (doc_id: `aima`)
- MCP markdowns (doc_id prefix: `mcp::`)
- Articles (doc_id prefix: `article::`)
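These doc_id conventions make it easy to narrow retrieved hits to a single corpus. A minimal sketch of such a filter (the hit dicts here are illustrative; the app's internal structures may differ):

```python
# Narrow retrieved hits to one corpus using the doc_id conventions above.
# The hit dicts are toy stand-ins, not the app's internal types.
hits = [
    {"doc_id": "aima", "score": 0.81},
    {"doc_id": "mcp::tools", "score": 0.77},
    {"doc_id": "article::rag-survey", "score": 0.74},
]

def from_corpus(hits, prefix):
    """Keep hits whose doc_id equals the given id or carries the given prefix."""
    return [h for h in hits if h["doc_id"] == prefix or h["doc_id"].startswith(prefix)]

print([h["doc_id"] for h in from_corpus(hits, "mcp::")])  # ['mcp::tools']
```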
## Quick start (local)

### 0) Prerequisites

- Python 3.11+
- A Hugging Face access token (for the hosted LLM)
- Network access for article ingestion and MCP refresh scripts
### 1) Create venv and install deps

macOS:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Windows (PowerShell):

```powershell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
```

Windows (CMD):

```bat
python -m venv .venv
.\.venv\Scripts\activate.bat
pip install -r requirements.txt
```
### 2) Configure Hugging Face model access

Set these environment variables (local dev or Hugging Face Spaces secrets):

```bash
export HF_TOKEN=hf_your_token_here
export RAG_HF_MODEL=Qwen/Qwen2.5-7B-Instruct-1M:featherless-ai
export RAG_HF_PROVIDER_SUFFIX=featherless-ai
export RAG_LLM_BACKEND=hf-router
```

Optional: set `RAG_HF_PROVIDER_SUFFIX` if your model id is missing the provider suffix.
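One plausible way the two variables could combine is suffix-appending when the model id lacks one. A hedged sketch, assuming a hypothetical helper named `resolve_model_id` (the app's actual resolution logic may differ):

```python
import os

def resolve_model_id() -> str:
    """Append RAG_HF_PROVIDER_SUFFIX to RAG_HF_MODEL when it is missing.

    Hypothetical helper illustrating how the two variables could combine;
    it is not the app's actual implementation.
    """
    model = os.environ["RAG_HF_MODEL"]
    suffix = os.environ.get("RAG_HF_PROVIDER_SUFFIX", "")
    if suffix and not model.endswith(f":{suffix}"):
        model = f"{model}:{suffix}"
    return model

os.environ["RAG_HF_MODEL"] = "Qwen/Qwen2.5-7B-Instruct-1M"
os.environ["RAG_HF_PROVIDER_SUFFIX"] = "featherless-ai"
print(resolve_model_id())  # Qwen/Qwen2.5-7B-Instruct-1M:featherless-ai
```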
### 3) Prepare sources

- Books: drop PDFs into `data/raw_pdfs/` and add entries to `sources.json`
- Articles: edit `sources_articles.json` (list of `{id, type, url, publisher}`)
- MCP docs (optional): `bash scripts/refresh_mcp.sh` (downloads the latest snapshot)
### 4) Build datasets

Recommended one-command rebuild:

```bash
make rebuild
```

Outputs to `data/normalized/`:

- `chunks_books.jsonl` + `manifest_books.json`
- `chunks_articles.jsonl` + `manifest_articles.json`
- `chunks.jsonl` + `manifest.json` (merged)
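The chunk files are JSONL: one JSON object per line. A minimal reader sketch (the field names `doc_id` and `text` are assumptions for illustration, not the files' documented schema):

```python
import json
from io import StringIO

# Hypothetical two-line sample in the JSONL chunk format; the real field
# names in chunks.jsonl may differ (doc_id and text are assumptions here).
sample = StringIO(
    '{"doc_id": "aima", "text": "Agents perceive and act."}\n'
    '{"doc_id": "mcp::tools", "text": "Tools expose callable actions."}\n'
)

chunks = [json.loads(line) for line in sample if line.strip()]
print(len(chunks), chunks[0]["doc_id"])  # 2 aima
```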
### 5) Run the app

```bash
streamlit run app.py
```

Open http://localhost:8501. On first run, the app builds FAISS indexes:

- `data/cache/index_books.faiss` (local)
- `data/cache/index_articles.faiss` (local)
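Conceptually, an inner-product FAISS index over L2-normalized embeddings is a cosine-similarity search, which plain numpy can reproduce. A toy sketch of what the first-run build amounts to (random 8-dim vectors stand in for real `sentence-transformers/all-MiniLM-L6-v2` output, which is 384-dim):

```python
import numpy as np

# Toy stand-in for the embedded corpus: 5 chunks, 8 dims, unit-normalized.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(5, 8)).astype("float32")
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

# A query very close to chunk 3 (small perturbation, then re-normalized).
query = corpus[3] + 0.01 * rng.normal(size=8)
query /= np.linalg.norm(query)

scores = corpus @ query          # inner product == cosine similarity here
top_k = np.argsort(-scores)[:2]  # indices of the 2 closest chunks
print(int(top_k[0]))             # 3: the chunk we perturbed
```

The real app delegates this search to FAISS for speed, but the ranking math is the same.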
## Configuration

You can override defaults via environment variables:

```bash
export RAG_BOOK_CHUNKS_PATH=data/normalized/chunks_books.jsonl
export RAG_ARTICLE_CHUNKS_PATH=data/normalized/chunks_articles.jsonl
export RAG_BOOK_INDEX_PATH=data/cache/index_books.faiss
export RAG_ARTICLE_INDEX_PATH=data/cache/index_articles.faiss
export RAG_BOOK_MANIFEST_PATH=data/normalized/manifest_books.json
export RAG_ARTICLE_MANIFEST_PATH=data/normalized/manifest_articles.json
export RAG_EMBED_MODEL=sentence-transformers/all-MiniLM-L6-v2
export HF_TOKEN=hf_your_token_here
export RAG_HF_MODEL=Qwen/Qwen2.5-7B-Instruct-1M:featherless-ai
export RAG_HF_PROVIDER_SUFFIX=featherless-ai
export RAG_LLM_BACKEND=hf-router
export RAG_MAX_CONTEXT_TOKENS=6000
export RAG_INJECT_MAX_CHUNKS=6
export RAG_MAX_GENERATION_TOKENS=512
export RAG_RETRIEVE_TOPK_MULT=2
export RAG_OUT_DIR=data/normalized
export RAG_ARTICLE_SOURCES=sources_articles.json
```
## Deploy to Hugging Face Spaces

- Create a new Space (Streamlit SDK) and push this repo.
- Enable Persistent Storage and set caches: `HF_HOME=/data/.huggingface` and `SENTENCE_TRANSFORMERS_HOME=/data/.sentence-transformers`
- In Space Settings → Secrets, set `HF_TOKEN` (required) and optionally `GITHUB_TOKEN`.
- In Space Settings → Variables, set `RAG_HF_MODEL` and `RAG_LLM_BACKEND=hf-router`.
- Optional: `RAG_HF_PROVIDER_SUFFIX`, `RAG_INJECT_MAX_CHUNKS`, and `RAG_RETRIEVE_TOPK_MULT`.

With persistent storage enabled, FAISS indexes are stored in `/data/rag_cache` and reused across restarts. They rebuild only when the normalized chunk/manifest files change.
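One common way to implement rebuild-on-change is a content digest stored next to the index. A hedged sketch (the app may key on manifests or mtimes instead; `file_digest` and `index_is_stale` are hypothetical names):

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path: Path) -> str:
    """SHA-256 of a file's bytes, used as a cheap change detector."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def index_is_stale(chunks: Path, stamp: Path) -> bool:
    """Rebuild only when the chunk file's digest differs from the stored stamp."""
    return not stamp.exists() or stamp.read_text() != file_digest(chunks)

with tempfile.TemporaryDirectory() as d:
    chunks = Path(d) / "chunks.jsonl"
    stamp = Path(d) / "chunks.sha256"
    chunks.write_text('{"doc_id": "aima", "text": "..."}\n')

    print(index_is_stale(chunks, stamp))   # True: no stamp recorded yet
    stamp.write_text(file_digest(chunks))  # record digest after (re)build
    print(index_is_stale(chunks, stamp))   # False: chunks unchanged
```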
## Common maintenance tasks

### Add new books (PDFs)

- Add PDFs to `data/raw_pdfs/`
- Update `sources.json`
- Run `make rebuild`
- (Optional) `make clean-index`, then `streamlit run app.py`

### Add new articles

- Update `sources_articles.json`
- Run `make rebuild`
- (Optional) `make clean-index`, then `streamlit run app.py`

### Rebuild indexes only

```bash
make clean-index
```
## Scripts and commands reference

- `app.py` - Streamlit UI; loads chunk files and builds/loads FAISS indexes.
- `scripts/normalize_all.py` - Parse PDFs and MCP markdowns into `chunks_books.jsonl` and `manifest_books.json`.
- `scripts/ingest_articles.py` - Fetch URLs from `sources_articles.json` and write `chunks_articles.jsonl` and `manifest_articles.json` plus `articles_ingest_report.json`.
- `scripts/merge_chunks.py` - Merge multiple chunk files and manifests; emits `chunks.jsonl`, `manifest.json`, and `merge_report.json`.
- `scripts/rebuild_all.sh` - Run normalize, ingest, and merge in order (same as `make rebuild`).
- `scripts/refresh_mcp.sh` - Download `llms-full.txt` and regenerate MCP markdowns in `mcp/`.
- `scripts/split_mcp.py` - Split a single MCP snapshot text file into topic markdown files.
- `refresh_mcp.sh` - Convenience wrapper for `scripts/refresh_mcp.sh`.
- `normalize_all.py`, `ingest_articles.py`, `merge_chunks.py`, `split_mcp.py` - Convenience wrappers for the `scripts/` versions.
- `Makefile` - `make install`, `make rebuild`, `make clean-index`, `make run`.
- `build_kb.py` - Legacy entry point referencing a removed `src/` package; not used by the current app.
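The merge step can be pictured as concatenating per-source chunk lists while dropping duplicates. A toy sketch only (field names and the dedupe key are assumptions, not `scripts/merge_chunks.py`'s actual spec):

```python
# Toy sketch of a merge-with-dedupe pass over per-source chunk records.
# doc_id/chunk as the dedupe key is an assumption for illustration.
books = [{"doc_id": "aima", "chunk": 0, "text": "Agents perceive and act."}]
articles = [
    {"doc_id": "article::a1", "chunk": 0, "text": "RAG combines retrieval..."},
    {"doc_id": "article::a1", "chunk": 0, "text": "RAG combines retrieval..."},
]

seen, merged = set(), []
for record in books + articles:
    key = (record["doc_id"], record["chunk"])
    if key not in seen:
        seen.add(key)
        merged.append(record)

print(len(merged))  # 2: the duplicate article chunk was dropped
```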
## License

Apache License 2.0. See LICENSE.
## Troubleshooting

- If you see `No chunks loaded`, ensure `data/normalized/*.jsonl` exists and has content.
- If the Hugging Face request fails, verify `HF_TOKEN` is set and the model name or endpoint is correct.
- If article ingestion skips sources, check `data/normalized/articles_ingest_report.json`.