Space: Param20h/PDF-Assit_RAG


Yuvraj Sarathe committed 10 days ago

Commit 675aa29 (1 parent: fa8f11a)

Readme and .env.example updated with proper comments

Files changed (2)

.env.example +115 -10
README.md +24 -16

.env.example

@@ -1,30 +1,135 @@
-# ── App Config ───────────────────────────────────────
 SECRET_KEY=change-me-in-production
-DATABASE_URL=sqlite:///./data/app.db
 # ── Environment & CORS ──────────────────────────────
 ENVIRONMENT=development
-# In production, set ENVIRONMENT=production and list your allowed origins:
-# ALLOWED_ORIGINS=https://yourapp.com,https://www.yourapp.com
 ALLOWED_ORIGINS=http://localhost:3000,http://localhost:7860
-# ── HuggingFace (Required for LLM) ──────────────────
 HF_TOKEN=your_huggingface_token_here
-# ── LLM Model (Optional — defaults shown) ───────────
 # LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.3
 # LLM_TEMPERATURE=0.3
 # LLM_MAX_NEW_TOKENS=1024
-# ── Embeddings (Optional — defaults shown) ───────────
 # EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
 # ── RAG Config (Optional — defaults shown) ───────────
 # CHUNK_SIZE=1000
 # CHUNK_OVERLAP=200
 # TOP_K_RETRIEVAL=10
 # TOP_K_RERANK=5
-# ── Upload (Optional) ───────────────────────────────
-# UPLOAD_DIR=./data/uploads
-# MAX_FILE_SIZE_MB=50

+#  Document AI Analyst — Environment Configuration
+#  Copy this file to backend/.env and fill in your values:
+#    cp .env.example backend/.env
+# ── Application Config ──────────────────────────────────────────────
+# Secret key for signing JWT tokens and Flask sessions.
+# Generate one: python -c "import secrets; print(secrets.token_urlsafe(32))"
+# Required
 SECRET_KEY=change-me-in-production
 # ── Environment & CORS ──────────────────────────────
+# Runtime environment. Set to "production" in production.
+# In production, ALLOWED_ORIGINS must be set explicitly (CORS will reject all others).
+# Optional — defaults to "development"
 ENVIRONMENT=development
+# Debug mode. Enables detailed error pages and auto-reload.
+# Do NOT enable in production.
+# Optional — defaults to False
+# DEBUG=False
+# Comma-separated list of allowed CORS origins.
+# Only used when ENVIRONMENT=production. When empty or during development, all origins are allowed.
+# Optional — defaults to "http://localhost:3000,http://localhost:7860"
 ALLOWED_ORIGINS=http://localhost:3000,http://localhost:7860
+# ── Database ─────────────────────────────────────────────────
+# SQLAlchemy database connection string.
+# Default: SQLite stored at ./data/app.db
+# For Postgres: postgresql+asyncpg://user:pass@host:5432/dbname
+# Optional — defaults to sqlite:///./data/app.db
+# DATABASE_URL=sqlite:///./data/app.db
+# ── Authentication ──────────────────────────────────────────
+# JWT signing algorithm. Leave as default unless you know what you're doing.
+# Optional — defaults to "HS256"
+# JWT_ALGORITHM=HS256
+# JWT token expiry in hours. After this period, users must re-login.
+# Optional — defaults to 72
+# JWT_EXPIRY_HOURS=72
+# ── File Upload ─────────────────────────────────────────────
+# Directory where uploaded documents (PDFs, DOCXs, etc.) are stored.
+# Optional — defaults to "./data/uploads"
+# UPLOAD_DIR=./data/uploads
+# Maximum upload file size in megabytes.
+# Optional — defaults to 50
+# MAX_FILE_SIZE_MB=50
+# Comma-separated list of allowed file extensions for upload.
+# Optional — defaults to "pdf,docx,txt,md"
+# ALLOWED_EXTENSIONS=pdf,docx,txt,md
+# ── HuggingFace (Required for LLM inference) ────────────────
+# HuggingFace API token. Used to call the Inference API for LLM responses.
+# Get yours: https://huggingface.co/settings/tokens (free tier available)
+# Required (app won't generate answers without it)
 HF_TOKEN=your_huggingface_token_here
+# ── LLM Configuration ───────────────────────────────────────
+# HuggingFace model ID used for answer generation.
+# Check available models: https://huggingface.co/models?inference=warm&sort=trending
+# Optional — defaults to "mistralai/Mistral-7B-Instruct-v0.3"
 # LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.3
+# Sampling temperature (0.0 = deterministic, 1.0 = very creative).
+# Optional — defaults to 0.3
 # LLM_TEMPERATURE=0.3
+# Maximum number of tokens the LLM can generate per response.
+# Optional — defaults to 1024
 # LLM_MAX_NEW_TOKENS=1024
+# ── Embeddings (Optional — defaults shown)──────────────────────────────────────────────
+# SentenceTransformer model ID for generating document embeddings.
+# Model is downloaded once and cached locally. No external API call.
+# Optional — defaults to "sentence-transformers/all-MiniLM-L6-v2"
 # EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
+# Dimension of the embedding vectors (must match the model output).
+# Optional — defaults to 384
+# EMBEDDING_DIMENSION=384
 # ── RAG Config (Optional — defaults shown) ───────────
+# ── ChromaDB (Vector Store) ─────────────────────────────────
+# Directory where ChromaDB persists its vector index to disk.
+# Optional — defaults to "./data/chroma_db"
+# CHROMA_PERSIST_DIR=./data/chroma_db
+# ── Document Chunking ───────────────────────────────────────
+# Number of characters per document chunk.
+# Larger chunks give more context; smaller chunks improve retrieval precision.
+# Optional — defaults to 1000
 # CHUNK_SIZE=1000
+# Character overlap between consecutive chunks. Helps maintain context at boundaries.
+# Optional — defaults to 200
 # CHUNK_OVERLAP=200
+# ── Retrieval ───────────────────────────────────────────────
+# Number of candidate chunks retrieved from the vector store during semantic search.
+# Optional — defaults to 10
 # TOP_K_RETRIEVAL=10
+# Number of top chunks passed to the LLM after cross-encoder reranking.
+# Must be ≤ TOP_K_RETRIEVAL.
+# Optional — defaults to 5
 # TOP_K_RERANK=5
+# Cross-encoder model used for reranking retrieved chunks by relevance.
+# Optional — defaults to "cross-encoder/ms-marco-MiniLM-L-6-v2"
+# RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2
+# ── (Legacy) Flask-Only Variables ───────────────────────────
+# These are only used if you run the old Flask app (app.py) instead of FastAPI.
+# They are ignored by the new FastAPI backend.
+# MONGO_URI=mongodb://localhost:27017/pdf_assistant
+# GOOGLE_CLIENT_ID=your_google_client_id
+# GOOGLE_CLIENT_SECRET=your_google_client_secret
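
The defaults and constraints documented in the new .env.example could be read and checked roughly as follows. This is a minimal sketch using only the standard library, not the project's actual settings code (which is not part of this commit and may work differently). `RagSettings`, `load_rag_settings`, and `resolve_cors_origins` are hypothetical names, and the overlap check is an added assumption rather than something the file states.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RagSettings:
    chunk_size: int
    chunk_overlap: int
    top_k_retrieval: int
    top_k_rerank: int


def load_rag_settings(env=None):
    """Read RAG settings from a mapping, using the documented defaults."""
    env = os.environ if env is None else env
    settings = RagSettings(
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        top_k_retrieval=int(env.get("TOP_K_RETRIEVAL", "10")),
        top_k_rerank=int(env.get("TOP_K_RERANK", "5")),
    )
    # Documented constraint: TOP_K_RERANK must be <= TOP_K_RETRIEVAL.
    if settings.top_k_rerank > settings.top_k_retrieval:
        raise ValueError("TOP_K_RERANK must be <= TOP_K_RETRIEVAL")
    # Assumption (not stated in .env.example): overlap is smaller than the chunk.
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be in [0, CHUNK_SIZE)")
    return settings


def resolve_cors_origins(env=None):
    """Return the CORS origin list implied by ENVIRONMENT and ALLOWED_ORIGINS."""
    env = os.environ if env is None else env
    environment = env.get("ENVIRONMENT", "development")
    raw = env.get(
        "ALLOWED_ORIGINS", "http://localhost:3000,http://localhost:7860"
    )
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    # Per the comments: the list is only enforced in production; an empty
    # list or a development environment allows all origins.
    if environment != "production" or not origins:
        return ["*"]
    return origins
```

Passing a plain dict instead of `os.environ` keeps the helpers easy to test, for example `load_rag_settings({"TOP_K_RERANK": "20"})` raises `ValueError` because 20 exceeds the default `TOP_K_RETRIEVAL` of 10.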

README.md

@@ -378,22 +378,30 @@ docker compose up --build
 ## 📦 Environment Variables
-| Variable | Required | Default | Description |
-|---|---|---|---|
-| `HF_TOKEN` | ✅ | — | HuggingFace API token for LLM inference |
-| `SECRET_KEY` | ✅ | — | JWT signing secret (use a strong random string) |
-| `DATABASE_URL` | ❌ | `sqlite:///./data/app.db` | SQLAlchemy database URL |
-| `UPLOAD_DIR` | ❌ | `./data/uploads` | Directory for uploaded files |
-| `CHROMA_PERSIST_DIR` | ❌ | `./data/chroma_db` | ChromaDB persistence path |
-| `LLM_MODEL` | ❌ | `Qwen/Qwen2.5-72B-Instruct` | HuggingFace model ID |
-| `LLM_TEMPERATURE` | ❌ | `0.3` | LLM sampling temperature |
-| `LLM_MAX_NEW_TOKENS` | ❌ | `1024` | Max tokens per response |
-| `EMBEDDING_MODEL` | ❌ | `all-MiniLM-L6-v2` | SentenceTransformer model |
-| `CHUNK_SIZE` | ❌ | `1000` | Document chunk size (characters) |
-| `CHUNK_OVERLAP` | ❌ | `200` | Overlap between chunks |
-| `TOP_K_RETRIEVAL` | ❌ | `10` | Candidates retrieved from vector store |
-| `TOP_K_RERANK` | ❌ | `5` | Final chunks passed to LLM after reranking |
-| `MAX_FILE_SIZE_MB` | ❌ | `50` | Maximum upload file size |
 <br/>

 ## 📦 Environment Variables
+| Variable | Required | Default | Description | Where to Get It |
+|---|---|---|---|---|
+| `SECRET_KEY` | ✅ | — | JWT signing & session secret. Use a strong random string. | Generate: `python -c "import secrets; print(secrets.token_urlsafe(32))"` |
+| `HF_TOKEN` | ✅ | — | HuggingFace API token for LLM inference via Inference API. | [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) (free) |
+| `ENVIRONMENT` | ❌ | `development` | Runtime mode. Set to `production` for deployment to lock CORS. | — |
+| `DEBUG` | ❌ | `False` | Enable debug mode with detailed error pages. Never enable in production. | — |
+| `ALLOWED_ORIGINS` | ❌ | `http://localhost:3000,http://localhost:7860` | Comma-separated CORS origins (only enforced in production). | Your deployed domain(s) |
+| `DATABASE_URL` | ❌ | `sqlite:///./data/app.db` | SQLAlchemy database connection string. | SQLite (default), or your Postgres/MySQL connection string |
+| `JWT_ALGORITHM` | ❌ | `HS256` | JWT signing algorithm. | — |
+| `JWT_EXPIRY_HOURS` | ❌ | `72` | JWT token lifetime in hours before re-login is required. | — |
+| `UPLOAD_DIR` | ❌ | `./data/uploads` | Local directory for storing uploaded documents. | — |
+| `MAX_FILE_SIZE_MB` | ❌ | `50` | Maximum allowed upload file size in MB. | — |
+| `ALLOWED_EXTENSIONS` | ❌ | `pdf,docx,txt,md` | Comma-separated list of permitted file extensions. | — |
+| `CHROMA_PERSIST_DIR` | ❌ | `./data/chroma_db` | Directory where ChromaDB persists its vector index. | — |
+| `LLM_MODEL` | ❌ | `Qwen/Qwen2.5-72B-Instruct` | HuggingFace model ID for answer generation. | [huggingface.co/models](https://huggingface.co/models?inference=warm&sort=trending) |
+| `LLM_TEMPERATURE` | ❌ | `0.3` | LLM sampling temperature (0 = deterministic, 1 = creative). | — |
+| `LLM_MAX_NEW_TOKENS` | ❌ | `1024` | Maximum tokens per LLM response. | — |
+| `EMBEDDING_MODEL` | ❌ | `sentence-transformers/all-MiniLM-L6-v2` | SentenceTransformer model for local embeddings (no external API). | [huggingface.co/sentence-transformers](https://huggingface.co/sentence-transformers) |
+| `EMBEDDING_DIMENSION` | ❌ | `384` | Embedding vector dimension (must match the model). | — |
+| `RERANKER_MODEL` | ❌ | `cross-encoder/ms-marco-MiniLM-L-6-v2` | Cross-encoder model for reranking retrieved chunks by relevance. | [huggingface.co/cross-encoder](https://huggingface.co/cross-encoder) |
+| `CHUNK_SIZE` | ❌ | `1000` | Characters per document chunk. Larger = more context, smaller = better precision. | — |
+| `CHUNK_OVERLAP` | ❌ | `200` | Overlap between consecutive chunks to maintain boundary context. | — |
+| `TOP_K_RETRIEVAL` | ❌ | `10` | Candidate chunks retrieved from vector store during semantic search. | — |
+| `TOP_K_RERANK` | ❌ | `5` | Final chunks passed to the LLM after reranking (must be ≤ `TOP_K_RETRIEVAL`). | — |
 <br/>
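
To make the `CHUNK_SIZE`, `CHUNK_OVERLAP`, and `TOP_K_RERANK` rows concrete, here is a minimal sketch of character-based chunking with overlap, followed by a top-k cut after reranking. It assumes a plain sliding window over characters; the project's real splitter and reranker are not shown in this commit and may behave differently. `chunk_text`, `select_for_llm`, and the `rerank_score` callable are hypothetical names used only for illustration.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def select_for_llm(candidates, rerank_score, top_k_rerank=5):
    """Keep the top_k_rerank candidates with the highest rerank_score."""
    ranked = sorted(candidates, key=rerank_score, reverse=True)
    return ranked[:top_k_rerank]
```

With the defaults, each new chunk starts 800 characters after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.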