# VGEC RAG Chatbot — Codebase Documentation

> **Generated:** 2026-03-25
> **Version:** 1.0.0
> **Scope:** Full system — ingestion, retrieval, classification, API, evaluation

---

## Table of Contents

1. [Project Overview](#1-project-overview)
2. [System Architecture](#2-system-architecture)
3. [Schema & Data Model](#3-schema--data-model)
4. [Retrieval Pipeline](#4-retrieval-pipeline)
5. [Key Classes & Modules](#5-key-classes--modules)
6. [Evaluation & Metrics](#6-evaluation--metrics)
7. [Known Limitations](#7-known-limitations)
8. [File Structure](#8-file-structure)

---

## 1. Project Overview

### Purpose

**VGEC RAG Chatbot** is a Retrieval-Augmented Generation (RAG) chatbot for **Vishwakarma Government Engineering College (VGEC), Chandkheda, Gujarat**. It allows students, faculty, and visitors to query structured information about the institution — departments, faculty, syllabus, labs, intake capacity, and more — through natural language.

### Domain

- **Institution:** VGEC (Government Engineering College, Gujarat)
- **Data Coverage:** Department-level information for multiple disciplines (Computer Engineering, Civil, Electrical, IT, ECE, etc.)
- **Topics:** Faculty lists, lab facilities, syllabus details, HOD info, research activities, intake capacity, achievements

### Tech Stack

| Layer | Technology |
|---|---|
| **API Framework** | FastAPI |
| **Vector Database** | ChromaDB (persistent, local) |
| **Embeddings** | Google `gemini-embedding-001` (via `langchain-google-genai`) |
| **LLM (Cloud)** | Google Gemini `gemini-2.5-flash-lite` |
| **LLM (Local)** | `EXAONE-3.5-2.4B-Instruct-Q4_K_M.gguf` via `llama-cpp-python` |
| **NLP / Preprocessing** | spaCy (`en_core_web_sm`), NLTK (PorterStemmer) |
| **Classifier** | Scikit-learn `LogisticRegression` + `SentenceTransformer` (`MongoDB/mdbr-leaf-mt`) |
| **BM25** | `langchain-community` `BM25Retriever` |
| **Chunking** | LangChain `RecursiveCharacterTextSplitter` |
| **Config** | Pydantic `BaseSettings` (`.env`-backed) |

### Key Features Implemented

- ✅ Structured JSON ingestion with intent-aware chunking
- ✅ Hybrid retrieval: BM25 + vector search fused via Reciprocal Rank Fusion (RRF)
- ✅ Intent/metadata classification with confidence-gated ChromaDB filters
- ✅ Abbreviation expansion (`CE` → `Computer Engineering`, etc.)
- ✅ Multi-turn conversation history support
- ✅ Dual LLM backend with automatic fallback (Gemini ↔ Local)
- ✅ Full CRUD REST API for vector store management
- ✅ Offline evaluation endpoint (MRR, hit rate, noise rate)
- ✅ Classifier accuracy evaluation endpoint

---
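For orientation, here is a minimal client-side sketch of calling the primary endpoint (`POST /api/v1/rag/hybrid_query`, documented in section 5). The response keys match the query-path output shown in section 2; the `question` payload key is an assumption based on the request schema, not verified against `requests.py`.

```python
import requests

BASE_URL = "http://localhost:8000/api/v1/rag"

# NOTE: the "question" payload key is assumed from RAGRequest, not confirmed.
resp = requests.post(
    f"{BASE_URL}/hybrid_query",
    json={"question": "Who is the HOD of the Computer Engineering department?"},
    timeout=60,
)
resp.raise_for_status()
body = resp.json()
print(body["answer"])      # LLM-generated answer
print(body["references"])  # supporting chunks above the score threshold
```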
## 2. System Architecture

### Component Diagram

```
┌──────────────────────────┐
│       FastAPI App        │
│   /api/v1/rag  /vector   │
└────────────┬─────────────┘
             │ DI (lru_cache)
┌────────────▼─────────────┐
│        RAGService        │
│   (core orchestrator)    │
└───┬──────────────────┬───┘
    │                  │
┌───▼──────────────┐ ┌─▼─────────────────────┐
│ IngestionService │ │ HybridRetrievalService│
│   (write path)   │ │      (read path)      │
└───┬──────────────┘ └─┬───────────┬─────────┘
    │                  │           │
┌───▼──────────┐ ┌─────▼───────┐ ┌─▼─────────────┐
│ FileService  │ │ClassifierSvc│ │  VectorStore  │
│ (file + meta)│ │(clf predict)│ │  (ChromaDB)   │
└──────────────┘ └─────────────┘ └───────────────┘
```

### Data Flow

#### Ingestion Path

```
File Upload (PDF/MD/TXT/JSON)
  │
  ▼
FileService.read_file()           ← type-aware loading (PyMuPDF for PDF)
  │  returns: Document + metadata
  ▼
FileService.write_file()          ← persist copy to data/documents/
  │
  ▼
IngestionService.handle_*_docs()  ← route by file extension
  │
  ├─ JSON → handle_json_docs()    ← intent-aware chunks (list / detail / count)
  └─ text → handle_text_docs()    ← RecursiveCharacterTextSplitter + normalize()
  │
  ▼
VectorStore.add_documents()       ← embed + upsert into ChromaDB
  │
  ▼
FileService.patch_metadata()      ← update ingestion record JSON (chunk count, timing, size)
```

#### Query Path

```
User Question
  │
  ▼
preprocess_query()                ← normalize() only: tokenize + strip blanks (see section 7)
  │
  ▼
HybridRetrievalService.retrieve()
  │
  ├─ clf.expand_abbreviations()   ← CE → Computer Engineering
  ├─ clf.predict_with_filter()    ← LogReg predict → Chroma $and/$or filter
  ├─ _vector_rank()               ← ChromaDB similarity_search_with_score (k=15)
  ├─ _bm25_rank()                 ← BM25 over the vector candidate pool
  ├─ _reciprocal_rank_fusion()    ← weighted RRF merge
  ├─ metadata score boosting      ← multiply fused scores for confident matches
  └─ _apply_title_boost()         ← per-query-word title match bonus
  │
  ▼
get_references_v2()               ← filter by threshold, build context string
  │
  ▼
LLM.invoke(prompt)                ← Gemini or local LlamaCpp
  │
  ▼
Return: { answer, references, context, threshold_used, k_used }
```

### External Dependencies

| Dependency | Role | Provider |
|---|---|---|
| ChromaDB | Persistent vector store | Local disk |
| Google Gemini API | Embeddings + LLM generation | Google Cloud |
| LlamaCpp (GGUF model) | Local LLM fallback | Local CPU |
| Sentence Transformers | Classifier feature extraction | HuggingFace Hub |
| spaCy `en_core_web_sm` | POS tagging / lemmatization | Local |

---
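The dual-backend dispatch referenced in the query path above can be sketched as follows. Function names match the `model_factory` API documented in section 5; constructor arguments are illustrative, drawn from the settings in section 5 and the limitations in section 7, and may not match the actual module.

```python
# Illustrative sketch of the Gemini <-> local fallback dispatch;
# the real logic lives in app/utils/model_factory.py and may differ.
from langchain_community.chat_models import ChatLlamaCpp
from langchain_google_genai import ChatGoogleGenerativeAI


def get_gemini_model():
    return ChatGoogleGenerativeAI(model="gemini-2.5-flash-lite")


def get_local_model():
    return ChatLlamaCpp(
        model_path="ml_models/llm/EXAONE-3.5-2.4B-Instruct-Q4_K_M.gguf",
        n_ctx=8096,    # context window quoted in section 7
        n_threads=4,
    )


def get_llm_model(provider: str = "local", enable_fallback: bool = True):
    """Dispatch to the configured backend, falling back to the other on error."""
    primary, secondary = (
        (get_gemini_model, get_local_model)
        if provider == "gemini"
        else (get_local_model, get_gemini_model)
    )
    try:
        return primary()
    except Exception:
        if not enable_fallback:
            raise
        return secondary()
```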
## 3. Schema & Data Model

### Source JSON Format

Source data files (e.g. `computer_eng.json`) follow this schema, where `<topic>` stands for a topic key such as `faculty`:

```json
{
  "id": "computer-engineering-department",
  "name": "Computer Engineering Department",
  "source": "https://www.vgecg.ac.in/department.php?dept=3",
  "category": "computer_eng",
  "type": "department",
  "created_date": "2026-02-19",
  "content": {
    "<topic>": {
      "list": ["item 1", "item 2", "..."],
      "details": "Paragraph describing the topic."
    }
  }
}
```

**Top-level fields:**

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique document identifier |
| `name` | string | Human-readable institution/department name |
| `source` | string | Authoritative URL |
| `category` | string | Department slug (e.g. `computer_eng`) |
| `type` | string | Document type (e.g. `department`) |
| `created_date` | string (ISO) | Data creation date |
| `content` | object | Topic map; each key = a topic |

### Chunk Metadata Schema (stored in ChromaDB)

Every vector chunk stored in Chroma carries the following metadata:

| Field | Type | Source |
|---|---|---|
| `id` | string (UUID) | Auto-generated |
| `title` | string | Document name / topic key |
| `source` | string | Source URL |
| `source_file` | string | Filename (e.g. `computer_eng.json`) |
| `type` | string | Taxonomy level 1 (e.g. `department`) |
| `category` | string | Taxonomy level 2 (e.g. `computer_eng`) |
| `topic` | string | Taxonomy level 3 (e.g. `faculty`) |
| `intent` | string | Chunk intent: `list`, `detail`, or `count` |
| `chunk_index` | int | Sequential index within file |
| `created_date` | string (ISO) | Ingestion timestamp |
| `updated_at` | string (ISO) | Last modification timestamp |
| `ext` | string | Source file extension (`json`, `pdf`, `md`, `txt`) |

### Hierarchical Taxonomy

The classifier predicts, and ChromaDB filters operate on, a 3-level hierarchy:

```
type
└── category
    └── topic
        └── intent (list | detail | count)
```

**Example mapping (Computer Engineering):**

```
type: "department"
└── category: "computer_eng"
    ├── topic: "faculty"      → intent: list | detail
    ├── topic: "lab"          → intent: list | detail
    ├── topic: "syllabus"     → intent: list | detail
    ├── topic: "hod"          → intent: list | detail
    ├── topic: "intake"       → intent: list | detail
    ├── topic: "research"     → intent: list | detail
    └── topic: "achievements"
```

### Document Chunking Strategy

**JSON documents** use a hand-crafted, intent-aware strategy in `IngestionService.handle_json_docs()`:

| Intent | Chunk Content | Metadata |
|---|---|---|
| `list` | Numbered list: `1. item\n2. item\n...` | `intent=list` |
| `count` | `"Total : N"` (auto-generated) | `intent=count` |
| `detail` | Raw paragraph text | `intent=detail` |

**Text/PDF/Markdown documents** use `RecursiveCharacterTextSplitter`:

- Default: `chunk_size=500`, `chunk_overlap=100`
- Separator priority: `\n\n` → `\n` → ` ` → (character)
- Markdown variant respects `---` section delimiters
- Content is passed through `normalize()` (tokenize + strip blanks) before storage

---
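A condensed sketch of the intent-aware JSON strategy described above — the real implementation is `IngestionService.handle_json_docs()`; the helper name and exact metadata handling here are illustrative:

```python
# Emit list / count / detail chunks for one topic of a source JSON file.
# Hypothetical helper mirroring the documented chunking table above.
from langchain_core.documents import Document


def chunk_topic(topic: str, payload: dict, base_meta: dict) -> list[Document]:
    chunks = []
    items = payload.get("list") or []
    if items:
        numbered = "\n".join(f"{i}. {item}" for i, item in enumerate(items, 1))
        chunks.append(Document(numbered, metadata={**base_meta, "topic": topic, "intent": "list"}))
        # Synthetic count chunk, auto-generated at ingestion time (see section 7).
        chunks.append(Document(f"Total : {len(items)}", metadata={**base_meta, "topic": topic, "intent": "count"}))
    details = payload.get("details")
    if details:
        chunks.append(Document(details, metadata={**base_meta, "topic": topic, "intent": "detail"}))
    return chunks
```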
## 4. Retrieval Pipeline

### Query Processing Flow

```python
# Step 1: Normalize input
query = preprocess_query(question)
# → currently just normalize(): tokenize + strip blanks
#   (the full spaCy POS filter / lemmatizer applies to stored documents only — see section 7)

# Step 2: Expand abbreviations
query = clf.expand_abbreviations(query)
# → "CE dept" → "computer engineering department"

# Step 3: Classify intent/metadata
filters = clf.predict_with_filter([query])
# → {"$and": [{"type": "department"}, {"intent": "list"}, {"$or": [...]}]}

# Step 4: Vector search with optional filter
raw_results = chroma.similarity_search_with_score(query, k=15, filter=filters)
# Fallback: if filtered results are empty, retry without the filter

# Step 5: BM25 re-rank over vector candidates
bm25_results = BM25Retriever.from_documents(candidate_docs)

# Step 6: RRF fusion
# fused_score(d) = bm25_weight   * 1 / (rrf_k + rank_bm25(d))
#                + vector_weight * 1 / (rrf_k + rank_vec(d))

# Step 7: Metadata confidence boosting
if doc.metadata[field] == predicted_val and conf > 0.90:
    result.fused_score *= boost_factor  # 1.10–1.20

# Step 8: Title word boost
for word in query_words:
    if word in doc.title:
        result.fused_score += title_boost_per_word  # 0.004

# Step 9: Threshold filter + sort + top-k
results = [r for r in results if r.fused_score >= threshold]
```

### Classifier Thresholds

The `Classifier` uses two separate threshold tables.

**Prediction threshold** — below this, the field is set to `None` (not used at all):

| Field | Threshold |
|---|---|
| `type` | 0.40 |
| `category` | 0.40 |
| `topic` | 0.50 |
| `intent` | 0.60 |

**Filter threshold** — above this, the field becomes a hard ChromaDB `$and` filter:

| Field | Threshold |
|---|---|
| `type` | 0.65 |
| `category` | 0.65 |
| `topic` | 0.70 |

### Filter Construction Logic (`_build_filter`)

```python
# Gate: if type confidence < 0.65 → return None (full scan)

# Hard anchors (always included if type passes):
#   - type   == predicted_type
#   - intent == predicted_intent  (special: "count" expands to count OR detail)

# Soft hints (combined as $or):
#   - category == predicted_category  (if conf >= 0.65, else "general")
#   - topic    == predicted_topic     (if conf >= 0.70, else "general")
```

### Hybrid Retrieval Config (Defaults)

| Parameter | `hybrid_query` | `search_docs` |
|---|---|---|
| `candidate_k` | 15 | 15 |
| `top_k` (final) | `settings.similarity_top_k` (8) | `k` (param) |
| `bm25_weight` | 0.45 | 0.70 |
| `vector_weight` | 0.55 | 0.30 |
| `rrf_k` | 20 | 20 |
| `bm25_k1` | 1.2 | 1.5 |
| `bm25_b` | 0.9 | 0.75 |
| `title_boost_per_word` | 0.004 | 0.004 |
| `score_threshold` | 0.4 | 0.4 |

> **Note:** `search_docs` is BM25-heavy (0.70) since it is used for keyword-oriented document browsing, while `hybrid_query` is vector-heavy for semantic QA.

---
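A self-contained sketch of the weighted Reciprocal Rank Fusion from step 6, with the additive title boost of step 8 folded in. Defaults mirror the `hybrid_query` column above; the data structures are simplified stand-ins, not the service's own classes.

```python
# Weighted RRF over two 1-based ranked lists of document ids.
from dataclasses import dataclass


@dataclass
class Scored:
    doc_id: str
    title: str
    fused_score: float = 0.0


def rrf_fuse(bm25_ids, vector_ids, bm25_weight=0.45, vector_weight=0.55, rrf_k=20):
    """Merge two ranked id lists into weighted RRF scores."""
    scores: dict[str, float] = {}
    for rank, doc_id in enumerate(bm25_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + bm25_weight / (rrf_k + rank)
    for rank, doc_id in enumerate(vector_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + vector_weight / (rrf_k + rank)
    return scores


def apply_title_boost(results: list[Scored], query_words, per_word=0.004):
    """Additive bonus per query word found in the chunk title (step 8)."""
    for r in results:
        r.fused_score += per_word * sum(w in r.title.lower() for w in query_words)
```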
## 5. Key Classes & Modules

### Services (`app/services/`)

#### `RAGService`

Main orchestrator. Singleton via `lru_cache` in `dependencies.py`.

| Method | Description |
|---|---|
| `query()` | Semantic-only QA (vector search → LLM) |
| `hybrid_query()` | Hybrid QA (BM25 + vector → RRF → LLM) |
| `search_docs()` | BM25-heavy document search, no LLM |
| `ingest_documents()` | Ingest a file path into the vector store |
| `get_filenames()` | Return all tracked file metadata records |
| `test_queries()` | Batch retrieval evaluation (MRR, precision, noise) |
| `test_classifier()` | Batch classifier accuracy evaluation |
| `delete_database()` | Drop the entire ChromaDB collection |

#### `HybridRetrievalService`

Stateless per-request service created inline by `RAGService`.

| Method | Description |
|---|---|
| `retrieve(query)` | Full hybrid retrieval pipeline; returns `List[RetrievalResult]` |
| `_vector_rank()` | Chroma similarity search + classifier filter |
| `_bm25_rank()` | BM25 over candidate pool |
| `_reciprocal_rank_fusion()` | Merge both ranked lists via RRF |
| `_apply_title_boost()` | Word-level title match score bonus |

**`RetrievalResult` dataclass:**

```python
@dataclass
class RetrievalResult:
    document: Document
    fused_score: float
    bm25_rank: Optional[int]
    vector_rank: Optional[int]
    title_boost: float
```

#### `Classifier`

Loaded at startup from a pickled pipeline (`chatbot_classifier.pkl`).

| Method | Description |
|---|---|
| `predict(queries)` | Returns list of `{type, category, topic, intent, *_conf}` dicts |
| `predict_with_filter(queries)` | Returns a ChromaDB-compatible filter dict or `None` |
| `expand_abbreviations(text)` | Regex-based abbreviation expansion |
| `get_features(queries)` | Build concatenated SentenceTransformer-embedding + TF-IDF feature matrix |
| `train_models(df)` | Train 4 `LogisticRegression` classifiers (offline use) |
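A minimal sketch of the regex-based expansion behind `Classifier.expand_abbreviations()`. The mapping lives in `app/utils/constants.py` as `short_words_mappings`; the entries shown here are assumed for illustration (only `CE` → `Computer Engineering` is confirmed by this document).

```python
import re

# Assumed mapping entries; the real table is short_words_mappings in constants.py.
SHORT_WORDS = {
    "ce": "computer engineering",
    "dept": "department",
    "hod": "head of department",
}

PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, SHORT_WORDS)) + r")\b", re.IGNORECASE)


def expand_abbreviations(text: str) -> str:
    """Replace whole-word abbreviations, case-insensitively."""
    return PATTERN.sub(lambda m: SHORT_WORDS[m.group(0).lower()], text)


print(expand_abbreviations("CE dept"))  # → "computer engineering department"
```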
#### `IngestionService`

| Method | Description |
|---|---|
| `ingest(file_path)` | Load + chunk a file; returns `List[Document]` |
| `handle_json_docs()` | Intent-aware chunking for structured JSON data |
| `handle_text_docs()` | Recursive character splitting for unstructured text |
| `get_records()` | Delegate to `FileService.get_records()` |
| `delete_record(filename)` | Remove a file's metadata record |
| `path_record(path, metadata)` | Patch ingestion stats after indexing |

#### `FileService`

| Method | Description |
|---|---|
| `read_file(path)` | Load file content; dispatches by extension |
| `write_file(path, content, metadata)` | Persist file to `data/documents/` |
| `patch_metadata(path, metadata)` | Merge new fields into existing record |
| `get_records()` | Return all ingestion records dict |
| `delete_record(filename)` | Remove a record from `.json` |

#### `VectorStore`

Thin wrapper around `langchain_chroma.Chroma`.

| Method | Description |
|---|---|
| `get()` | Retrieve all documents |
| `get_by_id(ids)` | Retrieve specific documents by ID |
| `add_documents(docs)` | Embed + insert, skipping empty chunks |
| `update_document(id, doc)` | Delete then re-insert with same ID |
| `delete(ids)` | Remove documents by ID list |
| `similarity_search_with_score()` | Wrapped Chroma search |

### Utilities (`app/utils/`)

#### `preprocessing.py`

| Function | Description |
|---|---|
| `preprocess(text)` | spaCy POS filter + lemmatize + stopword removal → joined string |
| `normalize(text)` | Tokenize + strip blanks (lightweight, no POS) |
| `preprocess_query(query)` | Applies `normalize()` to user queries |
| `preprocess_documents(docs)` | Applies `preprocess()` to a document list in place |
| `preprocess_filename(path)` | Sanitize filename (remove special chars, lowercase) |

#### `document_helpers.py`

| Function | Description |
|---|---|
| `get_references_v2(docs, threshold)` | Convert `RetrievalResult` list → references dict + context string |
| `get_references(docs, threshold)` | Same for raw `(Document, distance)` tuples (used by `query()`) |
| `build_metadata(path)` | Parse YAML frontmatter from `.md`/`.txt` files |
| `create_documents(chunks, ...)` | Attach standard metadata (UUID, timestamps, indices) to chunks |
| `create_documents_from_text(text)` | Full pipeline: frontmatter parse → split → metadata attach |
| `clean_metadata(metadata)` | Serialize datetimes, coerce non-allowed types to string |

#### `model_factory.py`

| Function | Description |
|---|---|
| `get_embedding_model()` | Returns `GoogleGenerativeAIEmbeddings` |
| `get_gemini_model()` | Returns `ChatGoogleGenerativeAI` |
| `get_local_model()` | Returns `ChatLlamaCpp` (GGUF, CPU inference) |
| `get_llm_model(provider)` | Dispatches to Gemini or Local with fallback logic |

### API Routes (`app/api/routes/`)

#### `rag.py` — prefix `/api/v1/rag`

| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Health check |
| POST | `/` | Semantic query |
| POST | `/hybrid_query` | Hybrid RAG query (primary endpoint) |
| POST | `/similarity_search` | Hybrid retrieval, no LLM response |
| POST | `/search` | BM25-heavy document search |
| POST | `/test` | Batch retrieval evaluation |
| POST | `/test_classifier` | Classifier accuracy evaluation |
| GET | `/test_classifier_dataset` | Run built-in test dataset, cache result |

#### `vector_store.py` — prefix `/api/v1/vector`

| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | List all documents (paginated, filterable) |
| GET | `/filenames` | List ingested file records |
| GET | `/{id}` | Get single document by ChromaDB ID |
| POST | `/` | Upload + ingest file |
| PUT | `/{id}` | Update document content/metadata |
| DELETE | `/ids` | Bulk delete by ID list |
| DELETE | `/{id}` | Delete single document |
| DELETE | `/` | Filter-based delete (filename/source/contains) |
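The `lru_cache` singleton wiring referenced throughout (see `dependencies.py` in the file structure) typically looks like the following in a FastAPI app — a sketch of the pattern, not the verbatim module:

```python
from functools import lru_cache

from fastapi import Depends, FastAPI


class RAGService:  # stand-in for app.services.rag_service.RAGService
    ...


@lru_cache
def get_rag_service() -> RAGService:
    # First call constructs the service (models, ChromaDB, classifier);
    # every later call returns the same cached instance.
    return RAGService()


app = FastAPI()


@app.get("/api/v1/rag/")
def health(rag: RAGService = Depends(get_rag_service)):
    return {"status": "ok"}
```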
"EXAONE-3.5-2.4B-Instruct-Q4_K_M.gguf" # Generation max_output_tokens: int = 2048 local_max_tokens: int = 512 # Auth google_api_key: str # required — must be in .env ``` --- ## 6. Evaluation & Metrics ### Retrieval Evaluation (`test_queries` / `POST /api/v1/rag/test`) Tests each (question, expected_document, expected_chunk_index) triple against `hybrid_query`: | Metric | Formula | Interpretation | |---|---|---| | **Hit Rate** | `hits / total` | % of questions where the exact chunk was retrieved | | **Top-1 Hit Rate** | `rank==1 hits / total` | % of questions where exact chunk was top result | | **MRR** | `mean(1/rank)` | Mean Reciprocal Rank; higher = correct result ranked earlier | | **Doc Precision** | `correct_source_chunks / all_chunks` | How many retrieved chunks came from the right document | | **Doc Recall** | `1 if any correct_source_chunk else 0` | Did we retrieve at least one chunk from the right document? | | **Doc Noise** | `wrong_source_chunks / all_chunks` | Proportion of off-topic chunks in the result set | | **Error Rate** | `1 - hit_rate` | Miss rate for exact chunk retrieval | **Test Input Schema:** ```python class TestRequestSchema(BaseModel): tests: List[Test] # question + document + chunk_index k: int = 5 threshold: float = 0.4 ``` ### Classifier Evaluation (`test_classifier` / `POST /api/v1/rag/test_classifier`) Evaluates predictions for all 4 classification fields (`type`, `category`, `topic`, `intent`): | Metric | Notes | |---|---| | **Accuracy** | `sklearn.accuracy_score` | | **Precision (macro)** | `zero_division=0` | | **Recall (macro)** | `zero_division=0` | | **F1 Macro** | Unweighted average across classes | | **F1 Weighted** | Class-frequency weighted | | **Classification Report** | Full per-class breakdown (`output_dict=True`) | A bundled test dataset is stored in `app/utils/tests.py` as `classifier_test_dataset` and can be executed via `GET /api/v1/rag/test_classifier_dataset`. Results are **memoized** on the `RAGService.evaluation` dict for the lifetime of the server process. --- ## 7. Known Limitations ### Technical Debt - **`preprocess_query` is incomplete.** The function signature has an LLM-powered query rewriting block that is commented out. Currently it just calls `normalize()` (tokenize only), which means no stopword removal or lemmatization is applied to user queries (only to stored documents). - **`search_docs` does not honour `filename` as a metadata filter in Chroma.** The filter is applied in Python post-retrieval, which is inefficient for large collections. - **Count intent is synthetic.** The `"Total : N"` chunk is an auto-generated chunk during ingestion, not from the source document. If source data changes, stale count chunks can remain indexed. - **`VectorStore.get_dict()` has a `print(type(rows))`** debug statement left in production code. - **`FileService.__init__` docstring** has an extra backtick: `"`\`` class docstring`. ### Planned but Unimplemented - **Query rewriting via local LLM** — skeleton is commented out in `preprocess_query()`. - **Semantic caching** — no query result memoization at the API layer. - **Re-ranker** — no cross-encoder re-ranking step; relies only on RRF + boosting. - **`topic` field is not included in the ChromaDB hard filter** — only `type` + `intent` are hard-anchored; `category` and `topic` are soft `$or` hints. ### Performance Bottlenecks - **Local LLM (LlamaCpp)** is CPU-only with `n_ctx=8096` and `n_threads=4`. Response latency is high (~10–30s) on low-RAM systems. 
## 7. Known Limitations

### Technical Debt

- **`preprocess_query` is incomplete.** The function body contains an LLM-powered query-rewriting block that is commented out. Currently it just calls `normalize()` (tokenize only), which means no stopword removal or lemmatization is applied to user queries (only to stored documents).
- **`search_docs` does not honour `filename` as a metadata filter in Chroma.** The filter is applied in Python post-retrieval, which is inefficient for large collections.
- **Count intent is synthetic.** The `"Total : N"` chunk is auto-generated during ingestion, not taken from the source document. If source data changes, stale count chunks can remain indexed.
- **`VectorStore.get_dict()` contains a leftover `print(type(rows))`** debug statement in production code.
- **`FileService.__init__` docstring contains a stray backtick.**

### Planned but Unimplemented

- **Query rewriting via local LLM** — skeleton is commented out in `preprocess_query()`.
- **Semantic caching** — no query result memoization at the API layer.
- **Re-ranker** — no cross-encoder re-ranking step; relies only on RRF + boosting.
- **`topic` field is not included in the ChromaDB hard filter** — only `type` + `intent` are hard-anchored; `category` and `topic` are soft `$or` hints.

### Performance Bottlenecks

- **Local LLM (LlamaCpp)** is CPU-only with `n_ctx=8096` and `n_threads=4`. Response latency is high (~10–30 s) on low-RAM systems.
- **Classifier uses `SentenceTransformer` + TF-IDF features** — inference runs on every request with no caching of query embeddings.
- **BM25 corpus is rebuilt from scratch per request** — `BM25Retriever.from_documents()` is called inside `_bm25_rank()` each time (see the caching sketch at the end of this document).
- **`app/utils/tests.py` (holding `classifier_test_dataset`)** is very large (1.8 MB) and loaded at import time.
- **The memoized evaluation** in `rag_service.evaluation` is not thread-safe if the server runs with multiple workers.

---

## 8. File Structure

```
VGEC-RAG-Chatbot/
│
├── app/                              # Application package
│   ├── main.py                       # FastAPI app, router mounting, CORS middleware
│   ├── core/
│   │   ├── config.py                 # Pydantic Settings (all tuneable params)
│   │   └── paths.py                  # Path constants helper
│   │
│   ├── api/
│   │   ├── dependencies.py           # lru_cache singleton for RAGService
│   │   ├── routes/
│   │   │   ├── rag.py                # /rag endpoints (query, test, classifier)
│   │   │   ├── vector_store.py       # /vector endpoints (CRUD for ChromaDB)
│   │   │   └── settings.py           # /settings endpoints
│   │   └── schemas/
│   │       ├── requests.py           # RAGRequest, PaginationParams, etc.
│   │       └── tests.py              # TestRequestSchema, TestClassifierReqSchema
│   │
│   ├── services/
│   │   ├── rag_service.py            # RAGService (main orchestrator)
│   │   ├── hybrid_retrieval.py       # HybridRetrievalService + RRF logic
│   │   ├── classifier_service.py     # Classifier class + singleton clf
│   │   ├── ingestion_service.py      # IngestionService (chunking pipeline)
│   │   ├── file_service.py           # FileService (file I/O + metadata JSON)
│   │   ├── vector_store.py           # VectorStore (thin ChromaDB wrapper)
│   │   ├── text_splitter.py          # TextSplitter (RecursiveCharacter + variants)
│   │   └── document_loader.py        # (legacy loader, not in primary path)
│   │
│   ├── utils/
│   │   ├── preprocessing.py          # preprocess(), normalize(), preprocess_query()
│   │   ├── document_helpers.py       # get_references_v2(), build_metadata(), create_documents()
│   │   ├── model_factory.py          # get_llm_model(), get_embedding_model()
│   │   ├── constants.py              # stopwords list, short_words_mappings
│   │   ├── embeddings.py             # (thin embedding util)
│   │   ├── llm_models.py             # (thin LLM util)
│   │   └── tests.py                  # classifier_test_dataset (large, 1.8MB)
│   │
│   └── prompts/
│       └── __init__.py               # SYSTEM_PROMPT, wrap_exaone()
│
├── ml_models/
│   ├── classifier/
│   │   └── chatbot_classifier.pkl    # Pickled pipeline (models, tfidf, label encoders, etc.)
│   ├── embeddings/                   # (Local embedding model weights, if any)
│   └── llm/
│       └── EXAONE-3.5-2.4B-*.gguf    # Local LLM weights
│
├── data/
│   ├── department_data/              # Source JSON files per department
│   │   ├── computer_eng.json
│   │   ├── civil.json
│   │   └── ...
│   ├── documents/                    # Persistent copies of ingested files
│   ├── vector_stores/
│   │   └── classifier_test_1/        # ChromaDB persist directory
│   ├── classifier_test_1.json        # Ingestion metadata registry (FileService records)
│   └── other_data/                   # Misc data files
│
├── temp/                             # Staging area for uploaded files (auto-cleared)
├── scripts/                          # Offline scripts (training, testing)
├── tests/                            # Test files
│
├── requirements.txt                  # Pinned production dependencies
├── .env                              # Runtime secrets (google_api_key, etc.)
├── .env.example                      # Template for .env
└── CODEBASE_DOCUMENTATION.md         # This file
```
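The per-request BM25 rebuild noted in section 7 is the most mechanical bottleneck to mitigate. Below is a sketch of memoizing `BM25Retriever` construction on the candidate-ID set — a hypothetical helper, not present in the codebase; it assumes candidate chunks carry the `id` metadata field from section 3:

```python
# Hypothetical mitigation — NOT part of the codebase.
# Memoize BM25Retriever construction per candidate pool, since
# _bm25_rank() currently calls from_documents() on every request.
from langchain_community.retrievers import BM25Retriever

_bm25_cache: dict[tuple, BM25Retriever] = {}  # unbounded; bound it in practice


def bm25_for_candidates(candidate_docs):
    """Return a cached BM25 retriever for this exact candidate set."""
    key = tuple(sorted(d.metadata["id"] for d in candidate_docs))
    if key not in _bm25_cache:
        _bm25_cache[key] = BM25Retriever.from_documents(candidate_docs)
    return _bm25_cache[key]
```

Any write to the vector store (ingest, update, delete) would need to clear `_bm25_cache` to avoid serving stale rankings.

---

*End of documentation.*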