Upload README.md
README.md CHANGED

@@ -103,18 +103,18 @@ This Space demonstrates a Retrieval-Augmented Generation (RAG) application built
 
 **How it works:**
 
-1. **Data Source:**
-2. **
+1. **Data Source:** Pre-computed embeddings (`BAAI/bge-m3`), documents, and metadata loaded from the Hugging Face Dataset `Zwounds/Libguides_Embeddings` (originally sourced from `extracted_content.jsonl`).
+2. **Database Initialization:** On startup, the application downloads the dataset and loads the data into an in-memory ChromaDB collection stored in a temporary directory. This avoids slow re-embedding on every startup.
 3. **Query Processing:**
-   * User queries are optionally expanded using the generation model.
-   * Queries are embedded using the
-   * ChromaDB performs a similarity search
+   * User queries are optionally expanded using the generation model (`google/gemma-3-27b-it` via HF API).
+   * Queries are embedded using the local `BAAI/bge-m3` model (loaded into the Space).
+   * ChromaDB performs a similarity search using the query embedding against the pre-computed document embeddings.
 4. **Generation:** The relevant chunks and the original query are passed to the `google/gemma-3-27b-it` model via the Hugging Face Inference API to generate a final answer.
 
 **Configuration:**
 
-* **Embedding
-* **Generation Model:** `google/gemma-3-27b-it` (via HF Inference API)
+* **Embedding:** Pre-computed `BAAI/bge-m3` embeddings loaded from HF Dataset `Zwounds/Libguides_Embeddings`. Query embedding uses local `BAAI/bge-m3`.
+* **Generation Model:** `google/gemma-3-27b-it` (via HF Inference API).
 * **Requires Secret:** A Hugging Face User Access Token must be added as a Space Secret named `HF_TOKEN`.
 
-**Note:**
+**Note:** Startup involves downloading the dataset and loading it into the ChromaDB collection, which is much faster than re-embedding all documents.
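The core of the "Query Processing" step in the diff above — comparing a query embedding against pre-computed document embeddings — is a nearest-neighbour search over vectors. A minimal pure-Python sketch of the idea, using toy 3-dimensional vectors in place of real `BAAI/bge-m3` embeddings (document IDs here are invented for illustration; ChromaDB performs the same comparison at scale with an index):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for the pre-computed document embeddings
# (the real ones are high-dimensional BAAI/bge-m3 vectors).
doc_embeddings = {
    "doc_citing": [0.9, 0.1, 0.0],
    "doc_hours":  [0.1, 0.8, 0.2],
    "doc_ill":    [0.0, 0.2, 0.9],
}

def top_k(query_embedding, k=2):
    """Return the IDs of the k documents most similar to the query."""
    scored = sorted(
        doc_embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

print(top_k([0.85, 0.15, 0.05]))  # → ['doc_citing', 'doc_hours']
```

In the Space itself, `collection.query(query_embeddings=..., n_results=...)` on the ChromaDB collection plays the role of `top_k` here.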
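For the "Generation" step, the retrieved chunks and the original query are combined into a single prompt for `google/gemma-3-27b-it`. A hedged sketch of that assembly — the prompt wording and the `build_prompt` helper are illustrative assumptions, not the Space's actual code, and the Inference API call is left in comments because it needs network access and the `HF_TOKEN` secret:

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Combine retrieved document chunks with the user's question.

    The prompt template here is an assumption for illustration;
    the Space's real template may differ.
    """
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

prompt = build_prompt(
    "How do I request an interlibrary loan?",
    ["ILL requests are submitted through the library portal."],
)

# The actual generation call goes through the Hugging Face Inference API,
# roughly along these lines (requires the HF_TOKEN Space Secret):
#
#   import os
#   from huggingface_hub import InferenceClient
#   client = InferenceClient(model="google/gemma-3-27b-it",
#                            token=os.environ["HF_TOKEN"])
#   response = client.chat_completion(
#       messages=[{"role": "user", "content": prompt}])
```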