Anurag Shirke committed on
Commit 7ed2970 · 1 Parent(s): eefb354

Refactoring Qdrant and Backend Startup

Dockerfile CHANGED
@@ -2,6 +2,9 @@
 # Use an official Python runtime as a parent image
 FROM python:3.11-slim
 
+# Install curl for the wait script
+RUN apt-get update && apt-get install -y curl
+
 # Set the working directory in the container
 WORKDIR /app
 
SUMMARY.md ADDED
@@ -0,0 +1,60 @@
+# Project Summary: Phases 1 & 2
+
+This document summarizes the work completed in the first two phases of the RAG Knowledge Assistant project.
+
+---
+
+## Phase 1: Research & Setup
+
+Phase 1 focused on establishing a fully containerized and automated local development environment.
+
+### Key Achievements:
+
+1. **Project Structure:**
+   - `src/`: Contains all the Python source code for the backend API.
+   - `uploads/`: A directory for temporarily storing uploaded files during processing.
+   - `scripts/`: Holds utility scripts, such as the automated model puller for Ollama.
+
+2. **Dependency Management:**
+   - A `requirements.txt` file was created to manage all Python dependencies, including FastAPI, LangChain, Qdrant, and Sentence-Transformers.
+
+3. **Containerization with Docker:**
+   - A `Dockerfile` was written to create a container image for our FastAPI application.
+   - A `docker-compose.yml` file orchestrates all the necessary services:
+     - `backend`: Our FastAPI application.
+     - `qdrant`: The vector database for storing document embeddings.
+     - `ollama`: The service for running the open-source LLM.
+
+4. **Automated Model Pulling:**
+   - An entrypoint script (`scripts/ollama_entrypoint.sh`) was created to automatically pull the `llama3` model when the Ollama container starts. This ensures the LLM is ready without manual intervention.
+
+---
+
+## Phase 2: Backend API MVP
+
+Phase 2 focused on building the core functionality of the knowledge assistant, resulting in a functional RAG pipeline accessible via a REST API.
+
+### Key Achievements:
+
+1. **Modular Codebase:**
+   - The `src/core/` directory was created to organize the application's business logic into separate, manageable modules:
+     - `processing.py`: Handles PDF parsing, text chunking, and embedding model loading.
+     - `vector_store.py`: Manages all interactions with the Qdrant database (creation, upserting, searching).
+     - `llm.py`: Handles all interactions with the Ollama LLM service (prompt formatting, response generation).
+     - `models.py`: Defines the Pydantic models for API request and response data structures.
+
+2. **API Endpoints Implemented:**
+   - **`GET /health`**: A simple endpoint to confirm that the API is running.
+   - **`POST /upload`**: Implements the full document ingestion pipeline:
+     1. Receives and validates a PDF file.
+     2. Extracts text using `PyMuPDF`.
+     3. Splits the text into smaller, overlapping chunks using `LangChain`.
+     4. Generates vector embeddings for each chunk using `sentence-transformers`.
+     5. Upserts the chunks and their embeddings into the Qdrant database.
+   - **`POST /query`**: Implements the complete RAG pipeline to answer questions:
+     1. Receives a JSON object with a `query` string.
+     2. Generates an embedding for the query.
+     3. Searches Qdrant to retrieve the most relevant document chunks (Retrieval).
+     4. Constructs a detailed prompt containing the user's query and the retrieved context.
+     5. Sends the prompt to the `llama3` model via Ollama to get an answer (Augmented Generation).
+     6. Returns the generated answer along with the source documents used for context.
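The `/upload` pipeline summarized above maps onto a handful of library calls. Below is a minimal sketch of that ingestion flow; the collection name `documents` and the `all-MiniLM-L6-v2` embedding model (384-dimensional vectors) are illustrative assumptions, since the actual choices live in `src/core/processing.py` and `src/core/vector_store.py`, which are not part of this diff.

```python
# Sketch of the /upload ingestion flow: extract, chunk, embed, upsert.
# Collection name and model choice here are assumptions for illustration.
import fitz  # PyMuPDF
from langchain.text_splitter import RecursiveCharacterTextSplitter
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer


def ingest_pdf(path: str) -> None:
    # 2. Extract text from every page of the PDF
    doc = fitz.open(path)
    text = "".join(page.get_text() for page in doc)

    # 3. Split into smaller, overlapping chunks
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_text(text)

    # 4. Generate a vector embedding for each chunk
    model = SentenceTransformer("all-MiniLM-L6-v2")
    vectors = model.encode(chunks)

    # 5. Upsert chunks and embeddings into Qdrant
    client = QdrantClient(host="qdrant", port=6333)
    client.recreate_collection(
        collection_name="documents",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )
    client.upsert(
        collection_name="documents",
        points=[
            PointStruct(id=i, vector=vec.tolist(), payload={"text": chunk})
            for i, (chunk, vec) in enumerate(zip(chunks, vectors))
        ],
    )
```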
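The `/query` RAG flow can be sketched under the same assumptions. Ollama's standard `/api/generate` endpoint is used here; the prompt wording and `limit=3` are placeholders, as the real prompt construction lives in `llm.py`.

```python
# Sketch of the /query RAG flow: embed, retrieve, prompt, generate.
import requests
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

client = QdrantClient(host="qdrant", port=6333)
model = SentenceTransformer("all-MiniLM-L6-v2")


def answer(query: str) -> dict:
    # 1-2. Embed the incoming query string
    query_vector = model.encode(query).tolist()

    # 3. Retrieval: fetch the most relevant chunks from Qdrant
    hits = client.search(collection_name="documents", query_vector=query_vector, limit=3)
    context = "\n\n".join(hit.payload["text"] for hit in hits)

    # 4. Construct a prompt containing the query and the retrieved context
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

    # 5. Augmented generation: send the prompt to llama3 via Ollama
    resp = requests.post(
        "http://ollama:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=120,
    )

    # 6. Return the answer along with the source chunks used as context
    return {
        "answer": resp.json()["response"],
        "sources": [hit.payload["text"] for hit in hits],
    }
```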
docker-compose.yml CHANGED
@@ -8,12 +8,14 @@ services:
       - "8000:8000"
     volumes:
       - ./src:/app/src
+      - ./scripts:/app/scripts
     depends_on:
       - qdrant
       - ollama
     environment:
       - QDRANT_HOST=qdrant
       - OLLAMA_HOST=ollama
+    entrypoint: ["/app/scripts/wait-for-qdrant.sh", "qdrant:6333", "--", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
 
   qdrant:
     image: qdrant/qdrant:latest
scripts/ollama_entrypoint.sh CHANGED
@@ -1,5 +1,9 @@
 #!/bin/bash
 
+# The base ollama image doesn't have curl, so we install it.
+# This requires root privileges, which the container runs with by default.
+apt-get update && apt-get install -y curl
+
 # Start Ollama in the background
 /bin/ollama serve &
 
scripts/wait-for-qdrant.sh ADDED
@@ -0,0 +1,20 @@
+#!/bin/sh
+# wait-for-qdrant.sh
+
+set -e
+
+host="$1"
+shift
+# Drop the optional "--" separator between the host and the command
+if [ "$1" = "--" ]; then
+  shift
+fi
+
+# Loop until the Qdrant health check endpoint is reachable
+until curl -s -f "$host/healthz" > /dev/null; do
+  >&2 echo "Qdrant is unavailable - sleeping"
+  sleep 1
+done
+
+>&2 echo "Qdrant is up - executing command"
+exec "$@"
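The shell wrapper above is one common readiness pattern. Purely as an illustration, the same gate could be expressed in Python inside the backend itself, polling the same `/healthz` endpoint before startup continues; the function name, URL default, and timeout here are hypothetical, not part of this commit.

```python
# Hypothetical Python equivalent of wait-for-qdrant.sh: poll Qdrant's
# /healthz endpoint until it responds, then let startup proceed.
import sys
import time

import requests


def wait_for_qdrant(url: str = "http://qdrant:6333/healthz", timeout_s: int = 60) -> None:
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if requests.get(url, timeout=2).ok:
                print("Qdrant is up", file=sys.stderr)
                return
        except requests.ConnectionError:
            pass  # Qdrant not accepting connections yet
        print("Qdrant is unavailable - sleeping", file=sys.stderr)
        time.sleep(1)
    raise TimeoutError(f"Qdrant not reachable at {url}")
```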
src/__pycache__/main.cpython-311.pyc CHANGED
Binary files a/src/__pycache__/main.cpython-311.pyc and b/src/__pycache__/main.cpython-311.pyc differ
 
src/core/__pycache__/llm.cpython-311.pyc ADDED
Binary file (1.96 kB)
 
src/core/__pycache__/models.cpython-311.pyc ADDED
Binary file (791 Bytes)
 
src/core/__pycache__/processing.cpython-311.pyc ADDED
Binary file (1.49 kB)
 
src/core/__pycache__/vector_store.cpython-311.pyc ADDED
Binary file (2.25 kB)