Refactor vector database integration to use Qdrant and update related configurations
- Replaced VectorDatabase with QdrantVectorDatabase across the application for improved vector similarity search.
- Updated docker-compose.yml to include Qdrant service with necessary configurations.
- Enhanced README.md to document Qdrant integration and its configuration options.
- Added new method in EmbeddingModel to retrieve embedding dimensions.
- Adjusted API and service files to accommodate Qdrant settings and ensure compatibility.
- README.md +24 -1
- aimakerspace/openai_utils/embedding.py +15 -0
- aimakerspace/qdrant_vectordb.py +201 -0
- api/config.py +8 -0
- api/main.py +19 -3
- api/requirements.txt +2 -1
- api/routers/document.py +11 -2
- api/services/pipeline.py +2 -2
- app.py +3 -3
- docker-compose.yml +23 -1
- scripts/run_qdrant.sh +26 -0
README.md
CHANGED
@@ -441,4 +441,27 @@ OPENAI_API_KEY=your_openai_api_key
 - Document processing with text chunking
 - Semantic search using embeddings
 - Question answering with LLM (OpenAI models)
 - Real-time chat interface
+
+## Vector Database
+
+This application now uses [Qdrant](https://github.com/qdrant/qdrant-client) as its vector database.
+Qdrant is a high-performance vector similarity search engine that stores both vectors and their metadata.
+
+### Features:
+- Fast vector search with HNSW index
+- Filtering support during search
+- Persisted storage of vectors and metadata
+- Both in-memory and disk-based options
+
+### Configuration:
+The following environment variables can be used to configure Qdrant:
+- `QDRANT_HOST`: Host of the Qdrant server (default: localhost)
+- `QDRANT_PORT`: HTTP port of the Qdrant server (default: 6333)
+- `QDRANT_GRPC_PORT`: gRPC port of the Qdrant server (default: 6334)
+- `QDRANT_PREFER_GRPC`: Whether to prefer gRPC over HTTP (default: true)
+- `QDRANT_COLLECTION`: Base name for collections (default: documents)
+- `QDRANT_IN_MEMORY`: Whether to use in-memory storage (default: true)
+
+When running with Docker, the application automatically connects to the Qdrant service
+defined in the docker-compose.yml file.
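For a local run against a standalone Qdrant server rather than in-memory storage, the variables documented above could be set in a `.env` file; the values below are illustrative, not part of the commit:

```
QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_GRPC_PORT=6334
QDRANT_PREFER_GRPC=true
QDRANT_COLLECTION=documents
QDRANT_IN_MEMORY=false
```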
aimakerspace/openai_utils/embedding.py
CHANGED
@@ -48,6 +48,21 @@ class EmbeddingModel:
 
         return embedding.data[0].embedding
 
+    def get_embedding_dimension(self) -> int:
+        """Get the dimension of the embedding model
+
+        Returns:
+            int: Dimension of the embedding model
+        """
+        # Dimensions for OpenAI models
+        dimensions = {
+            "text-embedding-3-small": 1536,
+            "text-embedding-3-large": 3072,
+            "text-embedding-ada-002": 1536,
+        }
+
+        return dimensions.get(self.embeddings_model_name, 1536)  # Default to 1536
+
 
 if __name__ == "__main__":
     embedding_model = EmbeddingModel()
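The new method is a static table lookup. As a standalone sketch (dimensions copied from the diff; `embedding_dimension` is a hypothetical free-function stand-in for the method):

```python
# Known OpenAI embedding sizes, as hard-coded in get_embedding_dimension()
DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def embedding_dimension(model_name: str) -> int:
    # Unrecognized model names fall back to 1536, matching the diff's default
    return DIMENSIONS.get(model_name, 1536)

print(embedding_dimension("text-embedding-3-large"))  # 3072
print(embedding_dimension("an-unknown-model"))        # 1536
```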
aimakerspace/qdrant_vectordb.py
ADDED
@@ -0,0 +1,201 @@
+import numpy as np
+from typing import List, Tuple, Callable, Dict, Any, Optional
+import asyncio
+import uuid
+
+from qdrant_client import QdrantClient, AsyncQdrantClient
+from qdrant_client.http import models
+from qdrant_client.http.models import Distance, VectorParams, PointStruct
+
+from aimakerspace.openai_utils.embedding import EmbeddingModel
+
+
+class QdrantVectorDatabase:
+    """
+    Qdrant vector database implementation that follows the same interface
+    as the in-memory VectorDatabase class.
+    """
+    def __init__(self,
+                 collection_name: str = "documents",
+                 embedding_model: EmbeddingModel = None,
+                 host: str = "localhost",
+                 port: int = 6333,
+                 grpc_port: int = 6334,
+                 prefer_grpc: bool = True,
+                 in_memory: bool = True):
+        """
+        Initialize QdrantVectorDatabase
+
+        Args:
+            collection_name: Name of the collection to use
+            embedding_model: Embedding model to use
+            host: Qdrant server host
+            port: Qdrant server port
+            grpc_port: Qdrant server gRPC port
+            prefer_grpc: Whether to prefer gRPC over HTTP
+            in_memory: Whether to use in-memory storage
+        """
+        self.collection_name = collection_name
+        self.embedding_model = embedding_model or EmbeddingModel()
+        self.in_memory = in_memory
+
+        if in_memory:
+            self.client = QdrantClient(":memory:")
+            self.async_client = AsyncQdrantClient(":memory:")
+        else:
+            self.client = QdrantClient(
+                host=host,
+                port=port,
+                grpc_port=grpc_port,
+                prefer_grpc=prefer_grpc
+            )
+            self.async_client = AsyncQdrantClient(
+                host=host,
+                port=port,
+                grpc_port=grpc_port,
+                prefer_grpc=prefer_grpc
+            )
+
+        # Store mapping from keys to ids
+        self.key_to_id: Dict[str, str] = {}
+        self.id_to_key: Dict[str, str] = {}
+
+        # Create collection if it doesn't exist
+        vector_size = self.embedding_model.get_embedding_dimension()
+        self._ensure_collection(vector_size)
+
+    def _ensure_collection(self, vector_size: int):
+        """Ensure collection exists"""
+        collections = self.client.get_collections().collections
+        collection_names = [c.name for c in collections]
+
+        if self.collection_name not in collection_names:
+            self.client.create_collection(
+                collection_name=self.collection_name,
+                vectors_config=VectorParams(
+                    size=vector_size,
+                    distance=Distance.COSINE
+                )
+            )
+
+    def insert(self, key: str, vector: np.array) -> None:
+        """Insert a vector into the database"""
+        # Generate a unique ID for this key
+        point_id = str(uuid.uuid4())
+
+        # Store the mapping
+        self.key_to_id[key] = point_id
+        self.id_to_key[point_id] = key
+
+        # Insert the point
+        self.client.upsert(
+            collection_name=self.collection_name,
+            points=[
+                PointStruct(
+                    id=point_id,
+                    vector=vector.tolist(),
+                    payload={"text": key}
+                )
+            ]
+        )
+
+    def search(
+        self,
+        query_vector: np.array,
+        k: int,
+        distance_measure: Callable = None,  # Ignored, Qdrant uses its own distance measure
+    ) -> List[Tuple[str, float]]:
+        """Search for similar vectors"""
+        # Convert query_vector to list if it's a numpy array
+        if hasattr(query_vector, 'tolist'):
+            query_vector_list = query_vector.tolist()
+        else:
+            # If it's already a list or another iterable, convert to list to be safe
+            query_vector_list = list(query_vector)
+
+        search_result = self.client.search(
+            collection_name=self.collection_name,
+            query_vector=query_vector_list,
+            limit=k
+        )
+
+        results = []
+        for scored_point in search_result:
+            point_id = scored_point.id
+            score = scored_point.score
+            # Get the key from the id
+            if point_id in self.id_to_key:
+                key = self.id_to_key[point_id]
+                results.append((key, score))
+
+        return results
+
+    def search_by_text(
+        self,
+        query_text: str,
+        k: int,
+        distance_measure: Callable = None,  # Ignored, Qdrant uses its own distance measure
+        return_as_text: bool = False,
+    ) -> List[Tuple[str, float]]:
+        """Search by text query"""
+        query_vector = self.embedding_model.get_embedding(query_text)
+        results = self.search(query_vector, k, distance_measure)
+        return [result[0] for result in results] if return_as_text else results
+
+    def retrieve_from_key(self, key: str) -> Optional[np.array]:
+        """Retrieve a vector by key"""
+        if key not in self.key_to_id:
+            return None
+
+        point_id = self.key_to_id[key]
+        points = self.client.retrieve(
+            collection_name=self.collection_name,
+            ids=[point_id]
+        )
+
+        if not points:
+            return None
+
+        return np.array(points[0].vector)
+
+    async def abuild_from_list(self, list_of_text: List[str]) -> "QdrantVectorDatabase":
+        """Build database from a list of texts"""
+        embeddings = await self.embedding_model.async_get_embeddings(list_of_text)
+
+        # Generate unique IDs for each text
+        point_ids = [str(uuid.uuid4()) for _ in range(len(list_of_text))]
+
+        # Store mappings
+        for text, point_id in zip(list_of_text, point_ids):
+            self.key_to_id[text] = point_id
+            self.id_to_key[point_id] = text
+
+        # Prepare points for batch insertion
+        points = [
+            PointStruct(
+                id=point_id,
+                vector=embedding,
+                payload={"text": text}
+            )
+            for point_id, text, embedding in zip(point_ids, list_of_text, embeddings)
+        ]
+
+        # Use batched upsert for efficiency
+        batch_size = 100
+        for i in range(0, len(points), batch_size):
+            batch = points[i:i+batch_size]
+            await self.async_client.upsert(
+                collection_name=self.collection_name,
+                points=batch
+            )
+
+        return self
+
+    def get_all_texts(self) -> List[str]:
+        """
+        Returns all the text documents stored in the vector database.
+
+        Returns:
+            List[str]: A list of all text documents
+        """
+        return list(self.key_to_id.keys())
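One detail worth noting in `abuild_from_list` is the batched upsert: points go to Qdrant in slices of at most 100. The slicing itself is plain Python and can be sketched in isolation (`batches` is a hypothetical helper mirroring the loop, not a function in the module):

```python
def batches(points: list, batch_size: int = 100) -> list:
    # Same slicing as the upsert loop: points[i:i + batch_size] never
    # overruns the list, so the last batch may simply be shorter.
    return [points[i:i + batch_size] for i in range(0, len(points), batch_size)]

chunks = batches(list(range(250)))
print([len(c) for c in chunks])  # [100, 100, 50]
```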
api/config.py
CHANGED
@@ -10,6 +10,14 @@ STATIC_DIR = "static"
 # Settings
 DEFAULT_NUM_SEARCH_RESULTS = 4  # Number of search results to return by default
 
+# Qdrant settings
+QDRANT_HOST = os.getenv("QDRANT_HOST", "localhost")
+QDRANT_PORT = int(os.getenv("QDRANT_PORT", 6333))
+QDRANT_GRPC_PORT = int(os.getenv("QDRANT_GRPC_PORT", 6334))
+QDRANT_PREFER_GRPC = os.getenv("QDRANT_PREFER_GRPC", "True").lower() == "true"
+QDRANT_COLLECTION = os.getenv("QDRANT_COLLECTION", "documents")
+QDRANT_IN_MEMORY = os.getenv("QDRANT_IN_MEMORY", "True").lower() == "true"
+
 # Get env vars with defaults
 PORT = int(os.getenv("PORT", 8000))
 HOST = os.getenv("HOST", "0.0.0.0")
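The settings above all arrive as strings from the environment, so ports are coerced with `int()` and the flags compare `.lower()` against `"true"`. A minimal sketch of that coercion (`qdrant_settings` is a hypothetical helper written so the parsing can be exercised without touching `os.environ`):

```python
def qdrant_settings(env: dict) -> dict:
    # Mirrors api/config.py: ints for ports, case-insensitive booleans,
    # and the same defaults when a variable is unset.
    return {
        "host": env.get("QDRANT_HOST", "localhost"),
        "port": int(env.get("QDRANT_PORT", 6333)),
        "grpc_port": int(env.get("QDRANT_GRPC_PORT", 6334)),
        "prefer_grpc": env.get("QDRANT_PREFER_GRPC", "True").lower() == "true",
        "in_memory": env.get("QDRANT_IN_MEMORY", "True").lower() == "true",
    }

# Any casing of "false" disables a flag; unset flags default to True
print(qdrant_settings({"QDRANT_PORT": "6335", "QDRANT_PREFER_GRPC": "FALSE"}))
```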
api/main.py
CHANGED
@@ -16,13 +16,22 @@ from aimakerspace.openai_utils.prompts import (
     UserRolePrompt,
     SystemRolePrompt
 )
-from aimakerspace.
+from aimakerspace.qdrant_vectordb import QdrantVectorDatabase
 from aimakerspace.openai_utils.chatmodel import ChatOpenAI
 
 # API Version information
 API_VERSION = "0.2.0"
 BUILD_DATE = "2024-06-14"  # Update this when making significant changes
 
+# Qdrant settings from environment variables
+import os
+QDRANT_HOST = os.getenv("QDRANT_HOST", "localhost")
+QDRANT_PORT = int(os.getenv("QDRANT_PORT", 6333))
+QDRANT_GRPC_PORT = int(os.getenv("QDRANT_GRPC_PORT", 6334))
+QDRANT_PREFER_GRPC = os.getenv("QDRANT_PREFER_GRPC", "True").lower() == "true"
+QDRANT_COLLECTION = os.getenv("QDRANT_COLLECTION", "documents")
+QDRANT_IN_MEMORY = os.getenv("QDRANT_IN_MEMORY", "True").lower() == "true"
+
 app = FastAPI(
     title="Quick Understand API",
     description="RAG-based question answering API for document understanding",
@@ -199,7 +208,14 @@ async def upload_file(
     texts = text_splitter.split_texts(documents)
 
     # Create vector database
-    vector_db =
+    vector_db = QdrantVectorDatabase(
+        collection_name=f"{QDRANT_COLLECTION}_{session_id}",
+        host=QDRANT_HOST,
+        port=QDRANT_PORT,
+        grpc_port=QDRANT_GRPC_PORT,
+        prefer_grpc=QDRANT_PREFER_GRPC,
+        in_memory=QDRANT_IN_MEMORY
+    )
     vector_db = await vector_db.abuild_from_list(texts)
 
     # Create chat model
@@ -766,7 +782,7 @@ async def catch_all(path: str):
     return FileResponse("static/index.html")
 
 class RetrievalAugmentedQAPipeline:
-    def __init__(self, llm: ChatOpenAI, vector_db_retriever:
+    def __init__(self, llm: ChatOpenAI, vector_db_retriever: QdrantVectorDatabase,
                  system_template: str = DEFAULT_SYSTEM_TEMPLATE,
                  user_template: str = DEFAULT_USER_TEMPLATE) -> None:
         self.llm = llm
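Both upload paths name the collection `f"{QDRANT_COLLECTION}_{session_id}"`, so each upload session gets its own isolated Qdrant collection. As a trivial sketch (`session_collection` is a hypothetical helper for illustration only):

```python
def session_collection(base: str, session_id: str) -> str:
    # One collection per session keeps documents from different
    # uploads from mixing in search results.
    return f"{base}_{session_id}"

print(session_collection("documents", "a1b2c3"))  # documents_a1b2c3
```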
api/requirements.txt
CHANGED
@@ -6,4 +6,5 @@ openai==1.59.9
 pydantic==2.10.1
 pypdf2==3.0.1
 python-dotenv==1.0.1
-websockets==14.2
+websockets==14.2
+qdrant-client==1.13.0
api/routers/document.py
CHANGED
@@ -5,8 +5,10 @@ from fastapi import APIRouter, UploadFile, File, Form, HTTPException, Request, R
 from typing import Dict, List
 
 from aimakerspace.text_utils import CharacterTextSplitter, TextFileLoader, PDFLoader
+from aimakerspace.openai_utils.embedding import EmbeddingModel
 from aimakerspace.openai_utils.chatmodel import ChatOpenAI
-from aimakerspace.
+from aimakerspace.qdrant_vectordb import QdrantVectorDatabase
+from api.config import QDRANT_HOST, QDRANT_PORT, QDRANT_GRPC_PORT, QDRANT_PREFER_GRPC, QDRANT_COLLECTION, QDRANT_IN_MEMORY
 
 from api.models.pydantic_models import DocumentSummaryRequest, DocumentSummaryResponse
 from api.services.pipeline import RetrievalAugmentedQAPipeline
@@ -66,7 +68,14 @@ async def upload_file(
     texts = text_splitter.split_texts(documents)
 
     # Create vector database
-    vector_db =
+    vector_db = QdrantVectorDatabase(
+        collection_name=f"{QDRANT_COLLECTION}_{session_id}",
+        host=QDRANT_HOST,
+        port=QDRANT_PORT,
+        grpc_port=QDRANT_GRPC_PORT,
+        prefer_grpc=QDRANT_PREFER_GRPC,
+        in_memory=QDRANT_IN_MEMORY
+    )
     vector_db = await vector_db.abuild_from_list(texts)
 
     # Create chat model
api/services/pipeline.py
CHANGED
@@ -1,10 +1,10 @@
 from typing import List, Dict, Any
 from aimakerspace.openai_utils.prompts import SystemRolePrompt, UserRolePrompt
 from aimakerspace.openai_utils.chatmodel import ChatOpenAI
-from aimakerspace.
+from aimakerspace.qdrant_vectordb import QdrantVectorDatabase
 
 class RetrievalAugmentedQAPipeline:
-    def __init__(self, llm: ChatOpenAI, vector_db_retriever:
+    def __init__(self, llm: ChatOpenAI, vector_db_retriever: QdrantVectorDatabase,
                  system_template: str,
                  user_template: str) -> None:
         self.llm = llm
app.py
CHANGED
@@ -8,7 +8,7 @@ from aimakerspace.openai_utils.prompts import (
     AssistantRolePrompt,
 )
 from aimakerspace.openai_utils.embedding import EmbeddingModel
-from aimakerspace.
+from aimakerspace.qdrant_vectordb import QdrantVectorDatabase
 from aimakerspace.openai_utils.chatmodel import ChatOpenAI
 import chainlit as cl
 
@@ -26,7 +26,7 @@ Question:
 user_role_prompt = UserRolePrompt(user_prompt_template)
 
 class RetrievalAugmentedQAPipeline:
-    def __init__(self, llm: ChatOpenAI(), vector_db_retriever:
+    def __init__(self, llm: ChatOpenAI(), vector_db_retriever: QdrantVectorDatabase) -> None:
         self.llm = llm
         self.vector_db_retriever = vector_db_retriever
 
@@ -108,7 +108,7 @@ async def on_chat_start():
     print(f"Processing {len(texts)} text chunks")
 
     # Create a dict vector store
-    vector_db =
+    vector_db = QdrantVectorDatabase(collection_name="chainlit_documents")
     vector_db = await vector_db.abuild_from_list(texts)
 
     chat_openai = ChatOpenAI()
docker-compose.yml
CHANGED
@@ -1,6 +1,17 @@
 version: '3.8'
 
 services:
+  qdrant:
+    image: qdrant/qdrant:latest
+    ports:
+      - "6333:6333"
+      - "6334:6334"
+    volumes:
+      - qdrant_data:/qdrant/storage
+    environment:
+      - QDRANT_ALLOW_ORIGIN=*
+    restart: unless-stopped
+
   rag-app:
     build:
       context: .
@@ -12,6 +23,17 @@ services:
       - OPENAI_API_KEY=${OPENAI_API_KEY}
      - PORT=7860
      - HOST=0.0.0.0
+      - QDRANT_HOST=qdrant
+      - QDRANT_PORT=6333
+      - QDRANT_GRPC_PORT=6334
+      - QDRANT_PREFER_GRPC=true
+      - QDRANT_COLLECTION=documents
+      - QDRANT_IN_MEMORY=false
     env_file:
       - .env
-
+    depends_on:
+      - qdrant
+    restart: unless-stopped
+
+volumes:
+  qdrant_data:
scripts/run_qdrant.sh
ADDED
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+echo "Starting Qdrant vector database..."
+
+# Check if container is already running
+if docker ps | grep -q "qdrant-server"; then
+    echo "Qdrant is already running."
+else
+    # Create a Docker volume for persistence
+    docker volume create qdrant_data
+
+    # Run Qdrant
+    docker run -d --name qdrant-server \
+        -p 6333:6333 \
+        -p 6334:6334 \
+        -v qdrant_data:/qdrant/storage \
+        -e QDRANT_ALLOW_ORIGIN="*" \
+        qdrant/qdrant:latest
+
+    echo "Qdrant started on ports 6333 (HTTP) and 6334 (gRPC)"
+fi
+
+echo "Qdrant is now available at http://localhost:6333"
+echo "Use Ctrl+C to exit this script (Qdrant will continue running in the background)"
+echo "To stop Qdrant later, run: docker stop qdrant-server"
+echo "To remove the container, run: docker rm qdrant-server"