This document outlines the API endpoints for managing Retrieval-Augmented Generation (RAG) components in PySpur.

**Description**: Creates a new document collection from uploaded files and metadata. The files are processed asynchronously in the background.

**URL**: `/rag/collections/`

**Method**: POST

**Form Data**:
```python
files: List[UploadFile]
metadata: str
```

Where `metadata` is a JSON string representing:
```python
class DocumentCollectionCreateSchema:
    name: str
    description: str
    text_processing: ChunkingConfigSchema
```

**Response Schema**:
```python
class DocumentCollectionResponseSchema:
    id: str
    name: str
    description: str
    status: str
    created_at: str
    updated_at: str
    document_count: int
    chunk_count: int
    error_message: Optional[str]
```

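Because `metadata` travels as a JSON string inside the form data rather than as a JSON body, a client has to serialize it before uploading. The following is a minimal sketch of that step. The helper name `build_collection_metadata` is hypothetical, and since the fields of `ChunkingConfigSchema` are not listed in this section, `text_processing` is passed through as an opaque dictionary.

```python
import json
from typing import Any, Dict


def build_collection_metadata(
    name: str, description: str, text_processing: Dict[str, Any]
) -> str:
    """Serialize the fields of DocumentCollectionCreateSchema to a JSON string.

    `text_processing` must follow ChunkingConfigSchema, whose fields are not
    shown in this section, so it is passed through unchanged.
    """
    return json.dumps(
        {
            "name": name,
            "description": description,
            "text_processing": text_processing,
        }
    )


# The empty dict is only a placeholder for a real chunking configuration.
metadata = build_collection_metadata("handbook", "Employee handbook PDFs", {})
form_fields = {"metadata": metadata}  # sent alongside the `files` parts
```

The resulting string goes into the `metadata` form field, next to one `files` part per uploaded file.
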
**Description**: Lists all document collections.

**URL**: `/rag/collections/`

**Method**: GET

**Response Schema**:
```python
List[DocumentCollectionResponseSchema]
```

**Description**: Gets details of a specific document collection.

**URL**: `/rag/collections/{collection_id}/`

**Method**: GET

**Parameters**:
```python
collection_id: str
```

**Response Schema**:
```python
class DocumentCollectionResponseSchema:
    id: str
    name: str
    description: str
    status: str
    created_at: str
    updated_at: str
    document_count: int
    chunk_count: int
    error_message: Optional[str]
```

**Description**: Deletes a document collection and its associated data.

**URL**: `/rag/collections/{collection_id}/`

**Method**: DELETE

**Parameters**:
```python
collection_id: str
```

**Response**: 200 OK with message

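As an illustration of how the path parameter fits into the URL, here is a small sketch that builds (without sending) the DELETE request using only the standard library. The base URL `http://localhost:8000` and the helper name `delete_collection_request` are assumptions for the example, not part of the documented API.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed base URL of a local deployment; adjust to where the API is served.
BASE_URL = "http://localhost:8000"


def delete_collection_request(
    collection_id: str, base_url: str = BASE_URL
) -> Request:
    """Build, but do not send, the DELETE request for one collection."""
    # quote() keeps an unusual ID from breaking the path.
    url = f"{base_url}/rag/collections/{quote(collection_id, safe='')}/"
    return Request(url, method="DELETE")


req = delete_collection_request("abc123")
# Pass `req` to urllib.request.urlopen() to actually perform the deletion.
```
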
**Description**: Gets the processing progress of a document collection.

**URL**: `/rag/collections/{collection_id}/progress/`

**Method**: GET

**Parameters**:
```python
collection_id: str
```

**Response Schema**:
```python
class ProcessingProgressSchema:
    id: str
    status: str
    progress: float
    current_step: Optional[str]
    total_files: Optional[int]
    processed_files: Optional[int]
    total_chunks: Optional[int]
    processed_chunks: Optional[int]
    error_message: Optional[str]
    created_at: str
    updated_at: str
```

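Since processing runs in the background, a client typically polls this endpoint until the job reaches a final state. The sketch below shows one way to structure that loop. The terminal status values `"completed"` and `"failed"` are assumptions (this section does not enumerate the possible values of `status`), and `fetch_progress` is a hypothetical callable standing in for the actual GET request.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal status values; the section does not enumerate them.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_processing(
    fetch_progress: Callable[[], Dict[str, Any]],
    poll_seconds: float = 2.0,
    max_polls: int = 100,
) -> Dict[str, Any]:
    """Poll until the reported status is terminal, then return the last payload.

    `fetch_progress` should GET the progress endpoint and return the decoded
    JSON, shaped like ProcessingProgressSchema.
    """
    for _ in range(max_polls):
        progress = fetch_progress()
        if progress["status"] in TERMINAL_STATUSES:
            return progress
        time.sleep(poll_seconds)
    raise TimeoutError("processing did not finish within the polling budget")


# Stand-in for real HTTP calls: two in-flight responses, then a final one.
# All values here are made up for the demonstration.
_fake_responses = iter(
    [
        {"id": "c1", "status": "processing", "progress": 0.2},
        {"id": "c1", "status": "processing", "progress": 0.7},
        {"id": "c1", "status": "completed", "progress": 1.0},
    ]
)
final = wait_for_processing(lambda: next(_fake_responses), poll_seconds=0.0)
```

Capping the number of polls keeps a stuck job from blocking the caller forever; after the loop returns, `error_message` is worth checking whenever the final status indicates a failure.
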
**Description**: Adds documents to an existing collection. The documents are processed asynchronously in the background.

**URL**: `/rag/collections/{collection_id}/documents/`

**Method**: POST

**Parameters**:
```python
collection_id: str
```

**Form Data**:
```python
files: List[UploadFile]
```

**Response Schema**:
```python
class DocumentCollectionResponseSchema:
    ...  # same fields as in the create-collection response above
```

**Description**: Gets all documents and their chunks for a collection.

**URL**: `/rag/collections/{collection_id}/documents/`

**Method**: GET

**Parameters**:
```python
collection_id: str
```

**Response Schema**:
```python
List[DocumentWithChunksSchema]
```

Where `DocumentWithChunksSchema` contains:
```python
class DocumentWithChunksSchema:
    id: str
    title: str
    metadata: Dict[str, Any]
    chunks: List[DocumentChunkSchema]
```

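The nested response can be summarized with a few lines of client code. As a rough sketch, the hypothetical helper `chunks_per_document` below counts the chunks of each document in a decoded response. The sample data is made up, and chunk contents are left as empty dictionaries because `DocumentChunkSchema` is not spelled out in this section.

```python
from typing import Any, Dict, List


def chunks_per_document(documents: List[Dict[str, Any]]) -> Dict[str, int]:
    """Map each document ID to the number of chunks returned for it."""
    return {doc["id"]: len(doc["chunks"]) for doc in documents}


# Made-up response body; chunk contents are left as empty dicts because
# DocumentChunkSchema is not spelled out in this section.
sample_response = [
    {"id": "d1", "title": "intro.pdf", "metadata": {}, "chunks": [{}, {}, {}]},
    {"id": "d2", "title": "faq.md", "metadata": {}, "chunks": [{}]},
]
counts = chunks_per_document(sample_response)
```
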
**Description**: Deletes a document from a collection.

**URL**: `/rag/collections/{collection_id}/documents/{document_id}/`

**Method**: DELETE

**Parameters**:
```python
collection_id: str
document_id: str
```

**Response**: 200 OK with message

**Description**: Previews how a document would be chunked with a given configuration.

**URL**: `/rag/collections/preview_chunk/`

**Method**: POST

**Form Data**:
```python
file: UploadFile
chunking_config: str
```

**Response Schema**:
```python
{
    "chunks": List[Dict[str, Any]],
    "total_chunks": int
}
```

**Description**: Creates a new vector index from a document collection. The index is created asynchronously in the background.

**URL**: `/rag/indices/`

**Method**: POST

**Request Payload**:
```python
class VectorIndexCreateSchema:
    name: str
    description: str
    collection_id: str
    embedding: EmbeddingConfigSchema
```

**Response Schema**:
```python
class VectorIndexResponseSchema:
    id: str
    name: str
    description: str
    collection_id: str
    status: str
    created_at: str
    updated_at: str
    document_count: int
    chunk_count: int
    embedding_model: str
    vector_db: str
    error_message: Optional[str]
```

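Unlike the collection endpoints, which take form data, this endpoint is described as taking a request payload. The sketch below assumes that payload is sent as a JSON body and builds (without sending) the request with the standard library. The base URL, the helper name `create_index_request`, and the JSON content type are assumptions; `embedding` is passed through as an opaque dictionary because the fields of `EmbeddingConfigSchema` are not listed here.

```python
import json
from typing import Any, Dict
from urllib.request import Request

# Assumed base URL of a local deployment; adjust to where the API is served.
BASE_URL = "http://localhost:8000"


def create_index_request(
    name: str,
    description: str,
    collection_id: str,
    embedding: Dict[str, Any],
    base_url: str = BASE_URL,
) -> Request:
    """Build, but do not send, a POST carrying VectorIndexCreateSchema fields."""
    body = {
        "name": name,
        "description": description,
        "collection_id": collection_id,
        "embedding": embedding,  # must follow EmbeddingConfigSchema
    }
    return Request(
        f"{base_url}/rag/indices/",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The empty dict is only a placeholder for a real embedding configuration.
req = create_index_request("handbook-index", "Index over the handbook", "abc123", {})
```
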
**Description**: Lists all vector indices.

**URL**: `/rag/indices/`

**Method**: GET

**Response Schema**:
```python
List[VectorIndexResponseSchema]
```

**Description**: Gets details of a specific vector index.

**URL**: `/rag/indices/{index_id}/`

**Method**: GET

**Parameters**:
```python
index_id: str
```

**Response Schema**:
```python
class VectorIndexResponseSchema:
    ...  # same fields as in the create-index response above
```

**Description**: Deletes a vector index and its associated data.

**URL**: `/rag/indices/{index_id}/`

**Method**: DELETE

**Parameters**:
```python
index_id: str
```

**Response**: 200 OK with message

**Description**: Gets the processing progress of a vector index.

**URL**: `/rag/indices/{index_id}/progress/`

**Method**: GET

**Parameters**:
```python
index_id: str
```

**Response Schema**:
```python
class ProcessingProgressSchema:
    ...  # same fields as in the collection progress response above
```

**Description**: Retrieves relevant chunks from a vector index based on a query.

**URL**: `/rag/indices/{index_id}/retrieve/`

**Method**: POST

**Parameters**:
```python
index_id: str
```

**Request Payload**:
```python
class RetrievalRequestSchema:
    query: str
    top_k: Optional[int] = 5
    score_threshold: Optional[float] = None
    semantic_weight: Optional[float] = 1.0
    keyword_weight: Optional[float] = 0.0
```

**Response Schema**:
```python
class RetrievalResponseSchema:
    results: List[RetrievalResultSchema]
    total_results: int
```

Where `RetrievalResultSchema` contains:
```python
class RetrievalResultSchema:
    text: str
    score: float
    metadata: ChunkMetadataSchema
```
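
To make the request defaults and the response shape concrete, here is a minimal client-side sketch. The helper names `build_retrieval_payload` and `texts_above` are hypothetical, the sample response is made up, and the filter assumes that a higher `score` means a more relevant chunk, which this section does not state explicitly.

```python
from typing import Any, Dict, List, Optional


def build_retrieval_payload(
    query: str,
    top_k: int = 5,
    score_threshold: Optional[float] = None,
    semantic_weight: float = 1.0,
    keyword_weight: float = 0.0,
) -> Dict[str, Any]:
    """Mirror RetrievalRequestSchema, including its documented defaults."""
    return {
        "query": query,
        "top_k": top_k,
        "score_threshold": score_threshold,
        "semantic_weight": semantic_weight,
        "keyword_weight": keyword_weight,
    }


def texts_above(response: Dict[str, Any], min_score: float) -> List[str]:
    """Return result texts whose score is at least `min_score`."""
    return [r["text"] for r in response["results"] if r["score"] >= min_score]


payload = build_retrieval_payload("How do I reset my password?", top_k=3)

# Made-up response shaped like RetrievalResponseSchema; `metadata` is left
# empty because ChunkMetadataSchema is not spelled out in this section.
sample_response = {
    "results": [
        {"text": "Open Settings and choose Reset.", "score": 0.91, "metadata": {}},
        {"text": "Contact support for billing.", "score": 0.42, "metadata": {}},
    ],
    "total_results": 2,
}
relevant = texts_above(sample_response, min_score=0.5)
```

With the defaults shown, `semantic_weight=1.0` and `keyword_weight=0.0` describe a purely semantic search; shifting weight toward `keyword_weight` would presumably blend in keyword matching.
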