Comprehensive update: Modal.com and Nebius AI integration documentation
- Add detailed explanation of Modal.com purpose: distributed serverless computing for heavy AI workloads
- Document Nebius AI role: advanced language intelligence and embedding generation
- Include specific Modal endpoints and their functions (OCR, FAISS, batch processing)
- Add integrated workflow architecture showing how both services work together
- Update API reference with Modal integration endpoints
- Include performance metrics for both platforms with realistic response times
- Add failover strategies and graceful degradation capabilities
- Include live Modal app links for testing and documentation
- Document resource allocation (2-4GB memory, CPU scaling for Modal functions)
- Add comprehensive service architecture explanation with clear separation of concerns
|
@@ -83,10 +83,15 @@ KnowledgeBridge demonstrates sophisticated AI agent orchestration through multi-
|
|
| 83 |
- **Helmet.js** for security headers
|
| 84 |
|
| 85 |
### **AI & Processing**
|
| 86 |
-
- **
|
| 87 |
-
- **
|
| 88 |
-
- **
|
| 89 |
-
- **
|
| 90 |
|
| 91 |
## Quick Start
|
| 92 |
|
|
@@ -101,7 +106,7 @@ NEBIUS_API_KEY=your_nebius_api_key_here
|
|
| 101 |
# Modal Configuration (Optional - for advanced processing)
|
| 102 |
MODAL_TOKEN_ID=your_modal_token_id
|
| 103 |
MODAL_TOKEN_SECRET=your_modal_token_secret
|
| 104 |
-
MODAL_BASE_URL=
|
| 105 |
|
| 106 |
# GitHub Configuration (Optional - for repository search)
|
| 107 |
GITHUB_TOKEN=your_github_token_here
|
|
@@ -183,25 +188,76 @@ POST /api/embeddings
|
|
| 183 |
}
|
| 184 |
```
|
| 185 |
|
| 186 |
### **Health Check**
|
| 187 |
```typescript
|
| 188 |
GET /api/health
|
| 189 |
-
// Returns comprehensive health status of all services
|
| 190 |
```
|
| 191 |
|
| 192 |
## Performance & Reliability
|
| 193 |
|
| 194 |
### **Response Times**
|
| 195 |
-
- Local search
|
| 196 |
-
-
|
| 197 |
-
-
|
| 198 |
-
- Embedding generation: ~500ms-1s per request
|
| 199 |
|
| 200 |
### **Scalability Features**
|
| 201 |
-
- Rate limiting prevents API abuse
|
| 202 |
-
-
|
| 203 |
-
-
|
| 204 |
-
-
|
| 205 |
|
| 206 |
### **Error Handling**
|
| 207 |
- React Error Boundaries prevent UI crashes
|
|
@@ -260,11 +316,61 @@ npm run build
|
|
| 260 |
|
| 261 |
## Architecture Highlights
|
| 262 |
|
| 263 |
-
### **AI Integration**
|
| 264 |
-
|
| 265 |
-
|
| 266 |
-
|
| 267 |
-
|
| 268 |
|
| 269 |
### **Data Flow**
|
| 270 |
1. User query → AI query enhancement (optional)
|
|
@@ -321,10 +427,20 @@ MIT License - see [LICENSE](LICENSE) file for details.
|
|
| 321 |
|
| 322 |
## Related Resources
|
| 323 |
|
| 324 |
-
|
| 325 |
-
- [
|
| 326 |
- [React Query Documentation](https://tanstack.com/query/latest)
|
| 327 |
- [Radix UI Components](https://www.radix-ui.com/)
|
| 328 |
|
| 329 |
---
|
| 330 |
|
|
|
|
| 83 |
- **Helmet.js** for security headers
|
| 84 |
|
| 85 |
### **AI & Processing**
|
| 86 |
+
- **Nebius AI Platform** - Advanced LLM and embedding capabilities
|
| 87 |
+
- **DeepSeek-R1-0528** for chat completions and document analysis
|
| 88 |
+
- **BAAI/bge-en-icl** for embedding generation (1536 dimensions)
|
| 89 |
+
- **Query Enhancement** and intelligent content analysis
|
| 90 |
+
- **Modal.com Integration** - Distributed serverless computing
|
| 91 |
+
- **Heavy compute workloads** (OCR, vector indexing)
|
| 92 |
+
- **FAISS vector search** for high-performance similarity matching
|
| 93 |
+
- **Scalable document processing** with 2-4GB memory allocation
|
| 94 |
+
- **Smart Ingestion Service** for coordinated AI pipeline processing
|
| 95 |
|
| 96 |
## Quick Start
|
| 97 |
|
|
|
|
| 106 |
# Modal Configuration (Optional - for advanced processing)
|
| 107 |
MODAL_TOKEN_ID=your_modal_token_id
|
| 108 |
MODAL_TOKEN_SECRET=your_modal_token_secret
|
| 109 |
+
MODAL_BASE_URL=https://fazeelusmani18--knowledgebridge-main-fastapi-app.modal.run
|
| 110 |
|
| 111 |
# GitHub Configuration (Optional - for repository search)
|
| 112 |
GITHUB_TOKEN=your_github_token_here
|
|
|
|
| 188 |
}
|
| 189 |
```
|
| 190 |
|
| 191 |
+
### **Modal Integration Endpoints**
|
| 192 |
+
```typescript
|
| 193 |
+
POST /api/modal/vector-search
|
| 194 |
+
{
|
| 195 |
+
query: string;
|
| 196 |
+
index_name?: string;
|
| 197 |
+
max_results?: number;
|
| 198 |
+
}
|
| 199 |
+
|
| 200 |
+
POST /api/modal/extract-text
|
| 201 |
+
{
|
| 202 |
+
documents: Array<{
|
| 203 |
+
id: string;
|
| 204 |
+
content: string; // base64 for PDFs/images
|
| 205 |
+
contentType: string;
|
| 206 |
+
}>;
|
| 207 |
+
}
|
| 208 |
+
|
| 209 |
+
POST /api/modal/build-index
|
| 210 |
+
{
|
| 211 |
+
documents: Array<{
|
| 212 |
+
id: string;
|
| 213 |
+
content: string;
|
| 214 |
+
title?: string;
|
| 215 |
+
source?: string;
|
| 216 |
+
}>;
|
| 217 |
+
index_name?: string;
|
| 218 |
+
}
|
| 219 |
+
|
| 220 |
+
POST /api/modal/batch-process
|
| 221 |
+
{
|
| 222 |
+
documents: DocumentArray;
|
| 223 |
+
operations: ["extract_text", "build_index"];
|
| 224 |
+
index_name?: string;
|
| 225 |
+
}
|
| 226 |
+
```
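The vector-search request body listed above can be built and validated before it is sent. This is a minimal sketch, not the project's code: the `VectorSearchRequest` interface name and the `buildVectorSearchRequest` helper are hypothetical, and the default of 10 results is an assumption rather than a documented value.

```typescript
// Hypothetical type name; field names follow the request body listed above.
interface VectorSearchRequest {
  query: string;
  index_name?: string;
  max_results?: number;
}

// Builds a validated body for POST /api/modal/vector-search.
// The default of 10 results is an assumption, not a documented value.
function buildVectorSearchRequest(
  query: string,
  options: { indexName?: string; maxResults?: number } = {}
): VectorSearchRequest {
  const trimmed = query.trim();
  if (trimmed.length === 0) {
    throw new Error("query must not be empty");
  }
  const maxResults = options.maxResults ?? 10;
  if (!Number.isInteger(maxResults) || maxResults < 1) {
    throw new Error("max_results must be a positive integer");
  }
  const body: VectorSearchRequest = { query: trimmed, max_results: maxResults };
  if (options.indexName !== undefined) {
    body.index_name = options.indexName;
  }
  return body;
}
```

Rejecting empty queries on the client side avoids paying for a serverless cold start on a request that cannot return useful results.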
|
| 227 |
+
|
| 228 |
### **Health Check**
|
| 229 |
```typescript
|
| 230 |
GET /api/health
|
| 231 |
+
// Returns comprehensive health status of all services including:
|
| 232 |
+
// - Nebius AI (embeddings, chat completions)
|
| 233 |
+
// - Modal.com (API connectivity, function availability)
|
| 234 |
+
// - External APIs (GitHub, Wikipedia, ArXiv)
|
| 235 |
```
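The diff does not show the health response format, so the following is only a sketch of how per-service checks might be folded into one overall status. The `HealthReport` shape, the `summarizeHealth` helper, and the `healthy`/`degraded`/`unhealthy` labels are all assumptions.

```typescript
type ServiceState = "up" | "down";

// Hypothetical response shape; the README does not specify the actual fields.
interface HealthReport {
  status: "healthy" | "degraded" | "unhealthy";
  services: Record<string, ServiceState>;
}

// Combines per-service states into one overall status:
// all up -> healthy, none up (or no services) -> unhealthy, otherwise degraded.
function summarizeHealth(services: Record<string, ServiceState>): HealthReport {
  const states = Object.keys(services).map((name) => services[name]);
  const upCount = states.filter((state) => state === "up").length;
  let status: HealthReport["status"];
  if (states.length > 0 && upCount === states.length) {
    status = "healthy";
  } else if (upCount === 0) {
    status = "unhealthy";
  } else {
    status = "degraded";
  }
  return { status, services };
}
```

A `degraded` status fits the graceful-degradation design: the application can keep serving local search while Modal or Nebius is down.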
|
| 236 |
|
| 237 |
## Performance & Reliability
|
| 238 |
|
| 239 |
### **Response Times**
|
| 240 |
+
- **Local search**: <100ms for semantic queries
|
| 241 |
+
- **Nebius AI operations**:
|
| 242 |
+
- Document analysis: ~3-5 seconds depending on content length
|
| 243 |
+
- Embedding generation: ~500ms-1s per request
|
| 244 |
+
- Query enhancement: ~1-2 seconds
|
| 245 |
+
- **Modal.com operations**:
|
| 246 |
+
- Vector search: ~2-4 seconds (including cold start)
|
| 247 |
+
- OCR text extraction: ~5-10 seconds per document
|
| 248 |
+
- FAISS index building: ~10-30 seconds depending on document count
|
| 249 |
+
- Batch processing: Scales with document volume (parallel execution)
|
| 250 |
+
- **External services**:
|
| 251 |
+
- URL validation: <2 seconds per URL with concurrent processing
|
| 252 |
|
| 253 |
### **Scalability Features**
|
| 254 |
+
- **Rate limiting** prevents API abuse across all endpoints
|
| 255 |
+
- **Modal.com serverless scaling**: Automatic resource allocation (2-4GB memory, 2+ CPU cores)
|
| 256 |
+
- **Concurrent processing**: Parallel URL validation and document processing
|
| 257 |
+
- **Intelligent caching**: Repeated queries cached for improved performance
|
| 258 |
+
- **Distributed storage**: Modal volumes for persistent vector indices
|
| 259 |
+
- **Graceful degradation**: Falls back to local processing when cloud services are unavailable
|
| 260 |
+
- **Load balancing**: Distributes workload between Nebius AI and Modal compute resources
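The caching of repeated queries mentioned in the list above could look roughly like the sketch below. It is an illustration under assumptions: the `QueryCache` class, its time-to-live policy, and the query normalization (trim plus lowercase) are hypothetical and not taken from the project.

```typescript
// Hypothetical in-memory cache keyed by normalized query text.
class QueryCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  private key(query: string): string {
    return query.trim().toLowerCase();
  }

  get(query: string): T | undefined {
    const k = this.key(query);
    const entry = this.entries.get(k);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(k); // drop the expired entry
      return undefined;
    }
    return entry.value;
  }

  set(query: string, value: T): void {
    this.entries.set(this.key(query), {
      value,
      expiresAt: this.now() + this.ttlMs,
    });
  }
}
```

A short time-to-live keeps results from going stale after an index rebuild while still absorbing bursts of identical queries.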
|
| 261 |
|
| 262 |
### **Error Handling**
|
| 263 |
- React Error Boundaries prevent UI crashes
|
|
|
|
| 316 |
|
| 317 |
## Architecture Highlights
|
| 318 |
|
| 319 |
+
### **AI Integration & Service Architecture**
|
| 320 |
+
|
| 321 |
+
#### **🧠 Nebius AI Platform** - Advanced Language Intelligence
|
| 322 |
+
**Purpose**: Primary AI service for language understanding and content analysis
|
| 323 |
+
|
| 324 |
+
**Core Functions**:
|
| 325 |
+
- **LLM Operations**: DeepSeek-R1-0528 model for chat completions and document analysis
|
| 326 |
+
- **Embedding Generation**: BAAI/bge-en-icl model producing 1536-dimensional vectors
|
| 327 |
+
- **Query Enhancement**: AI-powered search query improvement and intent recognition
|
| 328 |
+
- **Document Analysis**: Automated summary, classification, key points extraction, and quality scoring
|
| 329 |
+
- **Research Synthesis**: Intelligent combination of multiple sources into coherent insights
|
| 330 |
+
- **Content Classification**: Automatic categorization (academic, technical, code, general)
|
| 331 |
+
|
| 332 |
+
**Integration Points**:
|
| 333 |
+
- Direct API integration for real-time analysis
|
| 334 |
+
- Fallback mechanisms with mock embeddings for reliability
|
| 335 |
+
- Health monitoring and service availability checks
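The mock-embedding fallback mentioned above can be made deterministic so that the same text always maps to the same vector. The sketch below is one possible approach, not the project's implementation: the `mockEmbedding` function is hypothetical, the hash is an FNV-1a-style variant over UTF-16 code units, and the default of 1536 dimensions simply mirrors the figure stated in this README.

```typescript
// Deterministic stand-in for a real embedding: same text -> same vector.
// It carries no semantic meaning; it only keeps downstream code working.
function mockEmbedding(text: string, dimensions: number = 1536): number[] {
  // FNV-1a-style 32-bit hash of the text seeds a small pseudo-random generator.
  let seed = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    seed ^= text.charCodeAt(i);
    seed = Math.imul(seed, 0x01000193) >>> 0;
  }
  const vector: number[] = [];
  let sumSquares = 0;
  for (let i = 0; i < dimensions; i++) {
    // Linear congruential step, kept in unsigned 32-bit range.
    seed = (Math.imul(seed, 1664525) + 1013904223) >>> 0;
    const value = seed / 0xffffffff - 0.5; // roughly uniform in [-0.5, 0.5]
    vector.push(value);
    sumSquares += value * value;
  }
  // Normalize to unit length so cosine and dot-product scores agree.
  const norm = Math.sqrt(sumSquares) || 1;
  return vector.map((v) => v / norm);
}
```

Because these vectors are pseudo-random, similarity scores computed from them are meaningless; results produced in this mode are best flagged as degraded.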
|
| 336 |
+
|
| 337 |
+
#### **⚡ Modal.com Platform** - Distributed Serverless Computing
|
| 338 |
+
**Purpose**: Heavy computational workloads and scalable AI processing
|
| 339 |
+
|
| 340 |
+
**Core Functions**:
|
| 341 |
+
- **Document Processing**: OCR text extraction from PDFs and images using PyPDF2 and Tesseract
|
| 342 |
+
- **Vector Operations**: High-performance FAISS index building and similarity search
|
| 343 |
+
- **Batch Processing**: Concurrent document processing with configurable memory (2-4GB) and CPU allocation
|
| 344 |
+
- **Persistent Storage**: Modal volumes for storing vector indices and metadata across sessions
|
| 345 |
+
- **Scalable APIs**: FastAPI endpoints for distributed compute tasks
|
| 346 |
+
|
| 347 |
+
**Available Endpoints**:
|
| 348 |
+
- `/vector-search` - High-performance semantic similarity search
|
| 349 |
+
- `/extract-text` - OCR and PDF text extraction
|
| 350 |
+
- `/build-index` - FAISS vector index creation and management
|
| 351 |
+
- `/batch-process` - Bulk document processing with configurable operations
|
| 352 |
+
- `/health` - Service monitoring and status verification
|
| 353 |
+
|
| 354 |
+
**Deployed Instance**: [https://fazeelusmani18--knowledgebridge-main-fastapi-app.modal.run](https://fazeelusmani18--knowledgebridge-main-fastapi-app.modal.run)
|
| 355 |
+
|
| 356 |
+
#### **Integrated Workflow Architecture**
|
| 357 |
+
|
| 358 |
+
**Document Ingestion Pipeline**:
|
| 359 |
+
1. **Modal Processing**: OCR/PDF extraction → Text preprocessing
|
| 360 |
+
2. **Nebius Analysis** (Parallel): Classification → Summary → Quality assessment
|
| 361 |
+
3. **Vector Processing**: Nebius embeddings → Modal FAISS indexing
|
| 362 |
+
4. **Storage**: Local database + distributed index storage
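The four pipeline steps above can be sketched as one orchestration function with the services injected as plain functions. Everything named here (`IngestionStages`, `IngestedDocument`, `ingestDocument`) is hypothetical; the real Smart Ingestion Service may order or batch the calls differently.

```typescript
// Hypothetical stage signatures; the real services would sit behind these.
interface IngestionStages {
  extractText(raw: string): Promise<string>; // Modal: OCR/PDF extraction
  analyze(text: string): Promise<{ summary: string }>; // Nebius: analysis
  embed(text: string): Promise<number[]>; // Nebius: embeddings
  index(id: string, vector: number[]): Promise<void>; // Modal: FAISS indexing
}

interface IngestedDocument {
  id: string;
  text: string;
  summary: string;
  dimensions: number;
}

async function ingestDocument(
  id: string,
  raw: string,
  stages: IngestionStages
): Promise<IngestedDocument> {
  // Step 1: extraction, followed by minimal preprocessing.
  const text = (await stages.extractText(raw)).trim();
  // Steps 2 and 3: analysis and embedding depend only on the text,
  // so they can run concurrently.
  const [analysis, vector] = await Promise.all([
    stages.analyze(text),
    stages.embed(text),
  ]);
  await stages.index(id, vector);
  // Step 4: the returned record is what a caller would persist locally.
  return { id, text, summary: analysis.summary, dimensions: vector.length };
}
```

Injecting the stages keeps the orchestration testable without network access and makes it simple to swap in fallbacks when a service is unavailable.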
|
| 363 |
+
|
| 364 |
+
**Enhanced Search Workflow**:
|
| 365 |
+
1. **Query Enhancement**: Nebius AI improves search queries
|
| 366 |
+
2. **Parallel Search**: Modal vector search + Local database + External sources
|
| 367 |
+
3. **AI Ranking**: Nebius scores and ranks results by relevance
|
| 368 |
+
4. **Synthesis**: Generate comprehensive insights from combined results
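The parallel search step above can be illustrated with a small fan-out and merge routine. This is a sketch under assumptions: `SearchHit`, `SearchSource`, and `parallelSearch` are hypothetical names, and sorting by a numeric `score` stands in for the Nebius ranking step, whose actual logic is not shown in the README.

```typescript
interface SearchHit {
  id: string;
  score: number;
  source: string;
}

type SearchSource = (query: string) => Promise<SearchHit[]>;

// Queries every source concurrently; a source that rejects contributes
// no hits instead of failing the whole search.
async function parallelSearch(
  query: string,
  sources: SearchSource[],
  maxResults: number = 10
): Promise<SearchHit[]> {
  const perSource = await Promise.all(
    sources.map((search) => search(query).catch(() => [] as SearchHit[]))
  );
  // Keep only the best-scoring hit for each document id.
  const best = new Map<string, SearchHit>();
  for (const hits of perSource) {
    for (const hit of hits) {
      const current = best.get(hit.id);
      if (current === undefined || hit.score > current.score) {
        best.set(hit.id, hit);
      }
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

Catching errors per source means a slow or failing external API reduces coverage rather than breaking the whole query.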
|
| 369 |
+
|
| 370 |
+
**Failover Strategy**:
|
| 371 |
+
- **Modal Unavailable**: Falls back to local search and basic processing
|
| 372 |
+
- **Nebius Unavailable**: Uses mock embeddings and simplified text analysis
|
| 373 |
+
- **Graceful Degradation**: Maintains core functionality with reduced AI capabilities
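The failover behavior described above amounts to trying the cloud service first and using a local path when it fails. A minimal sketch, assuming a hypothetical `withFallback` helper and `FallbackResult` shape (neither appears in the README):

```typescript
interface FallbackResult<T> {
  value: T;
  degraded: boolean; // true when the fallback produced the value
}

// Tries the primary (cloud) operation first and uses the fallback if it
// rejects. Errors thrown by the fallback itself still propagate.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>
): Promise<FallbackResult<T>> {
  try {
    return { value: await primary(), degraded: false };
  } catch (_error) {
    return { value: await fallback(), degraded: true };
  }
}
```

Returning a `degraded` flag lets the caller tell the user that results came from the reduced-capability path, for example local search instead of Modal vector search.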
|
| 374 |
|
| 375 |
### **Data Flow**
|
| 376 |
1. User query → AI query enhancement (optional)
|
|
|
|
| 427 |
|
| 428 |
## Related Resources
|
| 429 |
|
| 430 |
+
### **AI Services**
|
| 431 |
+
- [Nebius AI Documentation](https://docs.nebius.ai/) - Advanced language models and embeddings
|
| 432 |
+
- [Modal Documentation](https://modal.com/docs) - Serverless computing platform
|
| 433 |
+
- **Live Modal App**: [https://fazeelusmani18--knowledgebridge-main-fastapi-app.modal.run](https://fazeelusmani18--knowledgebridge-main-fastapi-app.modal.run)
|
| 434 |
+
- **Modal API Docs**: [https://fazeelusmani18--knowledgebridge-main-fastapi-app.modal.run/docs](https://fazeelusmani18--knowledgebridge-main-fastapi-app.modal.run/docs)
|
| 435 |
+
|
| 436 |
+
### **Frontend Technologies**
|
| 437 |
- [React Query Documentation](https://tanstack.com/query/latest)
|
| 438 |
- [Radix UI Components](https://www.radix-ui.com/)
|
| 439 |
+
- [Tailwind CSS](https://tailwindcss.com/)
|
| 440 |
+
|
| 441 |
+
### **AI Models**
|
| 442 |
+
- [DeepSeek Models](https://platform.deepseek.com/) - Advanced reasoning capabilities
|
| 443 |
+
- [BAAI/bge-en-icl](https://huggingface.co/BAAI/bge-en-icl) - Embedding model for semantic search
|
| 444 |
|
| 445 |
---
|
| 446 |
|