Spaces: MHamdan / SPARKNET

MHamdan committed on Jan 26

Commit 4718630 (1 parent: c1a790c)

the update

Files changed (9)

.streamlit/secrets.toml.example +11 -0
DEPLOYMENT.md +375 -0
backend/__init__.py +2 -0
backend/api.py +720 -0
backend/requirements.txt +54 -0
demo/backend_client.py +315 -0
demo/rag_config.py +158 -7
demo/state_manager.py +27 -4
docs_connection.md +183 -0

.streamlit/secrets.toml.example CHANGED

@@ -14,6 +14,17 @@
 # Single user mode
 password = "your-secure-password"
+# ============================================================================
+# Backend Server (Optional - for GPU processing)
+# ============================================================================
+# If you have a GPU server (e.g., Lytos), configure the backend URL here.
+# The backend provides GPU-accelerated OCR, embeddings, and RAG processing.
+# See DEPLOYMENT.md for setup instructions.
+# BACKEND_URL = "https://your-gpu-server.com:8000"
+# Or for local testing:
+# BACKEND_URL = "http://localhost:8000"
 # Multi-user mode (uncomment to use):
 # [auth.users]
 # admin = "admin-password-here"
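
Review note: the new `BACKEND_URL` setting is optional, so the frontend needs a rule for when it counts as configured. The sketch below is one possible resolution order, not the commit's implementation. The helper name `resolve_backend_url`, the precedence of the environment variable over the secrets mapping, and the scheme check are all assumptions made for illustration. It takes plain mappings so it can be exercised without Streamlit, on the assumption that `st.secrets` can be passed in as the first argument.

```python
import os
from typing import Mapping, Optional


def resolve_backend_url(
    secrets: Mapping[str, object],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the configured backend URL, or None when no backend is set.

    Hypothetical helper: the environment variable wins over the secrets
    mapping, and a trailing slash is dropped so that paths such as
    "/api/health" can be appended directly.
    """
    raw = environ.get("BACKEND_URL") or secrets.get("BACKEND_URL")
    url = str(raw or "").strip().rstrip("/")
    if not url:
        return None  # no backend configured: caller falls back to demo mode
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"BACKEND_URL must start with http:// or https://, got {url!r}")
    return url


# Local-testing value from the example file, with a stray trailing slash
print(resolve_backend_url({"BACKEND_URL": "http://localhost:8000/"}, environ={}))
```

Returning `None` rather than raising when the key is absent keeps the demo-only deployment working with an unchanged secrets file.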

DEPLOYMENT.md ADDED

	@@ -0,0 +1,375 @@

+# SPARKNET Deployment Guide
+## Architecture Overview
+SPARKNET supports a hybrid deployment architecture:
+```
+┌─────────────────────────────┐         ┌─────────────────────────────┐
+│     Streamlit Cloud         │         │    GPU Server (Lytos)       │
+│     (Frontend/UI)           │  HTTPS  │    FastAPI Backend          │
+│                             │ ◄─────► │                             │
+│  sparknet.streamlit.app     │   API   │  - PaddleOCR (GPU)          │
+│                             │         │  - Document Processing       │
+│  - User Interface           │         │  - RAG + Embeddings          │
+│  - Authentication           │         │  - Ollama LLM                │
+│  - Cloud LLM fallback       │         │  - ChromaDB Vector Store     │
+└─────────────────────────────┘         └─────────────────────────────┘
+```
+## Deployment Options
+### Option 1: Full Stack on GPU Server (Recommended for Production)
+Run both frontend and backend on Lytos with GPU acceleration.
+### Option 2: Hybrid (Streamlit Cloud + GPU Backend)
+- **Frontend**: Streamlit Cloud (free hosting, easy sharing)
+- **Backend**: Lytos GPU server (full processing power)
+### Option 3: Streamlit Cloud Only (Demo Mode)
+- Uses cloud LLM providers (Groq, Gemini, etc.)
+- Limited functionality (no OCR, no RAG indexing)
+---
+## Option 2: Hybrid Deployment (Recommended)
+### Step 1: Setup Backend on Lytos (GPU Server)
+#### 1.1 SSH into Lytos
+```bash
+ssh user@lytos.server.address
+```
+#### 1.2 Clone the repository
+```bash
+git clone https://github.com/your-repo/sparknet.git
+cd sparknet
+```
+#### 1.3 Create virtual environment
+```bash
+python -m venv venv
+source venv/bin/activate
+```
+#### 1.4 Install backend dependencies
+```bash
+pip install -r backend/requirements.txt
+```
+#### 1.5 Install Ollama (for LLM inference)
+```bash
+curl -fsSL https://ollama.com/install.sh | sh
+# Pull required models
+ollama pull llama3.2:latest
+ollama pull nomic-embed-text
+```
+#### 1.6 Start the backend server
+```bash
+# Development mode
+cd backend
+uvicorn api:app --host 0.0.0.0 --port 8000 --reload
+# Production mode (with multiple workers)
+uvicorn api:app --host 0.0.0.0 --port 8000 --workers 4
+```
+#### 1.7 (Optional) Run with systemd for auto-restart
+```bash
+sudo nano /etc/systemd/system/sparknet-backend.service
+```
+Add:
+```ini
+[Unit]
+Description=SPARKNET Backend API
+After=network.target
+[Service]
+Type=simple
+User=your-user
+WorkingDirectory=/path/to/sparknet/backend
+Environment=PATH=/path/to/sparknet/venv/bin
+ExecStart=/path/to/sparknet/venv/bin/uvicorn api:app --host 0.0.0.0 --port 8000 --workers 4
+Restart=always
+RestartSec=10
+[Install]
+WantedBy=multi-user.target
+```
+Enable and start:
+```bash
+sudo systemctl enable sparknet-backend
+sudo systemctl start sparknet-backend
+```
+#### 1.8 Configure firewall (allow port 8000)
+```bash
+sudo ufw allow 8000/tcp
+```
+#### 1.9 (Optional) Setup HTTPS with nginx
+```bash
+sudo apt install nginx certbot python3-certbot-nginx
+sudo nano /etc/nginx/sites-available/sparknet
+```
+Add:
+```nginx
+server {
+    listen 80;
+    server_name api.sparknet.yourdomain.com;
+    location / {
+        proxy_pass http://127.0.0.1:8000;
+        proxy_http_version 1.1;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection 'upgrade';
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_cache_bypass $http_upgrade;
+        proxy_read_timeout 300s;
+        proxy_connect_timeout 75s;
+    }
+}
+```
+Enable and get SSL:
+```bash
+sudo ln -s /etc/nginx/sites-available/sparknet /etc/nginx/sites-enabled/
+sudo certbot --nginx -d api.sparknet.yourdomain.com
+sudo systemctl restart nginx
+```
+### Step 2: Configure Streamlit Cloud
+#### 2.1 Update Streamlit secrets
+In Streamlit Cloud dashboard → Settings → Secrets, add:
+```toml
+# Backend URL (your Lytos server)
+BACKEND_URL = "https://api.sparknet.yourdomain.com"
+# Or without HTTPS:
+# BACKEND_URL = "http://lytos-ip-address:8000"
+# Fallback cloud providers (optional, used if backend unavailable)
+GROQ_API_KEY = "your-groq-key"
+GOOGLE_API_KEY = "your-google-key"
+# Authentication (kept last: TOML assigns keys after a table header to that table)
+[auth]
+password = "SPARKNET@2026"
+```
+#### 2.2 Deploy to Streamlit Cloud
+Push your code and Streamlit Cloud will auto-deploy:
+```bash
+git add .
+git commit -m "Add backend support"
+git push origin main
+```
+### Step 3: Verify Deployment
+#### 3.1 Test backend directly
+```bash
+# Health check
+curl https://api.sparknet.yourdomain.com/api/health
+# System status
+curl https://api.sparknet.yourdomain.com/api/status
+```
+#### 3.2 Test from Streamlit
+Visit your Streamlit app and check:
+- Status bar should show "Backend" instead of "Demo Mode"
+- GPU indicator should appear
+- Document processing should use full pipeline
+---
+## Backend API Endpoints
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/api/health` | GET | Health check |
+| `/api/status` | GET | System status (Ollama, GPU, RAG) |
+| `/api/process` | POST | Process document (OCR, layout) |
+| `/api/index` | POST | Index document to RAG |
+| `/api/query` | POST | Query RAG system |
+| `/api/search` | POST | Search similar chunks |
+| `/api/documents` | GET | List indexed documents |
+| `/api/documents/{id}` | DELETE | Delete document |
+### API Documentation
+Once backend is running, visit:
+- Swagger UI: `http://lytos:8000/docs`
+- ReDoc: `http://lytos:8000/redoc`
+---
+## Environment Variables
+### Backend (Lytos)
+```bash
+# Optional: Configure Ollama host if not localhost
+export OLLAMA_HOST=http://localhost:11434
+# Optional: GPU device selection
+export CUDA_VISIBLE_DEVICES=0
+```
+### Frontend (Streamlit)
+Set in `secrets.toml` or Streamlit Cloud secrets:
+```toml
+# Required for hybrid mode
+BACKEND_URL = "https://your-backend-url"
+# Fallback cloud providers
+GROQ_API_KEY = "..."
+GOOGLE_API_KEY = "..."
+# Authentication (kept last: TOML assigns keys after a table header to that table)
+[auth]
+password = "your-password"
+```
+---
+## Troubleshooting
+### Backend not reachable
+1. Check if backend is running:
+   ```bash
+   curl http://localhost:8000/api/health
+   ```
+2. Check firewall:
+   ```bash
+   sudo ufw status
+   ```
+3. Check nginx logs:
+   ```bash
+   sudo tail -f /var/log/nginx/error.log
+   ```
+### GPU not detected
+1. Check CUDA:
+   ```bash
+   nvidia-smi
+   python -c "import torch; print(torch.cuda.is_available())"
+   ```
+2. Check PaddlePaddle GPU:
+   ```bash
+   python -c "import paddle; print(paddle.device.is_compiled_with_cuda())"
+   ```
+### Ollama not working
+1. Check Ollama status:
+   ```bash
+   ollama list
+   curl http://localhost:11434/api/tags
+   ```
+2. Restart Ollama:
+   ```bash
+   sudo systemctl restart ollama
+   ```
+### Document processing fails
+1. Check backend logs:
+   ```bash
+   journalctl -u sparknet-backend -f
+   ```
+2. Test processing directly:
+   ```bash
+   curl -X POST http://localhost:8000/api/process \
+     -F "file=@test.pdf" \
+     -F "ocr_engine=paddleocr"
+   ```
+---
+## Security Considerations
+### Production Checklist
+- [ ] Enable HTTPS for backend API
+- [ ] Configure CORS properly (restrict origins)
+- [ ] Use strong authentication password
+- [ ] Enable rate limiting
+- [ ] Set up monitoring and alerts
+- [ ] Configure backup for ChromaDB data
+- [ ] Review GDPR compliance for data handling
+### CORS Configuration
+In `backend/api.py`, update for production:
+```python
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["https://sparknet.streamlit.app"],  # Your Streamlit URL
+    allow_credentials=True,
+    allow_methods=["GET", "POST", "DELETE"],
+    allow_headers=["*"],
+)
+```
+---
+## Performance Tuning
+### Backend Workers
+Adjust based on CPU cores:
+```bash
+uvicorn api:app --workers $(nproc)
+```
+### GPU Memory
+For large documents, monitor GPU memory:
+```bash
+watch -n 1 nvidia-smi
+```
+### ChromaDB Optimization
+For large document collections:
+```python
+store_config = VectorStoreConfig(
+    persist_directory="data/sparknet_unified_rag",
+    collection_name="sparknet_documents",
+    similarity_threshold=0.0,
+    # Add indexing options for better performance
+)
+```
+---
+## Contact & Support
+- **Project**: VISTA/Horizon EU
+- **Framework**: SPARKNET - Strategic Patent Acceleration & Research Kinetics NETwork
+- **Issues**: https://github.com/your-repo/sparknet/issues

backend/__init__.py ADDED

	@@ -0,0 +1,2 @@


+# SPARKNET Backend API
+# GPU-accelerated document processing service

backend/api.py ADDED

	@@ -0,0 +1,720 @@

+"""
+SPARKNET Backend API - GPU-Accelerated Document Processing
+This FastAPI service runs on a GPU server (e.g., Lytos) and provides:
+- Document processing with PaddleOCR
+- Layout detection
+- RAG indexing and querying
+- Embedding generation
+- LLM inference via Ollama
+Deploy this on your GPU server and connect Streamlit Cloud to it.
+"""
+from fastapi import FastAPI, HTTPException, UploadFile, File, Form, BackgroundTasks
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel, Field
+from typing import Optional, List, Dict, Any
+import hashlib
+import tempfile
+import os
+import sys
+from pathlib import Path
+from datetime import datetime
+import asyncio
+# Add project root to path
+PROJECT_ROOT = Path(__file__).parent.parent
+sys.path.insert(0, str(PROJECT_ROOT))
+app = FastAPI(
+    title="SPARKNET Backend API",
+    description="GPU-accelerated document processing for Technology Transfer Office automation",
+    version="1.0.0",
+    docs_url="/docs",
+    redoc_url="/redoc",
+)
+# CORS - Allow Streamlit Cloud to connect
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],  # Configure specific origins in production
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+# ============================================================================
+# Pydantic Models
+# ============================================================================
+class HealthResponse(BaseModel):
+    status: str
+    timestamp: str
+    version: str = "1.0.0"
+class SystemStatus(BaseModel):
+    ollama_available: bool
+    ollama_models: List[str] = []
+    gpu_available: bool = False
+    gpu_name: Optional[str] = None
+    rag_ready: bool = False
+    indexed_chunks: int = 0
+    embedding_model: Optional[str] = None
+    llm_model: Optional[str] = None
+class ProcessRequest(BaseModel):
+    filename: str
+    options: Dict[str, Any] = Field(default_factory=dict)
+class ProcessResponse(BaseModel):
+    success: bool
+    doc_id: str
+    filename: str
+    raw_text: str = ""
+    chunks: List[Dict[str, Any]] = []
+    page_count: int = 0
+    ocr_regions: List[Dict[str, Any]] = []
+    layout_regions: List[Dict[str, Any]] = []
+    ocr_confidence: float = 0.0
+    layout_confidence: float = 0.0
+    processing_time: float = 0.0
+    error: Optional[str] = None
+class IndexRequest(BaseModel):
+    doc_id: str
+    text: str
+    chunks: List[Dict[str, Any]] = []
+    metadata: Dict[str, Any] = Field(default_factory=dict)
+class IndexResponse(BaseModel):
+    success: bool
+    doc_id: str
+    num_chunks: int = 0
+    error: Optional[str] = None
+class QueryRequest(BaseModel):
+    question: str
+    filters: Optional[Dict[str, Any]] = None
+    top_k: int = 5
+class QueryResponse(BaseModel):
+    success: bool
+    answer: str = ""
+    sources: List[Dict[str, Any]] = []
+    confidence: float = 0.0
+    latency_ms: float = 0.0
+    validated: bool = False
+    error: Optional[str] = None
+class SearchRequest(BaseModel):
+    query: str
+    top_k: int = 5
+    doc_filter: Optional[str] = None
+class DocumentInfo(BaseModel):
+    doc_id: str
+    filename: str = ""
+    chunk_count: int = 0
+    indexed_at: Optional[str] = None
+# ============================================================================
+# Global State
+# ============================================================================
+_rag_system = None
+_processing_queue = {}
+def get_rag_system():
+    """Initialize and return the RAG system."""
+    global _rag_system
+    if _rag_system is not None:
+        return _rag_system
+    try:
+        from src.rag.agentic import AgenticRAG, RAGConfig
+        from src.rag.store import get_vector_store, VectorStoreConfig, reset_vector_store
+        from src.rag.embeddings import get_embedding_adapter, EmbeddingConfig, reset_embedding_adapter
+        # Check Ollama
+        ollama_ok, models = check_ollama_sync()
+        if not ollama_ok:
+            return None
+        # Select models
+        EMBEDDING_MODELS = ["nomic-embed-text", "mxbai-embed-large:latest", "mxbai-embed-large"]
+        LLM_MODELS = ["llama3.2:latest", "llama3.1:8b", "mistral:latest", "qwen2.5:14b"]
+        embed_model = next((m for m in EMBEDDING_MODELS if m in models), EMBEDDING_MODELS[0])
+        llm_model = next((m for m in LLM_MODELS if m in models), LLM_MODELS[0])
+        # Reset singletons
+        reset_vector_store()
+        reset_embedding_adapter()
+        # Initialize embedding adapter
+        embed_config = EmbeddingConfig(
+            ollama_model=embed_model,
+            ollama_base_url="http://localhost:11434",
+        )
+        embedder = get_embedding_adapter(config=embed_config)
+        # Initialize vector store
+        store_config = VectorStoreConfig(
+            persist_directory="data/sparknet_unified_rag",
+            collection_name="sparknet_documents",
+            similarity_threshold=0.0,
+        )
+        store = get_vector_store(config=store_config)
+        # Initialize RAG config
+        rag_config = RAGConfig(
+            model=llm_model,
+            base_url="http://localhost:11434",
+            max_revision_attempts=1,
+            enable_query_planning=True,
+            enable_reranking=True,
+            enable_validation=True,
+            retrieval_top_k=10,
+            final_top_k=5,
+            min_confidence=0.3,
+            verbose=False,
+        )
+        # Initialize RAG system
+        rag = AgenticRAG(
+            config=rag_config,
+            vector_store=store,
+            embedding_adapter=embedder,
+        )
+        _rag_system = {
+            "rag": rag,
+            "store": store,
+            "embedder": embedder,
+            "embed_model": embed_model,
+            "llm_model": llm_model,
+        }
+        return _rag_system
+    except Exception as e:
+        print(f"RAG init error: {e}")
+        return None
+def check_ollama_sync():
+    """Check Ollama availability synchronously."""
+    try:
+        import httpx
+        with httpx.Client(timeout=3.0) as client:
+            resp = client.get("http://localhost:11434/api/tags")
+            if resp.status_code == 200:
+                models = [m["name"] for m in resp.json().get("models", [])]
+                return True, models
+    except Exception:
+        # Any import, connection, or parsing failure means Ollama is unavailable
+        pass
+    return False, []
+def check_gpu():
+    """Check GPU availability."""
+    try:
+        import torch
+        if torch.cuda.is_available():
+            return True, torch.cuda.get_device_name(0)
+    except Exception:
+        # torch missing or CUDA query failed: report no GPU
+        pass
+    return False, None
+# ============================================================================
+# API Endpoints
+# ============================================================================
+@app.get("/", response_model=HealthResponse)
+async def root():
+    """Health check endpoint."""
+    return HealthResponse(
+        status="healthy",
+        timestamp=datetime.now().isoformat(),
+    )
+@app.get("/api/health", response_model=HealthResponse)
+async def health():
+    """Health check endpoint."""
+    return HealthResponse(
+        status="healthy",
+        timestamp=datetime.now().isoformat(),
+    )
+@app.get("/api/status", response_model=SystemStatus)
+async def get_status():
+    """Get system status including Ollama, GPU, and RAG availability."""
+    ollama_ok, models = check_ollama_sync()
+    gpu_ok, gpu_name = check_gpu()
+    rag = get_rag_system()
+    rag_ready = rag is not None
+    indexed_chunks = 0
+    embed_model = None
+    llm_model = None
+    if rag:
+        try:
+            indexed_chunks = rag["store"].count()
+            embed_model = rag.get("embed_model")
+            llm_model = rag.get("llm_model")
+        except Exception:
+            # Keep the default values if the store cannot be queried
+            pass
+    return SystemStatus(
+        ollama_available=ollama_ok,
+        ollama_models=models,
+        gpu_available=gpu_ok,
+        gpu_name=gpu_name,
+        rag_ready=rag_ready,
+        indexed_chunks=indexed_chunks,
+        embedding_model=embed_model,
+        llm_model=llm_model,
+    )
+@app.post("/api/process", response_model=ProcessResponse)
+async def process_document(
+    file: UploadFile = File(...),
+    ocr_engine: str = Form(default="paddleocr"),
+    max_pages: int = Form(default=10),
+    enable_layout: bool = Form(default=True),
+    preserve_tables: bool = Form(default=True),
+):
+    """
+    Process a document with OCR and layout detection.
+    This endpoint uses GPU-accelerated PaddleOCR for text extraction.
+    """
+    import time
+    start_time = time.time()
+    # Read file
+    file_bytes = await file.read()
+    filename = file.filename or "upload"  # filename can be None for unnamed uploads
+    # Generate doc ID
+    content_hash = hashlib.md5(file_bytes[:1000]).hexdigest()[:8]
+    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
+    doc_id = hashlib.md5(f"{filename}_{timestamp}_{content_hash}".encode()).hexdigest()[:12]
+    # Save to temp file
+    suffix = Path(filename).suffix
+    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
+        tmp.write(file_bytes)
+        tmp_path = tmp.name
+    try:
+        # Try full document processing pipeline
+        try:
+            from src.document.pipeline.processor import DocumentProcessor, PipelineConfig
+            from src.document.ocr import OCRConfig
+            from src.document.layout import LayoutConfig
+            from src.document.chunking.chunker import ChunkerConfig
+            chunker_config = ChunkerConfig(
+                preserve_table_structure=preserve_tables,
+                detect_table_headers=True,
+                chunk_tables=True,
+                chunk_figures=True,
+                include_captions=True,
+            )
+            layout_config = LayoutConfig(
+                method="rule_based",
+                detect_tables=True,
+                detect_figures=True,
+                detect_headers=True,
+                detect_titles=True,
+                detect_lists=True,
+                min_confidence=0.3,
+                heading_font_ratio=1.1,
+            )
+            config = PipelineConfig(
+                ocr=OCRConfig(engine=ocr_engine),
+                layout=layout_config,
+                chunking=chunker_config,
+                max_pages=max_pages,
+                include_ocr_regions=True,
+                include_layout_regions=enable_layout,
+                generate_full_text=True,
+            )
+            processor = DocumentProcessor(config)
+            processor.initialize()
+            result = processor.process(tmp_path)
+            # Convert to response format
+            chunks_list = []
+            for chunk in result.chunks:
+                chunks_list.append({
+                    "chunk_id": chunk.chunk_id,
+                    "text": chunk.text,
+                    "page": chunk.page,
+                    "chunk_type": chunk.chunk_type.value,
+                    "confidence": chunk.confidence,
+                    "bbox": chunk.bbox.to_xyxy() if chunk.bbox else None,
+                })
+            ocr_regions = []
+            for region in result.ocr_regions:
+                ocr_regions.append({
+                    "text": region.text,
+                    "confidence": region.confidence,
+                    "page": region.page,
+                    "bbox": region.bbox.to_xyxy() if region.bbox else None,
+                })
+            layout_regions = []
+            for region in result.layout_regions:
+                layout_regions.append({
+                    "id": region.id,
+                    "type": region.type.value,
+                    "confidence": region.confidence,
+                    "page": region.page,
+                    "bbox": region.bbox.to_xyxy() if region.bbox else None,
+                })
+            processing_time = time.time() - start_time
+            return ProcessResponse(
+                success=True,
+                doc_id=doc_id,
+                filename=filename,
+                raw_text=result.full_text,
+                chunks=chunks_list,
+                page_count=result.metadata.num_pages,
+                ocr_regions=ocr_regions,
+                layout_regions=layout_regions,
+                ocr_confidence=result.metadata.ocr_confidence_avg or 0.0,
+                layout_confidence=result.metadata.layout_confidence_avg or 0.0,
+                processing_time=processing_time,
+            )
+        except Exception as e:
+            # Fallback to simple extraction
+            return await process_document_fallback(file_bytes, filename, doc_id, max_pages, str(e), start_time)
+    finally:
+        # Cleanup
+        if os.path.exists(tmp_path):
+            os.unlink(tmp_path)
+async def process_document_fallback(
+    file_bytes: bytes,
+    filename: str,
+    doc_id: str,
+    max_pages: int,
+    reason: str,
+    start_time: float
+) -> ProcessResponse:
+    """Fallback document processing using PyMuPDF."""
+    import time
+    text = ""
+    page_count = 1
+    suffix = Path(filename).suffix.lower()
+    if suffix == ".pdf":
+        try:
+            import fitz
+            import io
+            pdf_stream = io.BytesIO(file_bytes)
+            doc = fitz.open(stream=pdf_stream, filetype="pdf")
+            page_count = len(doc)
+            max_p = min(max_pages, page_count)
+            text_parts = []
+            for page_num in range(max_p):
+                page = doc[page_num]
+                text_parts.append(f"--- Page {page_num + 1} ---\n{page.get_text()}")
+            text = "\n\n".join(text_parts)
+            doc.close()
+        except Exception as e:
+            text = f"PDF extraction failed: {e}"
+    elif suffix in [".txt", ".md"]:
+        try:
+            text = file_bytes.decode("utf-8")
+        except UnicodeDecodeError:
+            text = file_bytes.decode("latin-1", errors="ignore")
+    else:
+        text = f"Unsupported file type: {suffix}"
+    # Simple chunking
+    chunk_size = 500
+    overlap = 50
+    chunks = []
+    for i in range(0, len(text), chunk_size - overlap):
+        chunk_text = text[i:i + chunk_size]
+        if len(chunk_text.strip()) > 20:
+            chunks.append({
+                "chunk_id": f"{doc_id}_chunk_{len(chunks)}",
+                "text": chunk_text,
+                "page": 0,
+                "chunk_type": "text",
+                "confidence": 0.9,
+                "bbox": None,
+            })
+    processing_time = time.time() - start_time
+    return ProcessResponse(
+        success=True,
+        doc_id=doc_id,
+        filename=filename,
+        raw_text=text,
+        chunks=chunks,
+        page_count=page_count,
+        ocr_regions=[],
+        layout_regions=[],
+        ocr_confidence=0.9,
+        layout_confidence=0.0,
+        processing_time=processing_time,
+        error=f"Fallback mode: {reason}",
+    )
+@app.post("/api/index", response_model=IndexResponse)
+async def index_document(request: IndexRequest):
+    """Index a document into the RAG vector store."""
+    rag = get_rag_system()
+    if not rag:
+        return IndexResponse(
+            success=False,
+            doc_id=request.doc_id,
+            error="RAG system not available. Check Ollama status.",
+        )
+    try:
+        store = rag["store"]
+        embedder = rag["embedder"]
+        chunk_dicts = []
+        embeddings = []
+        for i, chunk in enumerate(request.chunks):
+            chunk_text = chunk.get("text", "") if isinstance(chunk, dict) else str(chunk)
+            if len(chunk_text.strip()) < 20:
+                continue
+            chunk_id = chunk.get("chunk_id", f"{request.doc_id}_chunk_{i}")
+            chunk_dict = {
+                "chunk_id": chunk_id,
+                "document_id": request.doc_id,
+                "text": chunk_text,
+                "page": chunk.get("page", 0) if isinstance(chunk, dict) else 0,
+                "chunk_type": "text",
+                "source_path": request.metadata.get("filename", ""),
+                "sequence_index": i,
+            }
+            chunk_dicts.append(chunk_dict)
+            embedding = embedder.embed_text(chunk_text)
+            embeddings.append(embedding)
+        if not chunk_dicts:
+            return IndexResponse(
+                success=False,
+                doc_id=request.doc_id,
+                error="No valid chunks to index",
+            )
+        store.add_chunks(chunk_dicts, embeddings)
+        return IndexResponse(
+            success=True,
+            doc_id=request.doc_id,
+            num_chunks=len(chunk_dicts),
+        )
+    except Exception as e:
+        return IndexResponse(
+            success=False,
+            doc_id=request.doc_id,
+            error=str(e),
+        )
+@app.post("/api/query", response_model=QueryResponse)
+async def query_rag(request: QueryRequest):
+    """Query the RAG system."""
+    import time
+    start_time = time.time()
+    rag = get_rag_system()
+    if not rag:
+        return QueryResponse(
+            success=False,
+            error="RAG system not available. Check Ollama status.",
+        )
+    try:
+        response = rag["rag"].query(request.question, filters=request.filters)
+        latency_ms = (time.time() - start_time) * 1000
+        sources = []
+        if hasattr(response, 'citations') and response.citations:
+            for cite in response.citations:
+                sources.append({
+                    "index": cite.index if hasattr(cite, 'index') else 0,
+                    "text_snippet": cite.text_snippet if hasattr(cite, 'text_snippet') else str(cite),
+                    "relevance_score": cite.relevance_score if hasattr(cite, 'relevance_score') else 0.0,
+                    "document_id": cite.document_id if hasattr(cite, 'document_id') else "",
+                    "page": cite.page if hasattr(cite, 'page') else 0,
+                })
+        return QueryResponse(
+            success=True,
+            answer=response.answer,
+            sources=sources,
+            confidence=response.confidence,
+            latency_ms=latency_ms,
+            validated=response.validated,
+        )
+    except Exception as e:
+        return QueryResponse(
+            success=False,
+            error=str(e),
+        )
+@app.post("/api/search")
+async def search_similar(request: SearchRequest):
+    """Search for similar chunks."""
+    rag = get_rag_system()
+    if not rag:
+        return {"success": False, "error": "RAG system not available", "results": []}
+    try:
+        embedder = rag["embedder"]
+        store = rag["store"]
+        query_embedding = embedder.embed_text(request.query)
+        filters = None
+        if request.doc_filter:
+            filters = {"document_id": request.doc_filter}
+        results = store.search(
+            query_embedding=query_embedding,
+            top_k=request.top_k,
+            filters=filters,
+        )
+        return {
+            "success": True,
+            "results": [
+                {
+                    "chunk_id": r.chunk_id,
+                    "document_id": r.document_id,
+                    "text": r.text,
+                    "similarity": r.similarity,
+                    "page": r.page,
+                    "metadata": r.metadata,
+                }
+                for r in results
+            ]
+        }
+    except Exception as e:
+        return {"success": False, "error": str(e), "results": []}
+@app.get("/api/documents", response_model=List[DocumentInfo])
+async def list_documents():
+    """List all indexed documents."""
+    rag = get_rag_system()
+    if not rag:
+        return []
+    try:
+        store = rag["store"]
+        collection = store._collection
+        results = collection.get(include=["metadatas"])
+        if not results or not results.get("metadatas"):
+            return []
+        doc_info = {}
+        for meta in results["metadatas"]:
+            doc_id = meta.get("document_id", "unknown")
+            if doc_id not in doc_info:
+                doc_info[doc_id] = {
+                    "doc_id": doc_id,
+                    "filename": meta.get("source_path", ""),
+                    "chunk_count": 0,
+                }
+            doc_info[doc_id]["chunk_count"] += 1
+        return [DocumentInfo(**info) for info in doc_info.values()]
+    except Exception as e:
+        return []
+@app.delete("/api/documents/{doc_id}")
+async def delete_document(doc_id: str):
+    """Delete a document from the index."""
+    rag = get_rag_system()
+    if not rag:
+        return {"success": False, "error": "RAG system not available"}
+    try:
+        store = rag["store"]
+        collection = store._collection
+        # Get chunk IDs for this document
+        results = collection.get(
+            where={"document_id": doc_id},
+            include=[]
+        )
+        if results and results.get("ids"):
+            collection.delete(ids=results["ids"])
+            return {"success": True, "deleted_chunks": len(results["ids"])}
+        return {"success": False, "error": "Document not found"}
+    except Exception as e:
+        return {"success": False, "error": str(e)}
+# ============================================================================
+# Run Server
+# ============================================================================
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=8000)
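
Review note: the fallback path in `process_document_fallback` splits text with a fixed-size sliding window (500 characters, 50 overlapping, pieces of 20 or fewer non-blank characters dropped). The standalone sketch below isolates that windowing so the overlap arithmetic is easy to check. The function name `chunk_text`, the `min_chars` parameter, and the guard on `overlap` are additions for illustration and are not part of the commit.

```python
from typing import List


def chunk_text(
    text: str,
    chunk_size: int = 500,
    overlap: int = 50,
    min_chars: int = 20,
) -> List[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        # A step of zero or less would make range() fail or never advance
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # each window starts `step` characters after the last
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if len(piece.strip()) > min_chars:  # skip near-empty windows
            chunks.append(piece)
    return chunks


sample = "".join(str(i % 10) for i in range(1000))
pieces = chunk_text(sample)
print([len(p) for p in pieces])
```

With the default values the window advances 450 characters at a time, so the last 50 characters of one chunk reappear at the start of the next.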

backend/requirements.txt ADDED

	@@ -0,0 +1,54 @@

+# SPARKNET Backend Requirements
+# For GPU server (Lytos) deployment
+# ==============================================================================
+# API Framework
+# ==============================================================================
+fastapi>=0.104.0
+uvicorn[standard]>=0.24.0
+python-multipart>=0.0.6
+# ==============================================================================
+# Document Processing (GPU-accelerated)
+# ==============================================================================
+paddleocr>=2.7.0
+paddlepaddle-gpu>=2.5.0  # Use paddlepaddle for CPU-only
+# ==============================================================================
+# PDF Processing
+# ==============================================================================
+pymupdf>=1.23.0
+# ==============================================================================
+# Vector Store & Embeddings
+# ==============================================================================
+chromadb>=0.4.0
+sentence-transformers>=2.2.0
+# ==============================================================================
+# LangChain & LLM
+# ==============================================================================
+langchain>=0.1.0
+langchain-community>=0.0.20
+langchain-ollama>=0.0.1
+ollama>=0.1.0
+# ==============================================================================
+# Data Handling
+# ==============================================================================
+pydantic>=2.0.0
+pydantic-settings>=2.0.0
+numpy>=1.24.0
+httpx>=0.25.0
+# ==============================================================================
+# ML/Deep Learning
+# ==============================================================================
+torch>=2.0.0
+torchvision>=0.15.0
+# ==============================================================================
+# Utilities
+# ==============================================================================
+loguru>=0.7.0
+python-dotenv>=1.0.0

demo/backend_client.py ADDED

	@@ -0,0 +1,315 @@

+"""
+SPARKNET Backend Client
+Client for connecting Streamlit Cloud to the GPU backend server (Lytos).
+Handles all API communication with the FastAPI backend.
+"""
+import httpx
+import streamlit as st
+from typing import Optional, Dict, Any, List, Tuple
+from dataclasses import dataclass
+import os
+def get_backend_url() -> Optional[str]:
+    """Get backend URL from secrets or environment."""
+    # Try Streamlit secrets first
+    try:
+        if hasattr(st, 'secrets'):
+            if "BACKEND_URL" in st.secrets:
+                return st.secrets["BACKEND_URL"]
+            if "backend" in st.secrets and "url" in st.secrets["backend"]:
+                return st.secrets["backend"]["url"]
+    except Exception:
+        pass
+    # Fall back to environment
+    return os.environ.get("SPARKNET_BACKEND_URL")
+def is_backend_configured() -> bool:
+    """Check if backend is configured."""
+    return get_backend_url() is not None
+@dataclass
+class BackendResponse:
+    """Generic backend response wrapper."""
+    success: bool
+    data: Dict[str, Any]
+    error: Optional[str] = None
+class BackendClient:
+    """
+    Client for SPARKNET Backend API.
+    Provides methods to:
+    - Check backend health and status
+    - Process documents (OCR, layout detection)
+    - Index documents to RAG
+    - Query RAG system
+    - Search similar chunks
+    """
+    def __init__(self, base_url: Optional[str] = None, timeout: float = 120.0):
+        self.base_url = base_url or get_backend_url()
+        self.timeout = timeout
+        self._client = None
+    @property
+    def is_configured(self) -> bool:
+        return self.base_url is not None
+    def _get_client(self) -> httpx.Client:
+        if self._client is None:
+            self._client = httpx.Client(
+                base_url=self.base_url,
+                timeout=self.timeout,
+            )
+        return self._client
+    def close(self):
+        if self._client:
+            self._client.close()
+            self._client = None
+    def health_check(self) -> BackendResponse:
+        """Check if backend is healthy."""
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            resp = client.get("/api/health")
+            resp.raise_for_status()
+            return BackendResponse(True, resp.json())
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+    def get_status(self) -> BackendResponse:
+        """Get backend system status."""
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            resp = client.get("/api/status")
+            resp.raise_for_status()
+            return BackendResponse(True, resp.json())
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+    def process_document(
+        self,
+        file_bytes: bytes,
+        filename: str,
+        ocr_engine: str = "paddleocr",
+        max_pages: int = 10,
+        enable_layout: bool = True,
+        preserve_tables: bool = True,
+    ) -> BackendResponse:
+        """
+        Process a document using the backend.
+        Args:
+            file_bytes: Document content as bytes
+            filename: Original filename
+            ocr_engine: OCR engine to use (paddleocr, tesseract)
+            max_pages: Maximum pages to process
+            enable_layout: Enable layout detection
+            preserve_tables: Preserve table structure
+        Returns:
+            BackendResponse with processing results
+        """
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            files = {"file": (filename, file_bytes)}
+            data = {
+                "ocr_engine": ocr_engine,
+                "max_pages": str(max_pages),
+                "enable_layout": str(enable_layout).lower(),
+                "preserve_tables": str(preserve_tables).lower(),
+            }
+            resp = client.post("/api/process", files=files, data=data)
+            resp.raise_for_status()
+            return BackendResponse(True, resp.json())
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+    def index_document(
+        self,
+        doc_id: str,
+        text: str,
+        chunks: List[Dict[str, Any]],
+        metadata: Optional[Dict[str, Any]] = None,
+    ) -> BackendResponse:
+        """
+        Index a document into the RAG system.
+        Args:
+            doc_id: Document identifier
+            text: Full document text
+            chunks: List of chunk dictionaries
+            metadata: Optional metadata
+        Returns:
+            BackendResponse with indexing results
+        """
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            payload = {
+                "doc_id": doc_id,
+                "text": text,
+                "chunks": chunks,
+                "metadata": metadata or {},
+            }
+            resp = client.post("/api/index", json=payload)
+            resp.raise_for_status()
+            return BackendResponse(True, resp.json())
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+    def query(
+        self,
+        question: str,
+        filters: Optional[Dict[str, Any]] = None,
+        top_k: int = 5,
+    ) -> BackendResponse:
+        """
+        Query the RAG system.
+        Args:
+            question: Query question
+            filters: Optional filters (e.g., document_id)
+            top_k: Number of results
+        Returns:
+            BackendResponse with answer and sources
+        """
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            payload = {
+                "question": question,
+                "filters": filters,
+                "top_k": top_k,
+            }
+            resp = client.post("/api/query", json=payload)
+            resp.raise_for_status()
+            return BackendResponse(True, resp.json())
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+    def search_similar(
+        self,
+        query: str,
+        top_k: int = 5,
+        doc_filter: Optional[str] = None,
+    ) -> BackendResponse:
+        """
+        Search for similar chunks.
+        Args:
+            query: Search query
+            top_k: Number of results
+            doc_filter: Optional document ID filter
+        Returns:
+            BackendResponse with similar chunks
+        """
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            payload = {
+                "query": query,
+                "top_k": top_k,
+                "doc_filter": doc_filter,
+            }
+            resp = client.post("/api/search", json=payload)
+            resp.raise_for_status()
+            return BackendResponse(True, resp.json())
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+    def list_documents(self) -> BackendResponse:
+        """List all indexed documents."""
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            resp = client.get("/api/documents")
+            resp.raise_for_status()
+            return BackendResponse(True, {"documents": resp.json()})
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+    def delete_document(self, doc_id: str) -> BackendResponse:
+        """Delete a document from the index."""
+        if not self.is_configured:
+            return BackendResponse(False, {}, "Backend URL not configured")
+        try:
+            client = self._get_client()
+            resp = client.delete(f"/api/documents/{doc_id}")
+            resp.raise_for_status()
+            return BackendResponse(True, resp.json())
+        except Exception as e:
+            return BackendResponse(False, {}, str(e))
+# Global client instance
+_backend_client: Optional[BackendClient] = None
+def get_backend_client() -> BackendClient:
+    """Get or create the backend client."""
+    global _backend_client
+    if _backend_client is None:
+        _backend_client = BackendClient()
+    return _backend_client
+def check_backend_available() -> Tuple[bool, Dict[str, Any]]:
+    """
+    Check if backend is available and return status.
+    Returns:
+        Tuple of (available, status_dict)
+    """
+    client = get_backend_client()
+    if not client.is_configured:
+        return False, {"error": "Backend URL not configured"}
+    # Health check
+    health = client.health_check()
+    if not health.success:
+        return False, {"error": f"Backend not reachable: {health.error}"}
+    # Get full status
+    status = client.get_status()
+    if not status.success:
+        return False, {"error": f"Failed to get status: {status.error}"}
+    return True, status.data
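
Review note: every `BackendClient` method above repeats the same three steps: check that a URL is configured, perform the request, and wrap either the JSON payload or the exception text in a `BackendResponse`. The sketch below shows one way that pattern could be factored into a single helper. It is an illustration under stated assumptions, not the committed code: `call_backend` and its `send` parameter are hypothetical names, and the HTTP call is replaced by a plain callable so the example runs without a network or `httpx`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class BackendResponse:
    """Generic backend response wrapper (same fields as in the diff)."""
    success: bool
    data: Dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None


def call_backend(
    base_url: Optional[str],
    send: Callable[[], Dict[str, Any]],
) -> BackendResponse:
    """Run `send` and convert its result or failure into a BackendResponse.

    `send` is a hypothetical stand-in for a closure such as
    `lambda: client.get("/api/health").json()`.
    """
    if not base_url:
        return BackendResponse(False, {}, "Backend URL not configured")
    try:
        return BackendResponse(True, send())
    except Exception as exc:  # any transport or decoding error becomes an error string
        return BackendResponse(False, {}, str(exc))


if __name__ == "__main__":
    print(call_backend("http://localhost:8000", lambda: {"status": "ok"}))
    print(call_backend(None, lambda: {"status": "ok"}))
```

With a helper like this, each endpoint method would shrink to building its payload and passing a closure, and the error handling would live in one place.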

demo/rag_config.py CHANGED

@@ -4,9 +4,10 @@ Unified RAG Configuration for SPARKNET Demo
 This module provides a single source of truth for RAG system configuration,
 ensuring all demo pages use the same vector store, embeddings, and models.
-Supports both:
-1. Local Ollama (for on-premise deployments)
-2. Cloud LLM providers (for Streamlit Cloud)
 """
 import streamlit as st
@@ -79,13 +80,30 @@ def check_cloud_providers():
     return providers
 @st.cache_resource
 def get_unified_rag_system():
     """
     Initialize and return the unified RAG system.
     This is cached at the Streamlit level so all pages share the same instance.
-    Supports both Ollama (local) and cloud providers (Streamlit Cloud).
     """
     # Check for required dependencies first
     try:
@@ -100,6 +118,25 @@ def get_unified_rag_system():
             "mode": "error",
         }
     # Check Ollama availability
     ollama_ok, available_models = check_ollama()
@@ -210,11 +247,23 @@ def get_store_stats():
     """Get current vector store statistics."""
     system = get_unified_rag_system()
     if system["mode"] == "cloud":
         return {
             "total_chunks": 0,
             "status": "cloud",
-            "message": "Cloud mode - indexing requires Ollama",
         }
     if system["status"] != "ready":
@@ -235,8 +284,33 @@ def index_document(text: str, document_id: str, metadata: dict = None) -> dict:
     """Index a document into the unified RAG system."""
     system = get_unified_rag_system()
     if system["mode"] == "cloud":
-        return {"success": False, "error": "Indexing requires Ollama", "num_chunks": 0}
     if system["status"] != "ready":
         return {"success": False, "error": system.get("error", "RAG not ready"), "num_chunks": 0}
@@ -256,6 +330,36 @@ def query_rag(question: str, filters: dict = None):
     """Query the unified RAG system."""
     system = get_unified_rag_system()
     if system["mode"] == "cloud":
         # Use cloud LLM for Q&A
         from llm_providers import generate_response
@@ -283,6 +387,27 @@ def clear_index():
 def get_indexed_documents() -> list:
     """Get list of indexed document IDs from vector store."""
     system = get_unified_rag_system()
     if system["status"] != "ready":
         return []
@@ -344,6 +469,19 @@ def get_chunks_for_document(document_id: str) -> list:
 def search_similar_chunks(query: str, top_k: int = 5, doc_filter: str = None):
     """Search for similar chunks with optional document filter."""
     system = get_unified_rag_system()
     if system["status"] != "ready":
         return []
@@ -430,8 +568,21 @@ def auto_index_processed_document(doc_id: str, text: str, chunks: list, metadata
     """
     system = get_unified_rag_system()
     if system["mode"] == "cloud":
-        return {"success": False, "error": "Indexing requires Ollama", "num_chunks": 0}
     if system["status"] != "ready":
         return {"success": False, "error": "RAG system not ready", "num_chunks": 0}

 This module provides a single source of truth for RAG system configuration,
 ensuring all demo pages use the same vector store, embeddings, and models.
+Supports three deployment modes:
+1. Backend API (GPU server like Lytos) - Full processing power
+2. Local Ollama (for on-premise deployments)
+3. Cloud LLM providers (for Streamlit Cloud without backend)
 """
 import streamlit as st
     return providers
+def check_backend():
+    """Check if backend API is available."""
+    try:
+        from backend_client import check_backend_available, get_backend_url
+        if get_backend_url():
+            available, status = check_backend_available()
+            return available, status
+    except Exception:
+        pass
+    return False, {}
 @st.cache_resource
 def get_unified_rag_system():
     """
     Initialize and return the unified RAG system.
     This is cached at the Streamlit level so all pages share the same instance.
+    Priority:
+    1. Backend API (GPU server) - if BACKEND_URL is configured
+    2. Local Ollama - if running locally
+    3. Cloud LLM providers - if API keys configured
+    4. Demo mode - no backend available
     """
     # Check for required dependencies first
     try:
             "mode": "error",
         }
+    # Check backend API first (GPU server)
+    backend_ok, backend_status = check_backend()
+    if backend_ok:
+        return {
+            "status": "ready",
+            "error": None,
+            "rag": None,  # Use backend API instead
+            "store": None,
+            "embedder": None,
+            "mode": "backend",
+            "backend_status": backend_status,
+            "ollama_available": backend_status.get("ollama_available", False),
+            "gpu_available": backend_status.get("gpu_available", False),
+            "gpu_name": backend_status.get("gpu_name"),
+            "embed_model": backend_status.get("embedding_model", "backend"),
+            "llm_model": backend_status.get("llm_model", "backend"),
+            "indexed_chunks": backend_status.get("indexed_chunks", 0),
+        }
     # Check Ollama availability
     ollama_ok, available_models = check_ollama()
     """Get current vector store statistics."""
     system = get_unified_rag_system()
+    # Use backend status if available
+    if system["mode"] == "backend":
+        return {
+            "total_chunks": system.get("indexed_chunks", 0),
+            "status": "ready",
+            "mode": "backend",
+            "embed_model": system.get("embed_model", "backend"),
+            "llm_model": system.get("llm_model", "backend"),
+            "gpu_available": system.get("gpu_available", False),
+            "gpu_name": system.get("gpu_name"),
+        }
     if system["mode"] == "cloud":
         return {
             "total_chunks": 0,
             "status": "cloud",
+            "message": "Cloud mode - indexing requires backend or Ollama",
         }
     if system["status"] != "ready":
     """Index a document into the unified RAG system."""
     system = get_unified_rag_system()
+    # Use backend API if available
+    if system["mode"] == "backend":
+        try:
+            from backend_client import get_backend_client
+            client = get_backend_client()
+            # Simple chunking for backend indexing
+            chunk_size = 500
+            overlap = 50
+            chunks = []
+            for i in range(0, len(text), chunk_size - overlap):
+                chunk_text = text[i:i + chunk_size]
+                if len(chunk_text.strip()) > 20:
+                    chunks.append({
+                        "chunk_id": f"{document_id}_chunk_{len(chunks)}",
+                        "text": chunk_text,
+                        "page": 0,
+                    })
+            result = client.index_document(document_id, text, chunks, metadata)
+            if result.success:
+                return {"success": True, "num_chunks": result.data.get("num_chunks", 0), "error": None}
+            else:
+                return {"success": False, "error": result.error, "num_chunks": 0}
+        except Exception as e:
+            return {"success": False, "error": str(e), "num_chunks": 0}
     if system["mode"] == "cloud":
+        return {"success": False, "error": "Indexing requires backend or Ollama", "num_chunks": 0}
     if system["status"] != "ready":
         return {"success": False, "error": system.get("error", "RAG not ready"), "num_chunks": 0}
     """Query the unified RAG system."""
     system = get_unified_rag_system()
+    # Use backend API if available
+    if system["mode"] == "backend":
+        try:
+            from backend_client import get_backend_client
+            client = get_backend_client()
+            result = client.query(question, filters=filters)
+            if result.success:
+                data = result.data
+                # Create a response object-like dict
+                return type('RAGResponse', (), {
+                    'answer': data.get('answer', ''),
+                    'citations': [
+                        type('Citation', (), {
+                            'index': s.get('index', i+1),
+                            'text_snippet': s.get('text_snippet', ''),
+                            'relevance_score': s.get('relevance_score', 0),
+                            'document_id': s.get('document_id', ''),
+                            'page': s.get('page', 0),
+                        })() for i, s in enumerate(data.get('sources', []))
+                    ],
+                    'confidence': data.get('confidence', 0),
+                    'latency_ms': data.get('latency_ms', 0),
+                    'num_sources': len(data.get('sources', [])),
+                    'validated': data.get('validated', False),
+                })(), None
+            else:
+                return None, result.error
+        except Exception as e:
+            return None, str(e)
     if system["mode"] == "cloud":
         # Use cloud LLM for Q&A
         from llm_providers import generate_response
 def get_indexed_documents() -> list:
     """Get list of indexed document IDs from vector store."""
     system = get_unified_rag_system()
+    # Use backend API if available
+    if system["mode"] == "backend":
+        try:
+            from backend_client import get_backend_client
+            client = get_backend_client()
+            result = client.list_documents()
+            if result.success:
+                docs = result.data.get("documents", [])
+                return [
+                    {
+                        "document_id": d.get("doc_id", d.get("document_id", "")),
+                        "source_path": d.get("filename", ""),
+                        "chunk_count": d.get("chunk_count", 0),
+                    }
+                    for d in docs
+                ]
+        except Exception:
+            pass
+        return []
     if system["status"] != "ready":
         return []
 def search_similar_chunks(query: str, top_k: int = 5, doc_filter: str = None):
     """Search for similar chunks with optional document filter."""
     system = get_unified_rag_system()
+    # Use backend API if available
+    if system["mode"] == "backend":
+        try:
+            from backend_client import get_backend_client
+            client = get_backend_client()
+            result = client.search_similar(query, top_k, doc_filter)
+            if result.success:
+                return result.data.get("results", [])
+        except Exception:
+            pass
+        return []
     if system["status"] != "ready":
         return []
     """
     system = get_unified_rag_system()
+    # Use backend API if available
+    if system["mode"] == "backend":
+        try:
+            from backend_client import get_backend_client
+            client = get_backend_client()
+            result = client.index_document(doc_id, text, chunks, metadata)
+            if result.success:
+                return {"success": True, "num_chunks": result.data.get("num_chunks", 0), "error": None}
+            else:
+                return {"success": False, "error": result.error, "num_chunks": 0}
+        except Exception as e:
+            return {"success": False, "error": str(e), "num_chunks": 0}
     if system["mode"] == "cloud":
+        return {"success": False, "error": "Indexing requires backend or Ollama", "num_chunks": 0}
     if system["status"] != "ready":
         return {"success": False, "error": "RAG system not ready", "num_chunks": 0}
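
Review note: the backend branch of `index_document` splits text into fixed-size windows of 500 characters that overlap by 50, and drops pieces whose stripped length is 20 characters or fewer. The standalone sketch below reproduces that loop so its behaviour at the boundaries is easier to check. The function name `chunk_text`, the `min_chars` parameter, and the `ValueError` guard are additions for illustration and are not part of the commit.

```python
from typing import Any, Dict, List


def chunk_text(
    document_id: str,
    text: str,
    chunk_size: int = 500,
    overlap: int = 50,
    min_chars: int = 20,
) -> List[Dict[str, Any]]:
    """Split text into overlapping character windows, skipping near-empty pieces."""
    if overlap >= chunk_size:
        # Guard added in this sketch: a non-positive step would make range() fail or loop badly.
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks: List[Dict[str, Any]] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if len(piece.strip()) > min_chars:
            chunks.append({
                # IDs count kept chunks only, so they stay consecutive.
                "chunk_id": f"{document_id}_chunk_{len(chunks)}",
                "text": piece,
                "page": 0,  # page information is not available at this stage
            })
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(1000))
    for chunk in chunk_text("doc1", sample):
        print(chunk["chunk_id"], len(chunk["text"]))
```

Because the split is purely character based, a window can begin or end in the middle of a word; the overlap reduces the chance that a sentence is lost entirely at a boundary but does not prevent it.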

demo/state_manager.py CHANGED

@@ -661,6 +661,8 @@ def render_global_status_bar():
         rag_status = rag_system["status"]
         rag_mode = rag_system.get("mode", "error")
         llm_model = rag_system.get("llm_model", "N/A")
     except:
         ollama_ok = False
         cloud_providers = {}
@@ -668,12 +670,19 @@ def render_global_status_bar():
         rag_mode = "error"
         llm_model = "N/A"
         models = []
     # Status bar
     cols = st.columns(6)
     with cols[0]:
-        if ollama_ok:
             st.success(f"Ollama ({len(models)})")
         elif cloud_providers:
             st.info(f"Cloud ({len(cloud_providers)})")
@@ -682,7 +691,10 @@ def render_global_status_bar():
     with cols[1]:
         if rag_status == "ready":
-            st.success("RAG Ready")
         elif rag_mode == "cloud":
             st.info("Cloud LLM")
         elif rag_mode == "demo":
@@ -691,7 +703,12 @@ def render_global_status_bar():
             st.error("RAG Error")
     with cols[2]:
-        if rag_mode == "cloud" and cloud_providers:
             provider_name = list(cloud_providers.keys())[0].title()
             st.info(f"{provider_name}")
         elif llm_model != "N/A":
@@ -703,7 +720,13 @@ def render_global_status_bar():
         st.info(f"{summary['total_documents']} Docs")
     with cols[4]:
-        if summary['indexed_documents'] > 0:
             st.success(f"{summary['total_indexed_chunks']} Chunks")
         else:
             st.warning("0 Chunks")

         rag_status = rag_system["status"]
         rag_mode = rag_system.get("mode", "error")
         llm_model = rag_system.get("llm_model", "N/A")
+        gpu_available = rag_system.get("gpu_available", False)
+        gpu_name = rag_system.get("gpu_name", "")
     except:
         ollama_ok = False
         cloud_providers = {}
         rag_mode = "error"
         llm_model = "N/A"
         models = []
+        gpu_available = False
+        gpu_name = ""
     # Status bar
     cols = st.columns(6)
     with cols[0]:
+        if rag_mode == "backend":
+            if gpu_available:
+                st.success("GPU Backend")
+            else:
+                st.success("Backend")
+        elif ollama_ok:
             st.success(f"Ollama ({len(models)})")
         elif cloud_providers:
             st.info(f"Cloud ({len(cloud_providers)})")
     with cols[1]:
         if rag_status == "ready":
+            if rag_mode == "backend":
+                st.success("RAG (Backend)")
+            else:
+                st.success("RAG Ready")
         elif rag_mode == "cloud":
             st.info("Cloud LLM")
         elif rag_mode == "demo":
             st.error("RAG Error")
     with cols[2]:
+        if rag_mode == "backend":
+            if gpu_name:
+                st.info(gpu_name[:12])
+            else:
+                st.info(f"{llm_model.split(':')[0] if llm_model else 'Backend'}")
+        elif rag_mode == "cloud" and cloud_providers:
             provider_name = list(cloud_providers.keys())[0].title()
             st.info(f"{provider_name}")
         elif llm_model != "N/A":
         st.info(f"{summary['total_documents']} Docs")
     with cols[4]:
+        if rag_mode == "backend":
+            indexed = rag_system.get("indexed_chunks", 0)
+            if indexed > 0:
+                st.success(f"{indexed} Chunks")
+            else:
+                st.info("0 Chunks")
+        elif summary['indexed_documents'] > 0:
             st.success(f"{summary['total_indexed_chunks']} Chunks")
         else:
             st.warning("0 Chunks")

docs_connection.md ADDED

	@@ -0,0 +1,183 @@

+# SPARKNET Deployment Architecture
+## Quick Answer
+**For Streamlit Cloud:** Push to **GitHub only** (`git push origin main`). The app rebuilds on push; reboot it from the dashboard if the change does not appear.
+**For Hugging Face Spaces:** Push to **Hugging Face only** (`git push hf main`).
+They are **independent deployments** - you choose which platform to use.
+---
+## Architecture Overview
+```
++------------------+          +-------------------+
+|    Your Code     |          |   Lytos Server    |
+|   (Local/Git)    |          |   172.24.50.21    |
++--------+---------+          +---------+---------+
+         |                              |
+         |                              | Backend API
+    +----+----+                         | (port 8000)
+    |         |                         |
+    v         v                         v
++-------+  +--------+           +---------------+
+|GitHub |  |Hugging |           | localtunnel   |
+|       |  |Face    |           | (public URL)  |
++---+---+  +---+----+           +-------+-------+
+    |          |                        |
+    |          |                        |
+    v          v                        |
++----------+  +-----------+             |
+|Streamlit |  |HF Spaces  |<------------+
+|Cloud     |  |           |   Backend calls
++----------+  +-----------+
+```
+---
+## Platform Comparison
+| Feature | Streamlit Cloud | Hugging Face Spaces |
+|---------|-----------------|---------------------|
+| **Source** | GitHub repo | HF repo (or GitHub) |
+| **Push command** | `git push origin main` | `git push hf main` |
+| **Auto-rebuild** | Yes (on push) | Yes (on push) |
+| **Secrets** | Dashboard > Settings > Secrets | Settings > Variables and secrets |
+| **Free tier** | Yes (limited resources) | Yes (limited resources) |
+| **Custom domain** | Premium only | Premium only |
+| **GPU support** | No | Yes (paid) |
+---
+## Your Current Setup
+### Git Remotes
+```bash
+origin  -> github.com:MHHamdan/SPARKNET.git     # For Streamlit Cloud
+hf      -> hf.co:spaces/mhamdan/SPARKNET.git   # For Hugging Face Spaces
+```
+### Deployment URLs
+- **Streamlit Cloud:** `https://mhhamdan-sparknet.streamlit.app`
+- **Hugging Face:** `https://huggingface.co/spaces/mhamdan/SPARKNET`
+### Backend (Lytos GPU Server)
+- **Internal:** `http://172.24.50.21:8000`
+- **Public (via tunnel):** `https://selfish-crab-86.loca.lt`
+---
+## How to Deploy
+### Option 1: Streamlit Cloud (Recommended)
+```bash
+# 1. Make changes locally
+# 2. Commit
+git add .
+git commit -m "Your message"
+# 3. Push to GitHub
+git push origin main
+# 4. Streamlit Cloud auto-rebuilds (or manually reboot in dashboard)
+```
+**Secrets location:** https://share.streamlit.io > Your App > Settings > Secrets
+### Option 2: Hugging Face Spaces
+```bash
+# 1. Make changes locally
+# 2. Commit
+git add .
+git commit -m "Your message"
+# 3. Push to Hugging Face
+git push hf main
+# 4. HF Spaces auto-rebuilds
+```
+**Secrets location:** https://huggingface.co/spaces/mhamdan/SPARKNET/settings
+---
+## Keeping Both in Sync
+If you want both platforms updated:
+```bash
+git push origin main && git push hf main
+```
+Or define a combined remote once, then push to both with a single command:
+```bash
+git remote add all git@github.com:MHHamdan/SPARKNET.git
+git remote set-url --add all git@hf.co:spaces/mhamdan/SPARKNET.git
+git push all main
+```
+---
+## Backend Connection Flow
+```
+User Browser
+     |
+     v
+Streamlit Cloud (frontend)
+     |
+     | HTTP requests to BACKEND_URL
+     v
+localtunnel (https://selfish-crab-86.loca.lt)
+     |
+     | tunnels to
+     v
+Lytos Server (172.24.50.21:8000)
+     |
+     | processes with
+     v
+PaddleOCR + Ollama + GPU
+```
+---
+## Required Secrets (Streamlit Cloud)
+```toml
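+# Top-level keys must come before any [table] header;
+# keys written after [auth] would be parsed as auth.* entries.
+BACKEND_URL = "https://selfish-crab-86.loca.lt"
+GROQ_API_KEY = "your-key"
+HF_TOKEN = "your-token"
+GOOGLE_API_KEY = "your-key"
+OPENROUTER_API_KEY = "your-key"
+MISTRAL_API_KEY = "your-key"
+[auth]
+password = "your-secure-password"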
+```
+---
+## Troubleshooting
+| Issue | Solution |
+|-------|----------|
+| Changes not appearing | Reboot app in Streamlit dashboard |
+| Backend connection failed | Check if localtunnel is running (`screen -r lt-tunnel`) |
+| Tunnel URL changed | Update `BACKEND_URL` in Streamlit secrets |
+| PaddleOCR warning | Normal on Streamlit Cloud - backend handles OCR |
+---
+## Screen Sessions on Lytos
+```bash
+screen -ls                    # List sessions
+screen -r sparknet-backend    # Attach to backend
+screen -r lt-tunnel           # Attach to tunnel
+screen -r ollama              # Attach to Ollama
+```