RAG-document-assistant / docs /architecture.md
vn6295337's picture
Initial commit: RAG Document Assistant with Zero-Storage Privacy
f866820

RAG Document Assistant - Architecture

Version: 2.0 Last Updated: January 2026 Focus: Zero-Storage Privacy Architecture


System Overview

A privacy-first RAG (Retrieval-Augmented Generation) system where no document text is ever stored on our servers. Documents are processed client-side, and text is re-fetched from the user's cloud storage at query time.

Key Characteristics

  • Zero-Storage: Document text never persists on servers
  • Client-Side Processing: Chunking happens in the browser
  • Query-Time Re-fetch: Text retrieved from user's Dropbox for each search
  • User Control: Disconnect cloud storage to revoke all access

Privacy Architecture

INDEXING (one-time setup)
══════════════════════════════════════════════════════════════════

  User's Browser                              Our Server
  ──────────────                              ──────────

  1. Connect Dropbox (OAuth)
           β”‚
           β–Ό
  2. Select files from Dropbox
           β”‚
           β–Ό
  3. Files loaded in browser
     (never sent to server)
           β”‚
           β–Ό
  4. Text chunked locally ───────────────► 5. Generate embeddings
     with position tracking                    (384-dim vectors)
           β”‚                                        β”‚
           β–Ό                                        β–Ό
  6. Original text                          7. Store in Pinecone:
     PURGED from memory                        - Embeddings (irreversible)
                                               - File paths
                                               - Chunk positions
                                               - NO TEXT

══════════════════════════════════════════════════════════════════

QUERY TIME (every search)
══════════════════════════════════════════════════════════════════

  User's Question                             Our Server
  ───────────────                             ──────────

  "What does the contract say?"
           β”‚
           β–Ό
  ─────────────────────────────────────► 1. Generate query embedding
                                              β”‚
                                              β–Ό
                                         2. Search Pinecone
                                            (find similar chunks)
                                              β”‚
                                              β–Ό
                                         3. Get file paths + positions
                                              β”‚
                                              β–Ό
                                         4. Re-fetch from USER'S Dropbox
                                            using their access token
                                              β”‚
                                              β–Ό
                                         5. Extract chunk text
                                            using stored positions
                                              β”‚
                                              β–Ό
                                         6. Send to LLM for answer
                                              β”‚
                                              β–Ό
  Answer + Citations ◄─────────────────  7. Return response
                                            (text never stored)

══════════════════════════════════════════════════════════════════

What Gets Stored

Data Stored? Where Reversible?
Document files No User's Dropbox only N/A
Document text No Never stored N/A
Embeddings Yes Pinecone No (one-way transform)
File paths Yes Pinecone metadata N/A
Chunk positions Yes Pinecone metadata N/A
User queries No Not logged N/A

Technology Stack

Frontend

  • Framework: React 18 + Vite
  • Styling: Tailwind CSS v4
  • Deployment: Vercel
  • Key Features:
    • Client-side text chunking
    • Dropbox OAuth integration
    • Position tracking for chunks

Backend

  • Framework: FastAPI
  • Deployment: HuggingFace Spaces (Docker)
  • Key Features:
    • Zero-storage embedding endpoint
    • Query-time Dropbox re-fetch
    • Multi-provider LLM cascade

Vector Database

  • Service: Pinecone Serverless
  • Index: rag-semantic-384
  • Dimensions: 384
  • Metric: Cosine similarity

Embeddings

  • Model: all-MiniLM-L6-v2 (sentence-transformers)
  • Dimensions: 384
  • Processing: Server-side (text discarded immediately)

LLM Providers (Cascade)

  1. Gemini 2.5 Flash (Primary)
  2. Groq - llama-3.1-8b-instant (Fallback 1)
  3. OpenRouter - Mistral 7B (Fallback 2)

Component Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         FRONTEND (React)                         β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”‚
β”‚  β”‚   Sidebar    β”‚  β”‚  QueryPanel  β”‚  β”‚   App.jsx    β”‚          β”‚
β”‚  β”‚              β”‚  β”‚              β”‚  β”‚              β”‚          β”‚
β”‚  β”‚ - CloudConnect  β”‚ - Search UI   β”‚ - State mgmt  β”‚          β”‚
β”‚  β”‚ - File select   β”‚ - Results     β”‚ - Token flow  β”‚          β”‚
β”‚  β”‚ - Index button  β”‚ - Citations   β”‚ - Privacy UI  β”‚          β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜          β”‚
β”‚                                                                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”‚
β”‚  β”‚                  API Layer                        β”‚           β”‚
β”‚  β”‚  chunker.js  β”‚  dropbox.js  β”‚  client.js         β”‚           β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜           β”‚
β”‚                              β”‚                                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                               β”‚ HTTPS
                               β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        BACKEND (FastAPI)                         β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”‚
β”‚  β”‚                   API Routes                        β”‚         β”‚
β”‚  β”‚                                                     β”‚         β”‚
β”‚  β”‚  POST /embed-chunks    - Generate embeddings        β”‚         β”‚
β”‚  β”‚  POST /query-secure    - Zero-storage query         β”‚         β”‚
β”‚  β”‚  POST /dropbox/token   - OAuth token exchange       β”‚         β”‚
β”‚  β”‚  POST /dropbox/folder  - List folder contents       β”‚         β”‚
β”‚  β”‚  POST /dropbox/file    - Download file content      β”‚         β”‚
β”‚  β”‚  DELETE /clear-index   - Clear Pinecone index       β”‚         β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β”‚
β”‚                              β”‚                                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                               β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚                β”‚                β”‚
              β–Ό                β–Ό                β–Ό
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚ Pinecone β”‚    β”‚ Dropbox  β”‚    β”‚   LLM    β”‚
        β”‚ (vectors)β”‚    β”‚  (files) β”‚    β”‚ Providersβ”‚
        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Data Flow: Indexing

# 1. User selects files in browser
files = [
    {id: "abc123", name: "contract.pdf", path: "/Documents/contract.pdf"}
]

# 2. Files fetched from Dropbox (via backend proxy)
content = await fetch("/api/dropbox/file", {path: file.path, access_token})

# 3. Text chunked CLIENT-SIDE with position tracking
chunks = chunkText(content, {chunkSize: 1000, overlap: 100})
# Result:
# {text: "...", startChar: 0, endChar: 1000}
# {text: "...", startChar: 900, endChar: 1900}

# 4. Chunks sent to backend for embedding
await fetch("/api/embed-chunks", {
    chunks: [{
        text: "...",  // Used for embedding only
        metadata: {
            filename: "contract.pdf",
            filePath: "/Documents/contract.pdf",
            fileId: "abc123",
            startChar: 0,
            endChar: 1000
        }
    }]
})

# 5. Backend generates embeddings, stores in Pinecone
# TEXT IS IMMEDIATELY DISCARDED
pinecone.upsert({
    id: "abc123::0",
    values: [0.123, -0.456, ...],  # 384-dim embedding
    metadata: {
        filename: "contract.pdf",
        file_path: "/Documents/contract.pdf",
        file_id: "abc123",
        start_char: 0,
        end_char: 1000
        # NO TEXT STORED
    }
})

Data Flow: Query

# 1. User submits query with access token
request = {
    query: "What is the payment term?",
    access_token: "user_dropbox_token"
}

# 2. Generate query embedding
query_embedding = sentence_transformer.encode(query)

# 3. Search Pinecone
results = pinecone.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)
# Returns: file paths + positions (NO TEXT)

# 4. Re-fetch files from USER'S Dropbox
for file_path in unique_file_paths:
    content = dropbox.download(file_path, access_token)

    # 5. Extract chunks using stored positions
    for chunk in chunks_from_file:
        text = content[chunk.start_char:chunk.end_char]

# 6. Build prompt with re-fetched text
prompt = f"""
Context:
1. {chunk1_text}
2. {chunk2_text}

Question: {query}
"""

# 7. Call LLM
answer = llm.generate(prompt)

# 8. Return answer (text never stored)
return {answer, citations}

Security & Privacy

User Control

  • OAuth Scopes: Read-only access to user-selected files
  • Token Storage: Access token stored only in browser session
  • Revocation: Disconnect Dropbox = immediate access revocation

Server Security

  • No Persistent Storage: Text never written to disk or database
  • Memory Only: Text exists in memory only during processing
  • Immediate Purge: Explicit deletion after embedding generation

Data Protection

  • Embeddings: One-way transformation, cannot reconstruct text
  • Positions: Only useful with original file access
  • File Paths: Dropbox paths, require valid access token

Deployment

Frontend (Vercel)

  • Automatic deploys from GitHub
  • Environment: VITE_API_URL pointing to backend

Backend (HuggingFace Spaces)

  • Docker-based deployment
  • Environment variables for API keys:
    • PINECONE_API_KEY
    • DROPBOX_APP_KEY
    • DROPBOX_APP_SECRET
    • GEMINI_API_KEY
    • GROQ_API_KEY

References