# RAG Document Assistant - Architecture

> **Version**: 2.0
> **Last Updated**: January 2026
> **Focus**: Zero-Storage Privacy Architecture

---

## System Overview

A privacy-first RAG (Retrieval-Augmented Generation) system where **no document text is ever stored on our servers**. Documents are processed client-side, and text is re-fetched from the user's cloud storage at query time.

### Key Characteristics

- **Zero-Storage**: Document text never persists on servers
- **Client-Side Processing**: Chunking happens in the browser
- **Query-Time Re-fetch**: Text retrieved from user's Dropbox for each search
- **User Control**: Disconnect cloud storage to revoke all access

---

## Privacy Architecture

```
INDEXING (one-time setup)
══════════════════════════════════════════════════════════════════

  User's Browser                              Our Server
  ──────────────                              ──────────

  1. Connect Dropbox (OAuth)
           │
           ▼
  2. Select files from Dropbox
           │
           ▼
  3. Files loaded in browser
     (never sent to server)
           │
           ▼
  4. Text chunked locally ───────────────► 5. Generate embeddings
     with position tracking                    (384-dim vectors)
           │                                        │
           ▼                                        ▼
  6. Original text                          7. Store in Pinecone:
     PURGED from memory                        - Embeddings (irreversible)
                                               - File paths
                                               - Chunk positions
                                               - NO TEXT

══════════════════════════════════════════════════════════════════

QUERY TIME (every search)
══════════════════════════════════════════════════════════════════

  User's Question                             Our Server
  ───────────────                             ──────────

  "What does the contract say?"
           │
           ▼
  ─────────────────────────────────────► 1. Generate query embedding
                                              │
                                              ▼
                                         2. Search Pinecone
                                            (find similar chunks)
                                              │
                                              ▼
                                         3. Get file paths + positions
                                              │
                                              ▼
                                         4. Re-fetch from USER'S Dropbox
                                            using their access token
                                              │
                                              ▼
                                         5. Extract chunk text
                                            using stored positions
                                              │
                                              ▼
                                         6. Send to LLM for answer
                                              │
                                              ▼
  Answer + Citations ◄─────────────────  7. Return response
                                            (text never stored)

══════════════════════════════════════════════════════════════════
```

---

## What Gets Stored

| Data | Stored? | Where | Reversible? |
|------|---------|-------|-------------|
| Document files | No | User's Dropbox only | N/A |
| Document text | No | Never stored | N/A |
| Embeddings | Yes | Pinecone | No (one-way transform) |
| File paths | Yes | Pinecone metadata | N/A |
| Chunk positions | Yes | Pinecone metadata | N/A |
| User queries | No | Not logged | N/A |

---

## Technology Stack

### Frontend
- **Framework**: React 18 + Vite
- **Styling**: Tailwind CSS v4
- **Deployment**: Vercel
- **Key Features**:
  - Client-side text chunking
  - Dropbox OAuth integration
  - Position tracking for chunks

### Backend
- **Framework**: FastAPI
- **Deployment**: HuggingFace Spaces (Docker)
- **Key Features**:
  - Zero-storage embedding endpoint
  - Query-time Dropbox re-fetch
  - Multi-provider LLM cascade

### Vector Database
- **Service**: Pinecone Serverless
- **Index**: `rag-semantic-384`
- **Dimensions**: 384
- **Metric**: Cosine similarity

### Embeddings
- **Model**: `all-MiniLM-L6-v2` (sentence-transformers)
- **Dimensions**: 384
- **Processing**: Server-side (text discarded immediately)

### LLM Providers (Cascade)
1. **Gemini 2.5 Flash** (Primary)
2. **Groq** - llama-3.1-8b-instant (Fallback 1)
3. **OpenRouter** - Mistral 7B (Fallback 2)

---

## Component Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                         FRONTEND (React)                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │   Sidebar    │  │  QueryPanel  │  │   App.jsx    │          │
│  │              │  │              │  │              │          │
│  │ - CloudConnect  │ - Search UI   │ - State mgmt  │          │
│  │ - File select   │ - Results     │ - Token flow  │          │
│  │ - Index button  │ - Citations   │ - Privacy UI  │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│                                                                  │
│  ┌──────────────────────────────────────────────────┐           │
│  │                  API Layer                        │           │
│  │  chunker.js  │  dropbox.js  │  client.js         │           │
│  └──────────────────────────────────────────────────┘           │
│                              │                                   │
└──────────────────────────────┼───────────────────────────────────┘
                               │ HTTPS
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                        BACKEND (FastAPI)                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌────────────────────────────────────────────────────┐         │
│  │                   API Routes                        │         │
│  │                                                     │         │
│  │  POST /embed-chunks    - Generate embeddings        │         │
│  │  POST /query-secure    - Zero-storage query         │         │
│  │  POST /dropbox/token   - OAuth token exchange       │         │
│  │  POST /dropbox/folder  - List folder contents       │         │
│  │  POST /dropbox/file    - Download file content      │         │
│  │  DELETE /clear-index   - Clear Pinecone index       │         │
│  └────────────────────────────────────────────────────┘         │
│                              │                                   │
└──────────────────────────────┼───────────────────────────────────┘
                               │
              ┌────────────────┼────────────────┐
              │                │                │
              ▼                ▼                ▼
        ┌──────────┐    ┌──────────┐    ┌──────────┐
        │ Pinecone │    │ Dropbox  │    │   LLM    │
        │ (vectors)│    │  (files) │    │ Providers│
        └──────────┘    └──────────┘    └──────────┘
```

---

## Data Flow: Indexing

```python
# 1. User selects files in browser
files = [
    {id: "abc123", name: "contract.pdf", path: "/Documents/contract.pdf"}
]

# 2. Files fetched from Dropbox (via backend proxy)
content = await fetch("/api/dropbox/file", {path: file.path, access_token})

# 3. Text chunked CLIENT-SIDE with position tracking
chunks = chunkText(content, {chunkSize: 1000, overlap: 100})
# Result:
# {text: "...", startChar: 0, endChar: 1000}
# {text: "...", startChar: 900, endChar: 1900}

# 4. Chunks sent to backend for embedding
await fetch("/api/embed-chunks", {
    chunks: [{
        text: "...",  // Used for embedding only
        metadata: {
            filename: "contract.pdf",
            filePath: "/Documents/contract.pdf",
            fileId: "abc123",
            startChar: 0,
            endChar: 1000
        }
    }]
})

# 5. Backend generates embeddings, stores in Pinecone
# TEXT IS IMMEDIATELY DISCARDED
pinecone.upsert({
    id: "abc123::0",
    values: [0.123, -0.456, ...],  # 384-dim embedding
    metadata: {
        filename: "contract.pdf",
        file_path: "/Documents/contract.pdf",
        file_id: "abc123",
        start_char: 0,
        end_char: 1000
        # NO TEXT STORED
    }
})
```

---

## Data Flow: Query

```python
# 1. User submits query with access token
request = {
    query: "What is the payment term?",
    access_token: "user_dropbox_token"
}

# 2. Generate query embedding
query_embedding = sentence_transformer.encode(query)

# 3. Search Pinecone
results = pinecone.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)
# Returns: file paths + positions (NO TEXT)

# 4. Re-fetch files from USER'S Dropbox
for file_path in unique_file_paths:
    content = dropbox.download(file_path, access_token)

    # 5. Extract chunks using stored positions
    for chunk in chunks_from_file:
        text = content[chunk.start_char:chunk.end_char]

# 6. Build prompt with re-fetched text
prompt = f"""
Context:
1. {chunk1_text}
2. {chunk2_text}

Question: {query}
"""

# 7. Call LLM
answer = llm.generate(prompt)

# 8. Return answer (text never stored)
return {answer, citations}
```

---

## Security & Privacy

### User Control
- **OAuth Scopes**: Read-only access to user-selected files
- **Token Storage**: Access token stored only in browser session
- **Revocation**: Disconnect Dropbox = immediate access revocation

### Server Security
- **No Persistent Storage**: Text never written to disk or database
- **Memory Only**: Text exists in memory only during processing
- **Immediate Purge**: Explicit deletion after embedding generation

### Data Protection
- **Embeddings**: One-way transformation, cannot reconstruct text
- **Positions**: Only useful with original file access
- **File Paths**: Dropbox paths, require valid access token

---

## Deployment

### Frontend (Vercel)
- Automatic deploys from GitHub
- Environment: `VITE_API_URL` pointing to backend

### Backend (HuggingFace Spaces)
- Docker-based deployment
- Environment variables for API keys:
  - `PINECONE_API_KEY`
  - `DROPBOX_APP_KEY`
  - `DROPBOX_APP_SECRET`
  - `GEMINI_API_KEY`
  - `GROQ_API_KEY`

---

## References

- **Live Demo**: https://rag-document-assistant.vercel.app/
- **Backend API**: https://vn6295337-rag-document-assistant.hf.space/
- **GitHub**: https://github.com/vn6295337/RAG-document-assistant