RAG Document Assistant - Architecture
Version: 2.0 | Last Updated: January 2026 | Focus: Zero-Storage Privacy Architecture
System Overview
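A privacy-first RAG (Retrieval-Augmented Generation) system in which no document text is ever persisted on our servers. Documents are chunked client-side, chunk text is held in server memory only long enough to compute embeddings, and the text needed to answer a question is re-fetched from the user's cloud storage at query time.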
Key Characteristics
- Zero-Storage: Document text never persists on servers
- Client-Side Processing: Chunking happens in the browser
- Query-Time Re-fetch: Text retrieved from user's Dropbox for each search
- User Control: Disconnect cloud storage to revoke all access
Privacy Architecture
INDEXING (one-time setup)
═════════════════════════════════════════════════════════════════
User's Browser                          Our Server
──────────────                          ──────────
1. Connect Dropbox (OAuth)
        │
        ▼
2. Select files from Dropbox
        │
        ▼
3. Files loaded in browser
   (never sent to server)
        │
        ▼
4. Text chunked locally ──────────────► 5. Generate embeddings
   with position tracking                  (384-dim vectors)
        │                                        │
        ▼                                        ▼
6. Original text                        7. Store in Pinecone:
   PURGED from memory                      - Embeddings (irreversible)
                                           - File paths
                                           - Chunk positions
                                           - NO TEXT
═════════════════════════════════════════════════════════════════
QUERY TIME (every search)
═════════════════════════════════════════════════════════════════
User's Question                         Our Server
───────────────                         ──────────
"What does the contract say?"
        │
        ▼
        ──────────────────────────────► 1. Generate query embedding
                                                 │
                                                 ▼
                                        2. Search Pinecone
                                           (find similar chunks)
                                                 │
                                                 ▼
                                        3. Get file paths + positions
                                                 │
                                                 ▼
                                        4. Re-fetch from USER'S Dropbox
                                           using their access token
                                                 │
                                                 ▼
                                        5. Extract chunk text
                                           using stored positions
                                                 │
                                                 ▼
                                        6. Send to LLM for answer
                                                 │
                                                 ▼
Answer + Citations ◄─────────────────── 7. Return response
                                           (text never stored)
═════════════════════════════════════════════════════════════════
What Gets Stored
| Data | Stored? | Where | Reversible? |
|---|---|---|---|
| Document files | No | User's Dropbox only | N/A |
| Document text | No | Never stored | N/A |
| Embeddings | Yes | Pinecone | No (one-way transform) |
| File paths | Yes | Pinecone metadata | N/A |
| Chunk positions | Yes | Pinecone metadata | N/A |
| User queries | No | Not logged | N/A |
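To make the table concrete, here is a minimal sketch of how a stored record could be assembled so that text cannot slip into the metadata. The helper `build_vector_record` and the allow-list are hypothetical, not taken from the project's code; the field names and the `fileId::chunkIndex` ID format follow the examples used in this document.

```python
from typing import Any

# Only these metadata fields may be stored alongside an embedding.
ALLOWED_METADATA_KEYS = {"filename", "file_path", "file_id", "start_char", "end_char"}


def build_vector_record(file_id: str, chunk_index: int,
                        embedding: list[float],
                        metadata: dict[str, Any]) -> dict[str, Any]:
    """Assemble a Pinecone-style record that carries no document text."""
    unexpected = set(metadata) - ALLOWED_METADATA_KEYS
    if unexpected:
        # Reject anything outside the allow-list, including a stray "text" field.
        raise ValueError(f"metadata keys not allowed: {sorted(unexpected)}")
    return {
        "id": f"{file_id}::{chunk_index}",
        "values": list(embedding),
        "metadata": dict(metadata),
    }


record = build_vector_record(
    "abc123", 0, [0.123, -0.456],
    {"filename": "contract.pdf", "file_path": "/Documents/contract.pdf",
     "file_id": "abc123", "start_char": 0, "end_char": 1000},
)
print(record["id"])  # abc123::0
```

An allow-list fails closed: a new field has to be added deliberately before it can reach the index.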
Technology Stack
Frontend
- Framework: React 18 + Vite
- Styling: Tailwind CSS v4
- Deployment: Vercel
- Key Features:
  - Client-side text chunking
  - Dropbox OAuth integration
  - Position tracking for chunks
Backend
- Framework: FastAPI
- Deployment: HuggingFace Spaces (Docker)
- Key Features:
  - Zero-storage embedding endpoint
  - Query-time Dropbox re-fetch
  - Multi-provider LLM cascade
Vector Database
- Service: Pinecone Serverless
- Index: rag-semantic-384
- Dimensions: 384
- Metric: Cosine similarity
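For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below is dependency-free and purely illustrative (the function name is ours); Pinecone computes this internally when the index metric is cosine.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because the score depends only on direction, a chunk and a query can match closely even when their vectors differ in magnitude.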
Embeddings
- Model: all-MiniLM-L6-v2 (sentence-transformers)
- Dimensions: 384
- Processing: Server-side (text discarded immediately)
LLM Providers (Cascade)
- Gemini 2.5 Flash (Primary)
- Groq - llama-3.1-8b-instant (Fallback 1)
- OpenRouter - Mistral 7B (Fallback 2)
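The cascade order above could be implemented roughly as follows. This is a sketch under assumptions: the provider callables are stand-ins rather than the real Gemini, Groq, or OpenRouter clients, and `generate_with_fallback` is a hypothetical name.

```python
from typing import Callable, Sequence

# A provider is a (name, callable) pair; the callable maps a prompt to an answer.
Provider = tuple[str, Callable[[str], str]]


def generate_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order and return (provider_name, answer) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def failing_primary(prompt: str) -> str:
    raise TimeoutError("rate limited")  # simulated outage of the primary


def working_fallback(prompt: str) -> str:
    return f"answer to: {prompt}"


name, answer = generate_with_fallback(
    "What is the payment term?",
    [("gemini", failing_primary), ("groq", working_fallback)],
)
print(name)  # groq
```

Collecting the per-provider errors keeps the final exception informative without logging the prompt itself.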
Component Architecture
┌─────────────────────────────────────────────────────────────────┐
│                        FRONTEND (React)                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌────────────────┐  ┌────────────────┐  ┌────────────────┐    │
│   │ Sidebar        │  │ QueryPanel     │  │ App.jsx        │    │
│   │                │  │                │  │                │    │
│   │ - CloudConnect │  │ - Search UI    │  │ - State mgmt   │    │
│   │ - File select  │  │ - Results      │  │ - Token flow   │    │
│   │ - Index button │  │ - Citations    │  │ - Privacy UI   │    │
│   └────────────────┘  └────────────────┘  └────────────────┘    │
│                                                                 │
│   ┌────────────────────────────────────────────────────────┐    │
│   │                       API Layer                        │    │
│   │          chunker.js │ dropbox.js │ client.js           │    │
│   └────────────────────────────────────────────────────────┘    │
│                                │                                │
└────────────────────────────────┼────────────────────────────────┘
                                 │ HTTPS
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        BACKEND (FastAPI)                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌────────────────────────────────────────────────────────┐    │
│   │                       API Routes                       │    │
│   │                                                        │    │
│   │  POST    /embed-chunks    - Generate embeddings        │    │
│   │  POST    /query-secure    - Zero-storage query         │    │
│   │  POST    /dropbox/token   - OAuth token exchange       │    │
│   │  POST    /dropbox/folder  - List folder contents       │    │
│   │  POST    /dropbox/file    - Download file content      │    │
│   │  DELETE  /clear-index     - Clear Pinecone index       │    │
│   └────────────────────────────────────────────────────────┘    │
│                                │                                │
└────────────────────────────────┼────────────────────────────────┘
                                 │
                ┌────────────────┼────────────────┐
                │                │                │
                ▼                ▼                ▼
          ┌───────────┐    ┌───────────┐    ┌───────────┐
          │ Pinecone  │    │  Dropbox  │    │    LLM    │
          │ (vectors) │    │  (files)  │    │ Providers │
          └───────────┘    └───────────┘    └───────────┘
Data Flow: Indexing
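// --- Frontend (JavaScript) ---

// 1. User selects files in the browser
const files = [
  { id: "abc123", name: "contract.pdf", path: "/Documents/contract.pdf" }
];
const file = files[0];

// 2. File content is fetched from Dropbox (via the backend proxy)
const fileResponse = await fetch("/api/dropbox/file", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ path: file.path, access_token })
});
const content = await fileResponse.text(); // response format assumed to be plain text

// 3. Text is chunked CLIENT-SIDE with position tracking
const chunks = chunkText(content, { chunkSize: 1000, overlap: 100 });
// Result:
//   { text: "...", startChar: 0, endChar: 1000 }
//   { text: "...", startChar: 900, endChar: 1900 }

// 4. Chunks are sent to the backend for embedding
await fetch("/api/embed-chunks", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    chunks: [{
      text: "...", // Used for embedding only
      metadata: {
        filename: "contract.pdf",
        filePath: "/Documents/contract.pdf",
        fileId: "abc123",
        startChar: 0,
        endChar: 1000
      }
    }]
  })
});

# --- Backend (Python) ---

# 5. Backend generates embeddings and stores them in Pinecone
# TEXT IS IMMEDIATELY DISCARDED
# `index` is the Pinecone index handle for rag-semantic-384
index.upsert(vectors=[{
    "id": "abc123::0",
    "values": [0.123, -0.456, ...],  # 384-dim embedding (truncated here)
    "metadata": {
        "filename": "contract.pdf",
        "file_path": "/Documents/contract.pdf",
        "file_id": "abc123",
        "start_char": 0,
        "end_char": 1000,
        # NO TEXT STORED
    },
}])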
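The position tracking in step 3 can be illustrated with a short sketch. The real chunker runs in the browser (chunker.js); this Python version only mirrors the logic under the same parameters (chunk size 1000, overlap 100), and the name `chunk_text` and its output keys are assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[dict]:
    """Split text into overlapping chunks, recording each chunk's character offsets."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append({"text": text[start:end], "start_char": start, "end_char": end})
        if end == len(text):
            break  # the final chunk reached the end of the document
        start += step
    return chunks


document = "a" * 2000
chunks = chunk_text(document)
print([(c["start_char"], c["end_char"]) for c in chunks])
# [(0, 1000), (900, 1900), (1800, 2000)]
```

Only the offsets need to survive indexing: slicing the original text with `start_char` and `end_char` reproduces each chunk exactly.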
Data Flow: Query
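# 1. User submits query with access token
request = {
    "query": "What is the payment term?",
    "access_token": "user_dropbox_token",
}
query = request["query"]
access_token = request["access_token"]

# 2. Generate query embedding
query_embedding = sentence_transformer.encode(query).tolist()

# 3. Search Pinecone
results = pinecone.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
)
# Returns: file paths + positions (NO TEXT)
matches = [match["metadata"] for match in results["matches"]]

# 4. Re-fetch files from USER'S Dropbox (each file only once)
unique_file_paths = {meta["file_path"] for meta in matches}
contents = {
    file_path: dropbox.download(file_path, access_token)
    for file_path in unique_file_paths
}

# 5. Extract chunks using stored positions
chunk_texts = [
    contents[meta["file_path"]][int(meta["start_char"]):int(meta["end_char"])]
    for meta in matches
]

# 6. Build prompt with re-fetched text
context = "\n".join(f"{i}. {text}" for i, text in enumerate(chunk_texts, start=1))
prompt = f"""
Context:
{context}
Question: {query}
"""

# 7. Call LLM
answer = llm.generate(prompt)

# 8. Return answer (text never stored)
citations = [{"file_path": meta["file_path"]} for meta in matches]
return {"answer": answer, "citations": citations}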
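Steps 4 and 5 are the core of the zero-storage design, so a self-contained sketch may help. It assumes the matches have been reduced to plain metadata dictionaries and takes the download function as a parameter; `extract_chunk_texts` and `fake_download` are hypothetical names, and no real Dropbox call is made.

```python
from typing import Callable


def extract_chunk_texts(matches: list[dict], download: Callable[[str], str]) -> list[dict]:
    """Re-fetch each referenced file once and slice out the matched chunks, in rank order."""
    contents: dict[str, str] = {}  # per-request cache, discarded when the function returns
    extracted = []
    for match in matches:
        path = match["file_path"]
        if path not in contents:
            contents[path] = download(path)  # one download per distinct file
        # Numeric metadata may come back as floats, so cast before slicing.
        start, end = int(match["start_char"]), int(match["end_char"])
        extracted.append({"file_path": path, "start_char": start, "end_char": end,
                          "text": contents[path][start:end]})
    return extracted


fake_dropbox = {"/Documents/contract.pdf": "Payment is due within 30 days of invoice."}
calls = []


def fake_download(path: str) -> str:
    calls.append(path)  # record calls so the single-download behaviour is visible
    return fake_dropbox[path]


matches = [
    {"file_path": "/Documents/contract.pdf", "start_char": 0.0, "end_char": 7.0},
    {"file_path": "/Documents/contract.pdf", "start_char": 22.0, "end_char": 29.0},
]
chunks = extract_chunk_texts(matches, fake_download)
print([c["text"] for c in chunks])  # ['Payment', '30 days']
```

One caveat worth checking in a real deployment: JavaScript string offsets count UTF-16 code units while Python slices count code points, so offsets computed in the browser and applied on the server can disagree for text containing characters outside the Basic Multilingual Plane. Stored offsets also become stale if the file is edited in Dropbox after indexing.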
Security & Privacy
User Control
- OAuth Scopes: Read-only access to user-selected files
- Token Storage: Access token stored only in browser session
- Revocation: Disconnect Dropbox = immediate access revocation
Server Security
- No Persistent Storage: Text never written to disk or database
- Memory Only: Text exists in memory only during processing
- Immediate Purge: Explicit deletion after embedding generation
Data Protection
- Embeddings: One-way transformation, cannot reconstruct text
- Positions: Only useful with original file access
- File Paths: Dropbox paths, require valid access token
Deployment
Frontend (Vercel)
- Automatic deploys from GitHub
- Environment: VITE_API_URL pointing to the backend
Backend (HuggingFace Spaces)
- Docker-based deployment
- Environment variables for API keys: PINECONE_API_KEY, DROPBOX_APP_KEY, DROPBOX_APP_SECRET, GEMINI_API_KEY, GROQ_API_KEY
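A backend like this usually benefits from failing fast when a key is missing. The following is a minimal sketch, not the project's actual startup code; the helper name `missing_env_vars` is hypothetical and the variable list simply repeats the names given above.

```python
import os
from typing import Mapping, Optional

REQUIRED_ENV_VARS = (
    "PINECONE_API_KEY",
    "DROPBOX_APP_KEY",
    "DROPBOX_APP_SECRET",
    "GEMINI_API_KEY",
    "GROQ_API_KEY",
)


def missing_env_vars(environ: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


print(missing_env_vars({"PINECONE_API_KEY": "x"}))
# ['DROPBOX_APP_KEY', 'DROPBOX_APP_SECRET', 'GEMINI_API_KEY', 'GROQ_API_KEY']
```

Calling `missing_env_vars()` with no argument checks the real process environment; raising an error at startup when the list is non-empty surfaces misconfiguration before the first request.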