# NotebookLM Clone - Handoff Document
## Stack
- Auth: Hugging Face OAuth (`gr.LoginButton`, `user_id` = HF username)
- Metadata: Supabase (notebooks, messages, artifacts)
- Files: Supabase Storage bucket `notebooklm`
- Vectors: Supabase pgvector (`chunks` table)
## Setup
### 1. Supabase
- Run `db/schema.sql` in the SQL Editor
- Create Storage bucket: Storage → New bucket → name `notebooklm`, set public/private as needed
- Add RLS policies for the bucket if using private access
### 2. HF Space
- Add `hf_oauth: true` in README (already done)
- Add `SUPABASE_URL`, `SUPABASE_KEY` (service role) as Space secrets
- Optional: `SUPABASE_BUCKET` (default: `notebooklm`)
### 3. Local
- `HF_TOKEN` env var or `huggingface-cli login` (required for OAuth mock)
- `.env` with `SUPABASE_URL`, `SUPABASE_KEY`
- `pip install gradio[oauth]` (or `itsdangerous`) for `LoginButton`
## Storage (Supabase Storage)

```python
from backend.storage import get_sources_path, save_file, load_file

# Ingestion: save an uploaded PDF
prefix = get_sources_path(user_id, notebook_id)  # "user_id/notebook_id/sources"
path = f"{prefix}/document.pdf"
save_file(path, file_bytes)

# Load it back
data = load_file(path)
```

Paths: `{user_id}/{notebook_id}/{sources|embeddings|chats|artifacts}/(unknown)`
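The path convention above can be expressed as a small helper. `build_path` below is a hypothetical illustration of the layout, not part of `backend/storage` (`get_sources_path` is the real API for the `sources` prefix):

```python
# Hypothetical helper mirroring the documented storage layout;
# backend/storage.get_sources_path covers the "sources" case.
VALID_KINDS = {"sources", "embeddings", "chats", "artifacts"}

def build_path(user_id: str, notebook_id: str, kind: str, filename: str) -> str:
    """Return '{user_id}/{notebook_id}/{kind}/{filename}'."""
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown kind: {kind}")
    return f"{user_id}/{notebook_id}/{kind}/{filename}"

print(build_path("alice", "nb-1", "sources", "document.pdf"))
# alice/nb-1/sources/document.pdf
```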
## Notebook API
- `create_notebook(user_id, name)`
- `list_notebooks(user_id)`
- `rename_notebook(user_id, notebook_id, new_name)`
- `delete_notebook(user_id, notebook_id)`
## Chat (Supabase `messages` table)
- `save_message(notebook_id, role, content)`
- `load_chat(notebook_id)`
## Embeddings (pgvector)
Table `chunks`: `id`, `notebook_id`, `source_id`, `content`, `embedding vector(1536)`, `metadata`, `created_at`.
Ingestion team: embed chunks and insert them into `chunks`; retrieval filters by `notebook_id`.
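One way to keep inserts consistent with the schema is to shape and validate each row before calling the Supabase client. `make_chunk_row` is a hypothetical sketch, not an existing helper:

```python
EMBEDDING_DIM = 1536  # must match vector(1536) in db/schema.sql

def make_chunk_row(notebook_id, source_id, content, embedding, metadata=None):
    """Shape one row for the chunks table, validating the embedding dimension.

    Hypothetical helper: the returned dict is what you would pass to
    supabase.table("chunks").insert(...).
    """
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dims, got {len(embedding)}")
    return {
        "notebook_id": notebook_id,
        "source_id": source_id,
        "content": content,
        "embedding": list(embedding),
        "metadata": metadata or {},
    }
```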
## Handover: Ingestion & RAG Builders
### Where to Write Your Code
| Responsibility | File / Location | Purpose |
|---|---|---|
| Ingestion | `backend/ingestion_service.py` (create this) | Parse uploaded files, chunk text, compute embeddings, insert into `chunks` |
| RAG | `backend/rag_service.py` (create this) | Embed query → similarity search → build context → call LLM → return answer |
| Storage | `backend/storage.py` (existing) | Save/load files in Supabase Storage; do not modify |
| Chat | `backend/chat_service.py` (existing) | Save/load messages; RAG calls `save_message` and `load_chat` |
| UI | `app.py` | Add upload component + chat interface; wire to ingestion and RAG |
### Ingestion Builder
Write your code in: `backend/ingestion_service.py`

Flow:
1. Receive: `user_id`, `notebook_id`, uploaded file bytes, and filename.
2. Save the raw file via storage:

   ```python
   from backend.storage import get_sources_path, save_file

   prefix = get_sources_path(user_id, notebook_id)  # → "user_id/notebook_id/sources"
   path = f"{prefix}/(unknown)"
   save_file(path, file_bytes)
   ```

3. Parse the file (PDF, DOCX, TXT, etc.) and extract text.
4. Chunk the text (e.g., 512-1024 tokens with overlap).
5. Compute embeddings (e.g., OpenAI `text-embedding-3-small` → 1536 dims, or compatible).
6. Insert rows into `chunks`:

   ```python
   supabase.table("chunks").insert({
       "notebook_id": notebook_id,
       "source_id": path,            # or your source identifier
       "content": chunk_text,
       "embedding": embedding_list,  # list of 1536 floats
       "metadata": {"page": 1, "chunk_idx": 0},  # optional
   }).execute()
   ```
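The chunking step could be sketched as below. This is a hypothetical character-based window (the 512-1024 *token* guidance would need a tokenizer such as `tiktoken`; sizes here are placeholders):

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping windows.

    Character-based sketch; swap in token counts for production.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # advance, keeping `overlap` chars shared
    return chunks
```

Each returned chunk would then be embedded and inserted as one `chunks` row.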
Integrate in app:
- Add `gr.File` or `gr.UploadButton` in `app.py` for the selected notebook.
- On upload, call `ingest_file(user_id, notebook_id, file_bytes, filename)` from your new service.

Existing helpers: `backend/storage` (`save_file`, `load_file`, `list_files`, `get_sources_path`).
### RAG Builder
Write your code in: `backend/rag_service.py`

Flow:
1. Receive: `notebook_id` and the user query.
2. Embed the query (same model/dims as ingestion, e.g., 1536).
3. Similarity search in `chunks`:

   ```python
   # Supabase pgvector example (cosine similarity)
   result = supabase.rpc(
       "match_chunks",
       {"query_embedding": embedding, "match_count": 5, "p_notebook_id": notebook_id},
   ).execute()
   ```

   - You must add a Supabase function `match_chunks` that filters by `notebook_id` and runs vector similarity (or use raw SQL).
   - Alternative: use `supabase.table("chunks").select("*").eq("notebook_id", notebook_id)` and do the similarity in Python (less efficient).

4. Build context from the top-k chunks.
5. Call an LLM (Hugging Face Inference API, OpenAI, etc.) with context + history.
6. Persist messages via `chat_service`:

   ```python
   from backend.chat_service import save_message, load_chat

   save_message(notebook_id, "user", query)
   save_message(notebook_id, "assistant", answer)
   ```
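The "similarity in Python" alternative mentioned above could look like this: pure-Python cosine similarity over rows already filtered by `notebook_id` (`top_k_chunks` is a hypothetical name; fine for small notebooks, not for large ones):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_embedding, rows, k=5):
    """rows: dicts with 'content' and 'embedding' (e.g. from the chunks select)."""
    scored = [(cosine_similarity(query_embedding, r["embedding"]), r) for r in rows]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [r for _, r in scored[:k]]
```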
Integrate in app:
- Add a chat block in `app.py` (`Chatbot` component) tied to `selected_notebook_id`.
- On submit: call `rag_chat(notebook_id, query, chat_history)`, which returns the assistant reply; update history using `load_chat(notebook_id)` or append locally.

Existing helpers: `backend/chat_service` (`save_message`, `load_chat`), `backend/db` (`supabase`).
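A minimal shape for the submit handler (a sketch: `rag_chat` is the function you implement, passed in here as a parameter for testability; in `app.py` you would import it from `backend.rag_service` and wire this to the Chatbot's submit event):

```python
def respond(query, history, notebook_id, rag_chat):
    """Submit handler sketch: call RAG, append the (user, assistant) turn locally.

    history is a list of (user, assistant) tuples, as gr.Chatbot expects.
    """
    answer = rag_chat(notebook_id, query, history)
    return history + [(query, answer)]
```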
## Schema Reference (for both)

```sql
-- chunks table (db/schema.sql)
chunks (
  id uuid,
  notebook_id uuid,
  source_id text,
  content text,
  embedding vector(1536),
  metadata jsonb,
  created_at timestamptz
)
```

Required: `embedding` must be 1536 dimensions (or update the schema if using a different model).
## Suggested RPC for RAG (optional)
Add this in the Supabase SQL Editor if you prefer server-side similarity:

```sql
create or replace function match_chunks(
  query_embedding vector(1536),
  match_count int,
  p_notebook_id uuid
)
returns table (id uuid, content text, metadata jsonb, similarity float)
language plpgsql as $$
begin
  return query
  select c.id, c.content, c.metadata,
         1 - (c.embedding <=> query_embedding) as similarity
  from chunks c
  where c.notebook_id = p_notebook_id
  order by c.embedding <=> query_embedding
  limit match_count;
end;
$$;
```

Ingestion writes to `chunks`; RAG reads via `match_chunks` or equivalent.