NotebookLMClone / DATABASE_SCHEMA.md
github-actions[bot]
Sync from GitHub e2e802be5157aa05d1251459f529eb7eb4242ef2
dba1a8e

Database Schema

This document reflects the active SQLAlchemy models in data/models.py.

Engine and Initialization

  • ORM: SQLAlchemy 2.x
  • Base class: data.db.Base
  • Default DB: sqlite:///./notebooklm.db
  • Initialization: data.db.init_db()

Relationship Overview

  • users 1:N notebooks
  • notebooks 1:N sources
  • notebooks 1:N chat_threads
  • chat_threads 1:N messages
  • messages 1:N message_citations
  • sources 1:N message_citations
  • notebooks 1:N artifacts

Tables

users

Columns:

  • id INTEGER PK
  • email VARCHAR(255) NOT NULL UNIQUE INDEX
  • display_name VARCHAR(255) NULL
  • avatar_url VARCHAR(1024) NULL
  • created_at DATETIME(timezone=True) NOT NULL

notebooks

Columns:

  • id INTEGER PK
  • owner_user_id INTEGER NOT NULL FK -> users.id ON DELETE CASCADE INDEX
  • title VARCHAR(255) NOT NULL
  • created_at DATETIME(timezone=True) NOT NULL
  • updated_at DATETIME(timezone=True) NOT NULL

sources

Columns:

  • id INTEGER PK
  • notebook_id INTEGER NOT NULL FK -> notebooks.id ON DELETE CASCADE INDEX
  • type VARCHAR(50) NOT NULL
  • title VARCHAR(255) NULL
  • original_name VARCHAR(1024) NULL
  • url VARCHAR(2048) NULL
  • storage_path VARCHAR(1024) NULL
  • status VARCHAR(50) NOT NULL
  • ingested_at DATETIME(timezone=True) NULL

chat_threads

Columns:

  • id INTEGER PK
  • notebook_id INTEGER NOT NULL FK -> notebooks.id ON DELETE CASCADE INDEX
  • title VARCHAR(255) NULL
  • created_at DATETIME(timezone=True) NOT NULL

messages

Columns:

  • id INTEGER PK
  • thread_id INTEGER NOT NULL FK -> chat_threads.id ON DELETE CASCADE INDEX
  • role VARCHAR(20) NOT NULL
  • content TEXT NOT NULL
  • created_at DATETIME(timezone=True) NOT NULL

message_citations

Columns:

  • id INTEGER PK
  • message_id INTEGER NOT NULL FK -> messages.id ON DELETE CASCADE INDEX
  • source_id INTEGER NOT NULL FK -> sources.id ON DELETE CASCADE INDEX
  • chunk_ref VARCHAR(255) NULL
  • quote TEXT NULL
  • score FLOAT NULL

artifacts

Columns:

  • id INTEGER PK
  • notebook_id INTEGER NOT NULL FK -> notebooks.id ON DELETE CASCADE INDEX
  • type VARCHAR(50) NOT NULL
  • title VARCHAR(255) NULL
  • status VARCHAR(50) NOT NULL
  • file_path VARCHAR(1024) NULL
  • metadata JSON NULL (mapped as artifact_metadata)
  • content TEXT NULL
  • error_message TEXT NULL
  • created_at DATETIME(timezone=True) NOT NULL
  • generated_at DATETIME(timezone=True) NULL

Notes

  • Ownership and isolation are anchored by notebooks.owner_user_id.
  • Child records are deleted via ON DELETE CASCADE.
  • Schema creation is currently handled with Base.metadata.create_all(...) (no Alembic yet).