# 📚 Antigravity Notebook

**A NotebookLM clone powered by Apple's CLaRa-7B-Instruct for infinite context reasoning**

Antigravity Notebook lets you create "Notebooks" where you upload multiple disparate sources (PDFs, URLs, text) and have an AI reason across **all of them simultaneously** using CLaRa's latent compression technology.

## 🌟 Key Features

### The "Infinite Context" Strategy

- **16x Compression**: CLaRa compresses text into latent representations, reducing context usage by ~16x
- **Whole-Notebook Reasoning**: When all sources fit in context (32k tokens), the AI reads **everything**
- **Smart Retrieval**: For larger notebooks, intelligently selects the most relevant sources
- **Multi-Modal Ingestion**: Support for PDFs, URLs, and plain text

### NotebookLM-Style Interface

- **Notebook Organization**: Group related sources into project notebooks
- **Source Management**: Easy upload, URL scraping, and text input
- **Memory Usage Meter**: Visual gauge showing context utilization
- **Citation Tracking**: See which sources were used for each response

## 🏗️ Architecture

```
┌─────────────────────────────────────────────────────────┐
│                      Streamlit UI                       │
│    (NotebookLM-style interface with sidebar + chat)     │
└────────────────────┬────────────────────────────────────┘
                     │
                     ↓
┌─────────────────────────────────────────────────────────┐
│                    FastAPI Backend                      │
│                                                         │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐     │
│  │  Notebooks   │ │   Sources    │ │     Chat     │     │
│  │   Router     │ │   Router     │ │   Router     │     │
│  └──────────────┘ └──────────────┘ └──────────────┘     │
└────────────────────┬────────────────────────────────────┘
                     │
        ┌────────────┴─────────────┬──────────────┐
        ↓                          ↓              ↓
┌───────────────┐  ┌──────────────────┐  ┌──────────────┐
│   CLaRa-7B    │  │  ContextManager  │  │   Storage    │
│  (Compress &  │  │  (Whole-Context  │  │   Service    │
│   Generate)   │  │    Strategy)     │  │  (Tensors)   │
└───────────────┘  └──────────────────┘  └──────────────┘
        ↓                  ↓                    ↓
┌─────────────────────────────────────────────────────────┐
│                       PostgreSQL                        │
│  (Notebooks → Sources → LatentTensors → ChatMessages)   │
└─────────────────────────────────────────────────────────┘
```

## 🚀 Quick Start

### Prerequisites

- Python 3.9+
- Docker & Docker Compose (for PostgreSQL)
- CUDA-capable GPU (recommended; 16 GB+ VRAM for CLaRa-7B)

### Installation

1. **Clone the repository**
   ```bash
   git clone <repo-url>
   cd antigravity-notebook
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Set up environment**
   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

4. **Start PostgreSQL**
   ```bash
   docker-compose up -d
   ```

5. **Initialize the database**
   ```bash
   python -m backend.database
   ```

6. **Start the backend**
   ```bash
   python -m backend.main
   ```

7. **Start the frontend** (in a new terminal)
   ```bash
   streamlit run frontend/app_notebook.py
   ```

8. **Open your browser**
   - Frontend: http://localhost:8501
   - API docs: http://localhost:8000/docs

## 📖 Usage

### Creating a Notebook

1. Open the Streamlit UI
2. Click "Create New Notebook" in the sidebar
3. Enter a name and description
4. Click "Create Notebook"

### Adding Sources

**Upload PDF:**
1. Select your notebook
2. Go to "Add Source" → "PDF" tab
3. Upload your PDF file
4. Wait for processing (CLaRa compression)

**Add URL:**
1. Select your notebook
2. Go to "Add Source" → "URL" tab
3. Paste the URL
4. Optionally add a custom title
5. Click "Add URL"

**Add Text:**
1. Select your notebook
2. Go to "Add Source" → "Text" tab
3. Enter a title and paste your text
4. Click "Add Text"

### Querying Your Notebook

1. Select a notebook with sources
2. Type your question in the chat input
3. The AI reasons across ALL your sources
4. View the response and see which sources were cited

## 🧠 How It Works

### Latent Compression

When you add a source:

1. Text is extracted (PDF/URL/text)
2. The text is split into 2048-token chunks
3. Each chunk is compressed by CLaRa into a latent tensor (~128 tokens)
4. Latent tensors are saved to disk
5. Metadata is stored in PostgreSQL
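A minimal sketch of this ingestion bookkeeping, using the chunk size and 16x ratio from above. The CLaRa compression call itself is stubbed out here, and word-splitting stands in for real tokenization; names like `LatentSegment` are illustrative, not the project's actual classes:

```python
from dataclasses import dataclass
from typing import List

CHUNK_TOKENS = 2048        # chunk size used during ingestion
COMPRESSION_RATIO = 16     # CLaRa's ~16x latent compression

@dataclass
class LatentSegment:
    segment_index: int
    token_count: int       # latent tokens this segment will occupy in context

def chunk_words(words: List[str], size: int = CHUNK_TOKENS) -> List[List[str]]:
    """Split a source into fixed-size chunks (words stand in for tokens here)."""
    return [words[i:i + size] for i in range(0, len(words), size)]

def ingest_source(text: str) -> List[LatentSegment]:
    """Compute the latent-space bookkeeping for one source.

    The real pipeline would also run each chunk through CLaRa and write the
    resulting tensor to LATENT_TENSOR_DIR; that step is omitted here.
    """
    segments = []
    for i, chunk in enumerate(chunk_words(text.split())):
        # ~128 latent tokens for a full 2048-token chunk
        latent = max(1, len(chunk) // COMPRESSION_RATIO)
        segments.append(LatentSegment(segment_index=i, token_count=latent))
    return segments
```

A 5,000-token source, for example, yields three segments: two full chunks of 128 latent tokens each and a 904-token remainder compressed to 56.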
### Context Management

When you query a notebook:

1. The ContextManager fetches ALL latent tensors for the notebook
2. It calculates the total token count
3. **If ≤ 32k tokens**: stacks ALL tensors → **whole-notebook reasoning**
4. **If > 32k tokens**: ranks tensors by relevance and selects the top N → **selective retrieval**
5. CLaRa generates a response using the selected context
6. The answer is returned with source citations

## 🛠️ API Endpoints

### Notebooks

- `POST /notebooks/` - Create notebook
- `GET /notebooks/` - List notebooks
- `GET /notebooks/{id}` - Get notebook details
- `GET /notebooks/{id}/stats` - Get context usage stats
- `PATCH /notebooks/{id}` - Update notebook
- `DELETE /notebooks/{id}` - Delete notebook

### Sources

- `POST /sources/notebooks/{id}/sources/upload` - Upload PDF
- `POST /sources/notebooks/{id}/sources/url` - Add URL
- `POST /sources/notebooks/{id}/sources/text` - Add text
- `GET /sources/notebooks/{id}/sources` - List sources
- `DELETE /sources/{id}` - Delete source

### Chat

- `POST /chat/notebooks/{id}/chat` - Query notebook
- `GET /chat/notebooks/{id}/messages` - Get chat history
- `DELETE /chat/notebooks/{id}/messages` - Clear chat history

## 📊 Database Schema

```sql
notebooks
├── id (UUID)
├── name
├── description
├── created_at
└── updated_at

sources
├── id (UUID)
├── notebook_id (FK)
├── source_type (pdf|url|text)
├── filename
├── url
├── content_hash
└── metadata (JSONB)

latent_tensors
├── id (UUID)
├── source_id (FK)
├── tensor_path
├── segment_index
├── token_count
└── metadata (JSONB)

chat_messages
├── id (UUID)
├── notebook_id (FK)
├── role (user|assistant)
├── content
└── sources_used (JSONB)
```

## ⚙️ Configuration

Edit `.env` to configure:

```env
# Database
POSTGRES_USER=antigravity
POSTGRES_PASSWORD=antigravity123
POSTGRES_DB=antigravity_db

# CLaRa Model
MODEL_NAME=apple/CLaRa-7B-Instruct
DEVICE=cuda  # or cpu
MAX_CONTEXT_TOKENS=32768
COMPRESSION_RATIO=16

# Storage
LATENT_TENSOR_DIR=./data/latent_tensors
```
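The `MAX_CONTEXT_TOKENS` budget drives the whole-notebook vs. selective-retrieval decision described earlier. A minimal sketch of that choice, where the relevance scoring is an assumption (the actual ranking method is not specified in this README):

```python
from typing import List, Optional, Tuple

MAX_CONTEXT_TOKENS = 32768  # mirrors MAX_CONTEXT_TOKENS in .env

def plan_context(token_counts: List[int],
                 relevance: Optional[List[float]] = None) -> Tuple[str, List[int]]:
    """Choose between whole-notebook reasoning and selective retrieval.

    token_counts: latent token count of each stored tensor.
    relevance: higher = more relevant (hypothetical scores; how the real
    ContextManager ranks tensors is not documented here).
    """
    if sum(token_counts) <= MAX_CONTEXT_TOKENS:
        # Everything fits: read the entire notebook.
        return "whole_notebook", list(range(len(token_counts)))
    # Over budget: greedily keep the most relevant tensors that still fit.
    order = sorted(range(len(token_counts)),
                   key=lambda i: relevance[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + token_counts[i] <= MAX_CONTEXT_TOKENS:
            chosen.append(i)
            used += token_counts[i]
    return "selective", sorted(chosen)
```

With ten 128-token tensors the total (1,280) is far under budget, so the whole notebook is used; with tensors of 20,000, 20,000, and 1,000 latent tokens, only the most relevant subset that fits is selected.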
```env
# API
API_PORT=8000
```

## 🎯 Performance

- **Ingestion**: ~30 s for a 50-page PDF
- **Query response**: ~10 s for a full notebook
- **Capacity**: 10-20 average-sized books per notebook

## 🔬 Technical Details

### Why CLaRa?

CLaRa (Compressing Long-range Attention) uses latent compression to represent text in a much smaller space, enabling:

- a ~16x compression ratio
- preservation of semantic information
- cross-document reasoning

### Context Budget

- **Standard**: 32,768 tokens (latent space)
- **Equivalent to**: ~500k original text tokens (with 16x compression)
- **Example**: can fit 10-20 full books simultaneously

## 🤝 Contributing

Contributions are welcome! Please open an issue or PR.

## 📝 License

MIT License - see the LICENSE file.

## 🙏 Acknowledgments

- Apple for CLaRa-7B-Instruct
- Google for NotebookLM inspiration
- Hugging Face for model hosting

---

**Built with ❤️ by the Antigravity Team**