| # Antigravity Notebook | |
| **A NotebookLM clone powered by Apple's CLaRa-7B-Instruct for infinite context reasoning** | |
| Antigravity Notebook enables you to create "Notebooks" where you can upload multiple disparate sources (PDFs, URLs, Text) and have an AI reason across **all of them simultaneously** using CLaRa's latent compression technology. | |
| ## Key Features | |
| ### The "Infinite Context" Strategy | |
| - **16x Compression**: CLaRa compresses text into latent representations, reducing context usage by ~16x | |
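| - **Whole-Notebook Reasoning**: When a notebook's compressed sources fit within the context budget (32k latent tokens), the model reads **all of them** in a single pass | |
| - **Smart Retrieval**: For notebooks that exceed the budget, ranks the compressed segments by relevance to the query and keeps only the top-ranked ones | |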
| - **Multi-Modal Ingestion**: Support for PDFs, URLs, and plain text | |
| ### NotebookLM-Style Interface | |
| - **Notebook Organization**: Group related sources into project notebooks | |
| - **Source Management**: Easy upload, URL scraping, and text input | |
| - **Memory Usage Meter**: Visual gauge showing context utilization | |
| - **Citation Tracking**: See which sources were used for each response | |
| ## Architecture | |
| ``` | |
| ┌─────────────────────────────────────────────────────────┐ | |
| │                      Streamlit UI                       │ | |
| │    (NotebookLM-style interface with sidebar + chat)     │ | |
| └────────────────────┬────────────────────────────────────┘ | |
|                      │ | |
|                      ▼ | |
| ┌─────────────────────────────────────────────────────────┐ | |
| │                     FastAPI Backend                     │ | |
| │                                                         │ | |
| │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │ | |
| │  │  Notebooks   │  │   Sources    │  │     Chat     │   │ | |
| │  │    Router    │  │    Router    │  │    Router    │   │ | |
| │  └──────────────┘  └──────────────┘  └──────────────┘   │ | |
| └────────────────────┬────────────────────────────────────┘ | |
|                      │ | |
|         ┌────────────┴───────┬────────────────────┐ | |
|         ▼                    ▼                    ▼ | |
| ┌───────────────┐   ┌──────────────────┐   ┌──────────────┐ | |
| │   CLaRa-7B    │   │  ContextManager  │   │   Storage    │ | |
| │  (Compress &  │   │  (Whole-Context  │   │   Service    │ | |
| │   Generate)   │   │    Strategy)     │   │  (Tensors)   │ | |
| └───────────────┘   └──────────────────┘   └──────────────┘ | |
|         │                    │                    │ | |
| ┌─────────────────────────────────────────────────────────┐ | |
| │                       PostgreSQL                        │ | |
| │  (Notebooks → Sources → LatentTensors → ChatMessages)   │ | |
| └─────────────────────────────────────────────────────────┘ | |
| ``` | |
| ## Quick Start | |
| ### Prerequisites | |
| - Python 3.9+ | |
| - Docker & Docker Compose (for PostgreSQL) | |
| - CUDA-capable GPU (recommended, 16GB+ VRAM for CLaRa-7B) | |
| ### Installation | |
| 1. **Clone the repository** | |
| ```bash | |
| git clone <your-repo-url> | |
| cd antigravity-notebook | |
| ``` | |
| 2. **Install dependencies** | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 3. **Set up environment** | |
| ```bash | |
| cp .env.example .env | |
| # Edit .env with your configuration | |
| ``` | |
| 4. **Start PostgreSQL** | |
| ```bash | |
| docker-compose up -d | |
| ``` | |
| 5. **Initialize database** | |
| ```bash | |
| python -m backend.database | |
| ``` | |
| 6. **Start the backend** | |
| ```bash | |
| python -m backend.main | |
| ``` | |
| 7. **Start the frontend** (in a new terminal) | |
| ```bash | |
| streamlit run frontend/app_notebook.py | |
| ``` | |
| 8. **Open your browser** | |
| - Frontend: http://localhost:8501 | |
| - API Docs: http://localhost:8000/docs | |
| ## Usage | |
| ### Creating a Notebook | |
| 1. Open the Streamlit UI | |
| 2. Click "Create New Notebook" in the sidebar | |
| 3. Enter a name and description | |
| 4. Click "Create Notebook" | |
| ### Adding Sources | |
| **Upload PDF:** | |
| 1. Select your notebook | |
| 2. Go to "Add Source" → "PDF" tab | |
| 3. Upload your PDF file | |
| 4. Wait for processing (CLaRa compression) | |
| **Add URL:** | |
| 1. Select your notebook | |
| 2. Go to "Add Source" → "URL" tab | |
| 3. Paste the URL | |
| 4. Optionally add a custom title | |
| 5. Click "Add URL" | |
| **Add Text:** | |
| 1. Select your notebook | |
| 2. Go to "Add Source" → "Text" tab | |
| 3. Enter a title and paste your text | |
| 4. Click "Add Text" | |
| ### Querying Your Notebook | |
| 1. Select a notebook with sources | |
| 2. Type your question in the chat input | |
| 3. The AI reasons across all your sources when they fit within the context budget, or across the most relevant ones otherwise | |
| 4. View the response and see which sources were cited | |
| ## How It Works | |
| ### Latent Compression | |
| When you add a source: | |
| 1. Text is extracted (PDF/URL/Text) | |
| 2. Split into 2048-token chunks | |
| 3. Each chunk is compressed by CLaRa into a latent tensor (~128 tokens) | |
| 4. Latent tensors are saved to disk | |
| 5. Metadata is stored in PostgreSQL | |
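The chunking and compression arithmetic in steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the project's ingestion code: `split_into_chunks` and `latent_token_count` are hypothetical helper names, the token list is a stand-in for real tokenizer output, and the assumption that a partial final chunk compresses to `ceil(len / 16)` latent tokens is ours.

```python
from typing import List, Sequence

CHUNK_TOKENS = 2048      # chunk size named in step 2
COMPRESSION_RATIO = 16   # mirrors COMPRESSION_RATIO in .env


def split_into_chunks(token_ids: Sequence[int],
                      chunk_tokens: int = CHUNK_TOKENS) -> List[List[int]]:
    """Split a token sequence into consecutive chunks of at most chunk_tokens."""
    return [list(token_ids[i:i + chunk_tokens])
            for i in range(0, len(token_ids), chunk_tokens)]


def latent_token_count(chunk_len: int, ratio: int = COMPRESSION_RATIO) -> int:
    """Estimate the latent size of a chunk as ceil(chunk_len / ratio) (assumption)."""
    return -(-chunk_len // ratio)


tokens = list(range(5000))  # stand-in for the token IDs of one extracted source
chunks = split_into_chunks(tokens)
print([len(c) for c in chunks])                      # [2048, 2048, 904]
print([latent_token_count(len(c)) for c in chunks])  # [128, 128, 57]
```

A full 2048-token chunk maps to 128 latent tokens, which matches the figure in step 3.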
| ### Context Management | |
| When you query a notebook: | |
| 1. ContextManager fetches ALL latent tensors for the notebook | |
| 2. Calculates total token count | |
| 3. **If ≤ 32k tokens**: Stacks ALL tensors → **Whole-Notebook Reasoning** | |
| 4. **If > 32k tokens**: Ranks tensors by relevance, selects top-N → **Selective Retrieval** | |
| 5. Generates response using CLaRa with the selected context | |
| 6. Returns answer with source citations | |
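The branch in steps 2 to 4 might look roughly like the sketch below. The names `LatentSegment` and `select_context` are hypothetical, the `relevance` score is assumed to be computed elsewhere, and the greedy fill (take the highest-ranked segments that still fit) is one plausible reading of "selects top-N", not necessarily what ContextManager does.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

MAX_CONTEXT_TOKENS = 32768  # mirrors MAX_CONTEXT_TOKENS in .env


@dataclass
class LatentSegment:
    source_id: str
    token_count: int   # latent tokens in this segment
    relevance: float   # hypothetical query-relevance score, higher is better


def select_context(segments: Sequence[LatentSegment],
                   budget: int = MAX_CONTEXT_TOKENS) -> Tuple[List[LatentSegment], str]:
    """Return the segments to feed the model and the strategy that was used."""
    total = sum(s.token_count for s in segments)
    if total <= budget:
        # Everything fits: keep all segments in their original order.
        return list(segments), "whole_notebook"
    # Over budget: walk segments from most to least relevant, keeping those that fit.
    selected, used = [], 0
    for seg in sorted(segments, key=lambda s: s.relevance, reverse=True):
        if used + seg.token_count <= budget:
            selected.append(seg)
            used += seg.token_count
    return selected, "selective_retrieval"


demo = [LatentSegment("a", 128, 0.9),
        LatentSegment("b", 128, 0.2),
        LatentSegment("c", 128, 0.5)]
picked, mode = select_context(demo, budget=256)
print(mode, [s.source_id for s in picked])  # selective_retrieval ['a', 'c']
```

The returned `source_id` values are what a citation list for the response could be built from.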
| ## API Endpoints | |
| ### Notebooks | |
| - `POST /notebooks/` - Create notebook | |
| - `GET /notebooks/` - List notebooks | |
| - `GET /notebooks/{id}` - Get notebook details | |
| - `GET /notebooks/{id}/stats` - Get context usage stats | |
| - `PATCH /notebooks/{id}` - Update notebook | |
| - `DELETE /notebooks/{id}` - Delete notebook | |
| ### Sources | |
| - `POST /sources/notebooks/{id}/sources/upload` - Upload PDF | |
| - `POST /sources/notebooks/{id}/sources/url` - Add URL | |
| - `POST /sources/notebooks/{id}/sources/text` - Add text | |
| - `GET /sources/notebooks/{id}/sources` - List sources | |
| - `DELETE /sources/{id}` - Delete source | |
| ### Chat | |
| - `POST /chat/notebooks/{id}/chat` - Query notebook | |
| - `GET /chat/notebooks/{id}/messages` - Get chat history | |
| - `DELETE /chat/notebooks/{id}/messages` - Clear chat history | |
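As a rough illustration of how a client could address these endpoints, the sketch below builds (but does not send) two requests with the standard library. The paths come from the list above; the JSON field names (`name`, `description`, `query`) and the placeholder notebook ID are assumptions, so check the generated schema at http://localhost:8000/docs before relying on them.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # default API_PORT from .env


def build_json_post(path: str, payload: dict) -> request.Request:
    """Build a JSON POST request for the backend without sending it."""
    return request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Field names below are guesses at the request schema, not confirmed by the backend.
create_req = build_json_post(
    "/notebooks/",
    {"name": "Research", "description": "Papers on compression"},
)

notebook_id = "00000000-0000-0000-0000-000000000000"  # placeholder UUID
chat_req = build_json_post(
    f"/chat/notebooks/{notebook_id}/chat",
    {"query": "What do these sources agree on?"},
)

print(create_req.get_method(), create_req.full_url)
print(chat_req.get_method(), chat_req.full_url)
```

To actually send a request once the backend is running, pass it to `request.urlopen(...)` and decode the JSON response body.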
| ## Database Schema | |
| ```sql | |
| notebooks | |
| ├── id (UUID) | |
| ├── name | |
| ├── description | |
| ├── created_at | |
| └── updated_at | |
| sources | |
| ├── id (UUID) | |
| ├── notebook_id (FK) | |
| ├── source_type (pdf|url|text) | |
| ├── filename | |
| ├── url | |
| ├── content_hash | |
| └── metadata (JSONB) | |
| latent_tensors | |
| ├── id (UUID) | |
| ├── source_id (FK) | |
| ├── tensor_path | |
| ├── segment_index | |
| ├── token_count | |
| └── metadata (JSONB) | |
| chat_messages | |
| ├── id (UUID) | |
| ├── notebook_id (FK) | |
| ├── role (user|assistant) | |
| ├── content | |
| └── sources_used (JSONB) | |
| ``` | |
| ## Configuration | |
| Edit `.env` to configure: | |
| ```env | |
| # Database | |
| POSTGRES_USER=antigravity | |
| POSTGRES_PASSWORD=antigravity123 | |
| POSTGRES_DB=antigravity_db | |
| # CLaRa Model | |
| MODEL_NAME=apple/CLaRa-7B-Instruct | |
| DEVICE=cuda # or cpu | |
| MAX_CONTEXT_TOKENS=32768 | |
| COMPRESSION_RATIO=16 | |
| # Storage | |
| LATENT_TENSOR_DIR=./data/latent_tensors | |
| # API | |
| API_PORT=8000 | |
| ``` | |
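One way the backend could read these values is sketched below. It assumes the variables are already present in the process environment (for example exported by the shell or a dotenv loader); the `Settings` class and `load_settings` function are hypothetical names, and the defaults simply repeat the values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    model_name: str
    device: str
    max_context_tokens: int
    compression_ratio: int
    latent_tensor_dir: str
    api_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from environment variables, falling back to the README defaults."""
    return Settings(
        model_name=env.get("MODEL_NAME", "apple/CLaRa-7B-Instruct"),
        device=env.get("DEVICE", "cuda"),
        max_context_tokens=int(env.get("MAX_CONTEXT_TOKENS", "32768")),
        compression_ratio=int(env.get("COMPRESSION_RATIO", "16")),
        latent_tensor_dir=env.get("LATENT_TENSOR_DIR", "./data/latent_tensors"),
        api_port=int(env.get("API_PORT", "8000")),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
settings = load_settings({"DEVICE": "cpu", "API_PORT": "9000"})
print(settings.device, settings.api_port, settings.max_context_tokens)  # cpu 9000 32768
```

Converting the numeric values to `int` at load time means a malformed entry fails at startup rather than in the middle of a query.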
| ## Performance | |
| - **Ingestion**: ~30s for 50-page PDF | |
| - **Query Response**: ~10s for full notebook | |
| - **Capacity**: 10-20 average-sized books per notebook | |
| ## Technical Details | |
| ### Why CLaRa? | |
| CLaRa (Continuous Latent Reasoning) uses latent compression to represent text in a much smaller space, enabling: | |
| - 16x compression ratio | |
| - Preservation of semantic information | |
| - Cross-document reasoning | |
| ### Context Budget | |
| - **Standard**: 32,768 tokens (latent space) | |
| - **Equivalent to**: ~524k original text tokens (32,768 × 16 with 16x compression) | |
| - **Example**: Can fit 10-20 full books simultaneously | |
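The budget figures above follow from simple arithmetic, worked through below. The constants come from this README; the `books_that_fit` helper and the per-book token counts passed to it are illustrative assumptions, since the number of books that fit depends entirely on how long each book is.

```python
MAX_CONTEXT_TOKENS = 32_768  # latent-space budget
COMPRESSION_RATIO = 16
CHUNK_TOKENS = 2_048         # chunk size used at ingestion

# Original-text tokens representable within the latent budget.
original_tokens = MAX_CONTEXT_TOKENS * COMPRESSION_RATIO
# Latent tokens produced per full chunk, and how many full chunks fit.
latent_per_chunk = CHUNK_TOKENS // COMPRESSION_RATIO
max_chunks = MAX_CONTEXT_TOKENS // latent_per_chunk


def books_that_fit(tokens_per_book: int) -> int:
    """Whole books that fit in the budget for an assumed book length in tokens."""
    return original_tokens // tokens_per_book


print(original_tokens)          # 524288
print(latent_per_chunk)         # 128
print(max_chunks)               # 256
print(books_that_fit(50_000))   # 10
print(books_that_fit(100_000))  # 5
```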
| ## Contributing | |
| Contributions welcome! Please open an issue or PR. | |
| ## License | |
| MIT License - see LICENSE file | |
| ## Acknowledgments | |
| - Apple for CLaRa-7B-Instruct | |
| - Google for NotebookLM inspiration | |
| - HuggingFace for model hosting | |
| --- | |
| **Built with ❤️ by the Antigravity Team** | |