	# 🚀 Antigravity Notebook - Quick Start Guide

	Get up and running with Antigravity Notebook in 5 minutes!

	## Step 1: Prerequisites

	Ensure you have:
	- ✅ Python 3.9 or higher
	- ✅ Docker & Docker Compose
	- ✅ CUDA GPU (recommended, 16GB+ VRAM) or CPU (slower)
	- ✅ ~20GB free disk space (for model + data)
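
The checklist above can be verified with a short script before installing anything. This is a minimal sketch, not part of the project: the function names are hypothetical, the thresholds are copied from the list above, and it only checks that a `docker` executable is on `PATH`, not that the daemon is running.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)   # minimum version from the prerequisites list
MIN_FREE_GB = 20      # approximate disk requirement from the prerequisites list


def check_python(version_info=sys.version_info):
    """Return True if the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def free_disk_gb(path="."):
    """Return free disk space at `path` in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def preflight(path="."):
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    if not check_python():
        problems.append("Python 3.9 or higher is required")
    if free_disk_gb(path) < MIN_FREE_GB:
        problems.append(f"Less than {MIN_FREE_GB} GB free at {path!r}")
    if shutil.which("docker") is None:
        problems.append("docker was not found on PATH")
    return problems
```

Calling `preflight()` from the directory where you plan to clone the repository returns a list of problems to fix, or an empty list if the machine looks ready.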

	## Step 2: Installation

	### Clone & Install

	```bash
	# Clone the repository
	git clone <your-repo-url>
	cd antigravity-notebook

	# Install Python dependencies
	pip install -r requirements.txt
	```

	### Configure Environment

	```bash
	# Copy environment template
	cp .env.example .env

	# (Optional) Edit .env if you want custom settings
	# Default settings work out of the box!
	```

	## Step 3: Start PostgreSQL

	```bash
	# Start PostgreSQL with Docker Compose
	docker-compose up -d

	# Verify it's running
	docker ps
	```

	You should see a container named `antigravity_postgres` running.
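
A running container does not guarantee that the database port is reachable from your machine. As a rough check, the sketch below tries a plain TCP connection. It assumes the Compose file publishes PostgreSQL's default port 5432 on `localhost`, which this guide does not state, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host="localhost", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False
```

An open port only shows that something is listening there. The database initialization in the next step is the real test that the credentials in `.env` work.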

	## Step 4: Initialize Database

	```bash
	# Create database tables
	python -m backend.database
	```

	You should see: `✅ Database initialized successfully!`

	## Step 5: Start Backend API

	```bash
	# Start the FastAPI backend
	python -m backend.main
	```

	Wait for:
	```
	✅ CLaRa model loaded!
	✅ Antigravity Notebook is ready!
	📍 API: http://0.0.0.0:8000
	📚 Docs: http://0.0.0.0:8000/docs
	```

	Note: The first startup downloads the CLaRa-7B model (14 GB). Expect roughly 5 minutes on a fast connection and longer on a slow one.
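
Instead of watching the terminal, a script can poll the `/health` endpoint (listed in the Monitoring section of this guide) until the backend answers. This is a sketch under assumptions: it treats any HTTP 200 response as "ready", the timeout and interval defaults are illustrative, and `wait_for_health` is a hypothetical name.

```python
import http.client
import time
import urllib.request


def wait_for_health(url="http://localhost:8000/health", timeout=600.0, interval=5.0):
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # URLError and HTTPError are OSError subclasses: server not ready yet
            pass
        time.sleep(interval)
    return False
```

The function returns `False` rather than raising when the deadline passes, so a calling script can decide whether to retry or stop.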

	## Step 6: Start Frontend UI

	Open a new terminal and run:

	```bash
	# Start Streamlit UI
	streamlit run frontend/app_notebook.py
	```

	Your browser should open automatically at `http://localhost:8501`.

	## Step 7: Create Your First Notebook

	### In the Streamlit UI:

	1. Create a Notebook
	   - Click "Create New Notebook" in the sidebar
	   - Name it "My First Notebook"
	   - Add a description (optional)
	   - Click "Create Notebook"

	2. Add a Source
	   - Choose one of:
	     - PDF: Upload a PDF file
	     - URL: Paste a webpage URL (e.g., Wikipedia article)
	     - Text: Paste some text content
	   - Example URL to try: `https://en.wikipedia.org/wiki/Artificial_intelligence`

	3. Wait for Processing
	   - The source will be compressed into latent tensors
	   - PDF (50 pages): ~30 seconds
	   - URL: ~20 seconds
	   - Text: ~10 seconds

	4. Ask a Question
	   - Type a query in the chat box
	   - Example: "What are the main points discussed in this document?"
	   - Press Enter
	   - Wait ~10 seconds for the response

	5. View Sources
	   - Click "Sources" under the response to see which sources were cited
	   - Check the Memory Usage gauge to see context utilization

	## 🎉 You're Done!

	You now have a working NotebookLM clone!

	## 📚 Next Steps

	### Try These Examples:

	#### Example 1: Research Assistant
	1. Create a notebook called "AI Research"
	2. Add these Wikipedia URLs:
	   - https://en.wikipedia.org/wiki/Artificial_intelligence
	   - https://en.wikipedia.org/wiki/Machine_learning
	   - https://en.wikipedia.org/wiki/Deep_learning
	3. Ask: "Compare and contrast AI, ML, and Deep Learning"

	#### Example 2: Document Analysis
	1. Create a notebook called "Company Docs"
	2. Upload 3-5 PDF reports or documents
	3. Ask: "Summarize the key findings across all documents"

	#### Example 3: Web Research
	1. Create a notebook called "Topic Research"
	2. Add 5-10 URLs about a topic you're interested in
	3. Ask questions that require synthesizing information across sources

	## 🛠️ Common Issues

	### Issue: "Database connection failed"
	Solution: Ensure PostgreSQL is running
	```bash
	docker-compose up -d
	docker ps # Check if container is running
	```

	### Issue: "CUDA out of memory"
	Solution: Switch to CPU mode by setting `DEVICE` in `.env`, then restart the backend:
	```env
	DEVICE=cpu
	```

	### Issue: "Model download is slow"
	Solution: Be patient! CLaRa-7B is 14GB. It only downloads once.
	Check progress at: `./model_cache/`
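
To turn "check progress" into a number, you can total the size of the cache directory and compare it with the 14 GB quoted above. This is a rough sketch: the helper names are hypothetical, and it assumes the download writes its files under `./model_cache/` as they arrive. Symlinks are skipped so that cache layouts that link to the same file twice are not double counted.

```python
import os

EXPECTED_GB = 14  # approximate model size quoted in this guide


def dir_size_gb(path):
    """Sum the sizes of all regular files under `path`, in gigabytes (10**9 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue  # skip symlinks to avoid double counting
            try:
                total += os.path.getsize(file_path)
            except OSError:
                pass  # a temporary file may vanish while the download is running
    return total / 1e9


def download_progress(path="./model_cache", expected_gb=EXPECTED_GB):
    """Return the downloaded fraction of the expected size, capped at 1.0."""
    return min(dir_size_gb(path) / expected_gb, 1.0)
```

Because the 14 GB figure is approximate, treat the result as an estimate. A missing directory simply reports `0.0`.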

	### Issue: "PDF extraction failed"
	Solution: Ensure PDF has extractable text (not scanned images)

	### Issue: "URL scraping failed"
	Solution: Some websites block scraping. Try a different URL.

	## 🔧 Configuration Tips

	### For CPU-Only Systems

	Edit `.env`:
	```env
	DEVICE=cpu
	```

	### For Limited Memory

	Reduce context window in `.env`:
	```env
	# Half the default
	MAX_CONTEXT_TOKENS=16384
	```
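
If you prefer to script these `.env` changes rather than edit the file by hand, a small helper can replace an existing assignment or append a new one. This is a sketch with a hypothetical helper name (`set_env_value`). It handles plain `KEY=value` lines only and leaves comments and other keys untouched.

```python
from pathlib import Path


def set_env_value(env_path, key, value):
    """Set KEY=value in a .env file, replacing an existing assignment or appending one."""
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        # Match only active assignments; commented-out lines start with '#'
        if line.strip().startswith(f"{key}="):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n")
```

For example, `set_env_value(".env", "DEVICE", "cpu")` and `set_env_value(".env", "MAX_CONTEXT_TOKENS", 16384)` apply the two settings shown above. Restart the backend afterwards so it picks up the new values.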

	### For Production Use

	1. Change database password:
	```env
	POSTGRES_PASSWORD=<secure-password>
	```

	2. Set secret key:
	```env
	SECRET_KEY=<random-secure-key>
	```

	3. Configure CORS in `backend/main.py`:
	```python
	allow_origins=["https://your-frontend-domain.com"]
	```
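
For the password and secret key in steps 1 and 2, Python's standard `secrets` module is one way to generate suitable random values. The sketch below is only a suggestion: the function names are hypothetical and the byte lengths are illustrative choices, not project requirements.

```python
import secrets


def generate_secret(num_bytes=32):
    """Return a URL-safe random string built from `num_bytes` bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


def production_env_lines():
    """Build the two .env assignments from the steps above with fresh random values."""
    return [
        f"POSTGRES_PASSWORD={generate_secret(24)}",
        f"SECRET_KEY={generate_secret(32)}",
    ]
```

Printing the result of `production_env_lines()` gives two lines you can paste into `.env`. The generated values contain only letters, digits, `-`, and `_`, so they need no quoting there.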

	## 📊 Monitoring

	### Check API Health
	```bash
	curl http://localhost:8000/health
	```

	### View Storage Stats
	```bash
	curl http://localhost:8000/stats
	```

	### API Documentation
	Open: http://localhost:8000/docs

	Try the interactive API explorer!

	## 🧪 Testing the API Directly

	### Create a Notebook
	```bash
	curl -X POST http://localhost:8000/notebooks/ \
	-H "Content-Type: application/json" \
	-d '{"name": "Test Notebook", "description": "API test"}'
	```

	### List Notebooks
	```bash
	curl http://localhost:8000/notebooks/
	```

	### Add Text Source
	```bash
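	# Replace <NOTEBOOK_ID> with your notebook's ID. The quotes stop the shell from treating < and > as redirections.
	curl -X POST "http://localhost:8000/sources/notebooks/<NOTEBOOK_ID>/sources/text" \
	  -H "Content-Type: application/json" \
	  -d '{
	    "title": "Test Document",
	    "content": "This is a test document with some content to analyze."
	  }'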
	```

	### Query Notebook
	```bash
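	# Replace <NOTEBOOK_ID> with your notebook's ID. The quotes stop the shell from treating < and > as redirections.
	curl -X POST "http://localhost:8000/chat/notebooks/<NOTEBOOK_ID>/chat" \
	  -H "Content-Type: application/json" \
	  -d '{"query": "What is this document about?"}'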
	```
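
The same four calls can be made from Python with only the standard library. This is a minimal sketch built from the curl examples above: the paths and JSON fields are taken from those examples, while the function names are hypothetical. This guide does not document the response bodies, so the helpers return the decoded JSON as-is.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_request(path, payload=None, base_url=BASE_URL):
    """Build a GET request, or a JSON POST request when `payload` is given."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(base_url + path, data=data, headers=headers, method=method)


def call(path, payload=None, base_url=BASE_URL):
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload, base_url)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))


def create_notebook(name, description=""):
    return call("/notebooks/", {"name": name, "description": description})


def list_notebooks():
    return call("/notebooks/")


def add_text_source(notebook_id, title, content):
    path = f"/sources/notebooks/{notebook_id}/sources/text"
    return call(path, {"title": title, "content": content})


def ask(notebook_id, query):
    return call(f"/chat/notebooks/{notebook_id}/chat", {"query": query})
```

A typical session calls `create_notebook(...)`, reads the notebook ID from the returned data, then passes it to `add_text_source(...)` and `ask(...)`. Print the first response to see which field holds the ID, since its name is not stated here.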

	## 🎓 Learning More

	- Read the Plan: See `ANTIGRAVITY_PLAN.md` for architecture details
	- Explore the Code: Check out:
	  - `backend/services/context_manager.py` - The "brain"
	  - `backend/models/clara.py` - CLaRa wrapper
	  - `backend/services/ingestion.py` - Multi-modal processing

	## 💡 Pro Tips

	1. Start Small: Begin with 1-2 sources to understand the system
	2. Check Memory Usage: Watch the gauge to see when you hit limits
	3. Use Descriptive Titles: Makes it easier to understand citations
	4. Mix Source Types: PDFs + URLs + Text work great together
	5. Ask Synthesis Questions: The AI excels at combining information

	## 🆘 Need Help?

	- Documentation: See `README.md`
	- API Docs: http://localhost:8000/docs
	- Issues: Open a GitHub issue
	- Logs: Check terminal output for debugging

	---

	Happy NotebookLM-ing! 🚀