---
sidebar_position: 3
displayed_sidebar: developersSidebar
---
# Quick Start Guide
## Installation
### Option 1: Automated Installation (Recommended)
Run the installation script:
```bash
chmod +x install.sh
./install.sh
```
This will:
- Create a virtual environment
- Install all dependencies
- Create a `.env` file from the template
- Set up the project structure
### Option 2: Manual Installation
```bash
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate # On Windows: venv\Scripts\activate
# Upgrade pip
pip install --upgrade pip
# Install dependencies
pip install -r requirements.txt
# Create .env file
cp .env.example .env
```
### Option 3: Using Makefile
```bash
make install
```
## Configuration
Edit the `.env` file and add your API keys:
```bash
# Required
OPENAI_API_KEY=your_openai_api_key_here
# For production (Databricks)
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=your_databricks_token_here
DATABRICKS_WAREHOUSE_ID=your_warehouse_id_here
# Optional: HuggingFace (for publishing datasets)
# The token needs Write permissions
HUGGINGFACE_TOKEN=hf_your_write_token_here
# The organization is optional
HF_ORGANIZATION=YourOrgName
```
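Before starting the server, it can help to confirm that the required key is set and is no longer a placeholder. The following is a minimal sketch, not part of the project: `parse_env` and `missing_keys` are hypothetical helpers, the parser handles only simple `KEY=value` lines, and treating values that start with `your_` as placeholders is an assumption based on the template above.

```python
from pathlib import Path

# Only OPENAI_API_KEY is marked "Required" in the template above.
REQUIRED_KEYS = ["OPENAI_API_KEY"]


def parse_env(text):
    """Parse simple KEY=value lines; skip blanks and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value # note".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key for key in required
        if not values.get(key) or values[key].startswith("your_")
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; run: cp .env.example .env")
    else:
        missing = missing_keys(parse_env(env_path.read_text()))
        print("Missing or placeholder keys:", missing or "none")
```

Run it from the project root so that the relative `.env` path resolves.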
## Running the System
### Start the API Server
```bash
# Using the virtual environment
source venv/bin/activate
python main.py serve
# Or using make
make run
```
Visit http://localhost:8000 for the API and http://localhost:8000/docs for interactive documentation.
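To check from a script that the server is answering, a small standard-library probe is enough. This is a sketch under the assumption that the server listens on the default address shown above; `is_reachable` is a hypothetical helper, not a project function.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=2.0):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    for url in ("http://localhost:8000", "http://localhost:8000/docs"):
        print(url, "->", "up" if is_reachable(url) else "not reachable")
```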
### Run Example Workflow
```bash
# Activate venv first
source venv/bin/activate
# Run example
python examples/example_workflow.py
# Or using make
make example
```
### Generate Heatmap
```bash
# Activate venv first
source venv/bin/activate
# Generate heatmap
python main.py generate-heatmap --output heatmap.html
# Or using make
make heatmap
```
## Docker Deployment
```bash
# Start all services
make docker-up
# Stop all services
make docker-down
```
This starts:
- API server on http://localhost:8000
- Qdrant vector DB on http://localhost:6333
- Jupyter notebook on http://localhost:8888
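After `make docker-up`, a quick TCP probe shows which of the three services are accepting connections. This is a sketch that assumes the ports listed above are mapped to `localhost` unchanged; `port_open` and `SERVICES` are illustrative names, not part of the project.

```python
import socket

# Ports as listed above; adjust if your setup maps them differently.
SERVICES = {"API server": 8000, "Qdrant": 6333, "Jupyter": 8888}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "listening" if port_open("localhost", port) else "not reachable"
        print(f"{name:<12} localhost:{port}  {state}")
```

An open port only shows that something is listening; it does not confirm that the service behind it is healthy.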
## Common Commands
```bash
# Activate virtual environment (required for all commands)
source venv/bin/activate
# Start API server
python main.py serve
# Run with auto-reload (development)
python main.py serve --reload
# Check system status
python main.py status
# Run tests
pytest
# Or using make
make test
```
## Troubleshooting
### "ModuleNotFoundError: No module named 'click'"
The dependencies are installed inside the virtual environment, so activate it first:
```bash
source venv/bin/activate
```
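If the error persists, it is worth confirming which interpreter is actually running. The sketch below relies on the standard `sys.prefix` and `sys.base_prefix` attributes, which differ inside an environment created with `python3 -m venv`; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix points at the interpreter the venv was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```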
### "Tesseract binary not found" or OCR errors
The `install.sh` script installs `tesseract-ocr` automatically on Linux (via apt) and macOS (via brew). If that step failed, or you're on a different system, install it manually:
**Linux (Debian/Ubuntu):**
```bash
sudo apt-get update && sudo apt-get install -y tesseract-ocr
```
**macOS:**
```bash
brew install tesseract
```
**Verify installation:**
```bash
tesseract --version
```
OCR is optional but enables text extraction from scanned PDFs and images.
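The same check can be made from Python, for example before enabling OCR in a script. This is a sketch that assumes the binary is named `tesseract` and is expected on `PATH`; `find_tesseract` is a hypothetical helper, not a project function.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the path of the tesseract binary on PATH, or None."""
    return shutil.which("tesseract")


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH; OCR features will be unavailable")
    else:
        result = subprocess.run([path, "--version"], capture_output=True, text=True)
        # Fall back to stderr in case the version banner is printed there.
        lines = (result.stdout or result.stderr).splitlines()
        print(lines[0] if lines else path)
```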
### "error: externally-managed-environment"
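This error appears when `pip install` targets a system Python that the operating system marks as externally managed (PEP 668). Install into the virtual environment instead: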
```bash
# Create venv if not exists
python3 -m venv venv
# Activate it
source venv/bin/activate
# Now install
pip install -r requirements.txt
```
### Permission denied when running install.sh
The script is not marked as executable. Add the execute bit, then run it:
```bash
chmod +x install.sh
./install.sh
```
## Next Steps
1. Configure your `.env` file with API keys
2. Run the example workflow: `make example`
3. Start the API server: `make run`
4. Check out the interactive docs: http://localhost:8000/docs
5. Generate a heatmap: `make heatmap`
For more details, see the main [README.md](README.md).