# Setup Guide

## Prerequisites

### Required Software
- **Python**: 3.8 or higher
- **pip**: Latest version (`python -m pip install --upgrade pip`)
- **Git**: For version control (optional)
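
To confirm the interpreter meets the minimum above before installing anything, a small check like the following can help. This is a sketch, not part of the project; the helper name `meets_minimum` is hypothetical.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this guide


def meets_minimum(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) is >= `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); patch releases do not matter here.
    return tuple(version[:2]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum():
    print(f"Python {current} is new enough")
else:
    print(f"Python {current} is too old; 3.8 or higher is required")
```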

### Required API Keys

#### 1. Hugging Face API Token
**Purpose**: Image generation via SDXL-Lightning

**Get your token:**
1. Create account at [huggingface.co](https://huggingface.co)
2. Go to Settings → Access Tokens
3. Create new token with "Read" permissions
4. Copy the token (starts with `hf_...`)

#### 2. Smithsonian API Key
**Purpose**: Museum artifact ingestion

**Get your key:**
1. Visit [Smithsonian Open Access](https://api.si.edu/openaccess)
2. Request API key (free, instant approval)
3. Copy the API key from email

---

## Installation Steps

### 1. Clone or Download Project
```bash
# Change into the project directory (adjust the path to your own checkout)
cd c:\Users\Administrador\cora
```

### 2. Create Virtual Environment (Recommended)
```bash
python -m venv venv

# Activate
# Windows
venv\Scripts\activate

# Linux/Mac
source venv/bin/activate
```

### 3. Install Dependencies
```bash
pip install -r requirements.txt
```

**Expected install time**: 5-10 minutes (includes PyTorch)

### 4. Configure Environment Variables
Create a `.env` file in the project root:

```bash
# .env
HF_API_TOKEN=hf_your_hugging_face_token_here
SI_API_KEY=your_smithsonian_key_here
```

**Important**: Never commit `.env` to version control!
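
If you want to sanity-check the file before starting the servers, the sketch below parses `KEY=VALUE` lines with the standard library only and reports which keys are missing. It is an illustration: the project itself may load `.env` differently (for example through a library such as python-dotenv), and the helper names `parse_env`, `missing_keys`, and `load_env_file` are hypothetical.

```python
import os

# HF_API_TOKEN is the only variable this guide marks as required.
REQUIRED = ("HF_API_TOKEN",)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def load_env_file(path=".env"):
    """Read and parse the file at `path`; return {} if it does not exist."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return parse_env(handle.read())


missing = missing_keys(load_env_file())
print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run it from the project root so that the relative `.env` path resolves.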

### 5. Verify Installation
```bash
python tests/verify_system.py
```

Expected output:
```
✅ CoraVision initialized
✅ CoraMemory initialized
✅ HF_API_TOKEN found
✅ System ready
```

---

## First Run

### Option A: Full UI (Testing)
```bash
# Terminal 1: Start API
python api.py
# Wait for: "Uvicorn running on http://0.0.0.0:8000"

# Terminal 2: Start UI
python ui.py
# Wait for: "Running on local URL: http://127.0.0.1:7861"

# Open browser to http://127.0.0.1:7861
```

### Option B: Etymology API (Integration)
```bash
python etymology_api.py
# API ready at http://localhost:8000
# Same port as api.py in Option A, so stop that server first if it is running
```

---

## Populate Archive (Optional but Recommended)

### Load Museum Artifacts
```bash
# Load Roman artifacts from Met Museum
python loaders/met_loader.py

# Load from Smithsonian
python loaders/smithsonian_loader.py
```

**What this does:**
- Downloads historical images from museum APIs
- Generates CLIP embeddings
- Indexes into ChromaDB (`./archive_db`)
- Enables RAG fallback for generation failures

**Time**: ~2-3 minutes per loader
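
To give a feel for what embedding-based indexing makes possible, here is a toy sketch of nearest-match retrieval. It uses hand-written three-dimensional vectors and cosine similarity in plain Python; the real loaders use CLIP embeddings stored in ChromaDB, and the similarity metric the project actually uses is not stated in this guide. The artifact names and vectors are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_match(query, index):
    """Return the name of the indexed vector most similar to `query`."""
    return max(index, key=lambda name: cosine(query, index[name]))


# Hypothetical index: artifact name -> toy embedding.
toy_index = {
    "met_roman_helmet": [0.9, 0.1, 0.0],
    "si_viking_axe": [0.1, 0.9, 0.2],
}

# A query vector that points mostly in the "Roman" direction.
print(top_match([0.8, 0.2, 0.1], toy_index))
```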

### Custom Loading
Create your own loader script:

```python
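# Assumes this script sits next to met_loader.py in loaders/;
# from the project root, import from `loaders.met_loader` instead.
from met_loader import MetLoader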

loader = MetLoader()
loader.search_and_index("Viking weapons", limit=5)
loader.search_and_index("Medieval manuscripts", limit=5)
```

---

## Troubleshooting

### Issue: `ModuleNotFoundError`
**Solution**: Make sure the virtual environment is activated, then reinstall the dependencies:
```bash
pip install -r requirements.txt
```

### Issue: `HF_API_TOKEN not found`
**Solution**: Check that a `.env` file exists in the project root and contains a valid `HF_API_TOKEN` entry

### Issue: Port 8000 already in use
**Solution**: Find the process that is holding the port and stop it
```bash
# Windows
netstat -ano | findstr :8000
taskkill /PID <PID> /F

# Linux/Mac
lsof -ti:8000 | xargs kill -9
```
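
As a platform-independent way to confirm whether the port is occupied before or after killing a process, the following sketch tries a TCP connection with the standard `socket` module. The helper name `port_in_use` is hypothetical, and it only checks the loopback address by default.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


print("Port 8000 in use:", port_in_use(8000))
```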

### Issue: API returns 402 Payment Required
**Solution**: This is expected with HF free tier. The RAG fallback will activate:
1. Ensure archive is populated (`python loaders/met_loader.py`)
2. System will automatically serve museum artifacts
3. No action needed from you
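
The fallback behaviour described above can be pictured roughly as follows. This is a conceptual sketch only: `PaymentRequired`, `generate_or_fallback`, and the stub functions are hypothetical stand-ins, not the project's actual classes or API.

```python
class PaymentRequired(Exception):
    """Hypothetical error standing in for an HTTP 402 from the image API."""


def generate_or_fallback(prompt, generate, search_archive):
    """Try generation first; on a 402-style failure, serve an archive match."""
    try:
        return {"source": "generated", "image": generate(prompt)}
    except PaymentRequired:
        matches = search_archive(prompt)
        if not matches:
            # Nothing indexed yet, so there is nothing to fall back to.
            raise RuntimeError("Generation failed and the archive is empty; run a loader first")
        return {"source": "archive", "image": matches[0]}


# Stubs that simulate a free-tier 402 and a populated archive.
def failing_generate(prompt):
    raise PaymentRequired("402 Payment Required")


def toy_search(prompt):
    return ["archive_images/met_12345_abc.jpg"]


result = generate_or_fallback("Roman soldier", failing_generate, toy_search)
print(result["source"])
```

The empty-archive branch is why populating the archive is listed as the first step.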

### Issue: ChromaDB errors
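**Solution**: Delete the database directory and let `CoraMemory` recreate it. This removes everything that was indexed, so rerun the loaders afterwards.
```bash
# Windows
rmdir /s /q archive_db

# Linux/Mac
rm -rf archive_db

# Recreate an empty database
python -c "from cora_memory import CoraMemory; CoraMemory()"
```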

### Issue: CUDA out of memory
**Solution**: Vision models run on CPU by default. If you switched them to GPU, revert the device setting:
```python
# In cora_vision.py, ensure:
device = "cpu"  # Not "cuda"
```

---

## Directory Structure After Setup

```
cora/
├── .env                    # Your API keys (DO NOT COMMIT)
├── .gitignore
├── requirements.txt
│
├── venv/                   # Virtual environment (if created)
│
├── api.py
├── etymology_api.py
├── ui.py
│
├── cora_curator.py
├── cora_engine.py
├── cora_memory.py
├── cora_vision.py
│
├── loaders/
│   ├── smithsonian_loader.py
│   └── met_loader.py
│
├── scripts/
│   └── load_roman_artifacts.py
│
├── tests/
│   ├── test_etymology_api.py
│   ├── verify_system.py
│   └── ...
│
├── archive_db/             # ChromaDB storage (auto-created)
│   └── chroma.sqlite3
│
├── archive_images/         # Downloaded museum artifacts
│   ├── met_12345_abc.jpg
│   └── si_67890_def.jpg
│
└── docs/
    ├── README.md
    ├── ARCHITECTURE.md
    ├── SETUP.md (this file)
    └── README_ETYMOLOGY_API.md
```

---

## Next Steps

1. **Test Generation**: Try the UI → "Generate" tab → Enter "Roman soldier"
2. **Test Archive**: UI → "Archive" tab → Search "romans"
3. **Test API**: Run `python tests/test_etymology_api.py`
4. **Integrate**: See `docs/README_ETYMOLOGY_API.md` for etymology app integration

---

## Environment Variables Reference

| Variable | Required | Purpose | Example |
|----------|----------|---------|---------|
| `HF_API_TOKEN` | Yes | Hugging Face API access | `hf_abcd...xyz` |
| `SI_API_KEY` | Optional* | Smithsonian data ingestion | `abc123...` |
| `PORT` | No | Override API port (default 8000) | `8080` |

*Required only for museum data ingestion, not for generation.
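
For reference, reading an optional override such as `PORT` with a default usually looks like the sketch below. How `api.py` actually reads the variable is not shown in this guide, so treat the helper name `resolve_port` and the validation rules as assumptions.

```python
import os

DEFAULT_PORT = 8000  # default stated in the table above


def resolve_port(environ=os.environ):
    """Return the port from `PORT`, or the default if it is unset or blank."""
    raw = environ.get("PORT", "").strip()
    if not raw:
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT must be between 1 and 65535, got {port}")
    return port


# Example with an explicit mapping instead of the real environment.
print(resolve_port({"PORT": "8080"}))
```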

---

## Updating

```bash
# Pull latest changes (if using Git)
git pull

# Update dependencies
pip install -r requirements.txt --upgrade

# Restart servers
```

---

## Uninstall

```bash
# Deactivate virtual environment
deactivate

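# Remove the project directory (adjust the path to your own checkout)
# Windows
rmdir /s /q c:\Users\Administrador\cora
# Linux/Mac
rm -rf /path/to/cora

# Or keep the code and delete only the virtual environment and archive database
# (on Windows, use `rmdir /s /q` for each directory)
rm -rf venv archive_db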
```