# Setup Guide

## Prerequisites

### Required Software

- **Python**: 3.8 or higher
- **pip**: Latest version (`python -m pip install --upgrade pip`)
- **Git**: For version control (optional)

### Required API Keys

#### 1. Hugging Face API Token

**Purpose**: Image generation via SDXL-Lightning

**Get your token:**

1. Create an account at [huggingface.co](https://huggingface.co)
2. Go to Settings → Access Tokens
3. Create a new token with "Read" permissions
4. Copy the token (starts with `hf_...`)

#### 2. Smithsonian API Key

**Purpose**: Museum artifact ingestion

**Get your key:**

1. Visit [Smithsonian Open Access](https://api.si.edu/openaccess)
2. Request an API key (free, instant approval)
3. Copy the API key from the email

---

## Installation Steps

### 1. Clone or Download Project

```bash
cd c:\Users\Administrador\cora
```

### 2. Create Virtual Environment (Recommended)

```bash
python -m venv venv

# Activate
# Windows
venv\Scripts\activate

# Linux/Mac
source venv/bin/activate
```

### 3. Install Dependencies

```bash
pip install -r requirements.txt
```

**Expected install time**: 5-10 minutes (includes PyTorch)

### 4. Configure Environment Variables

Create a `.env` file in the project root:

```bash
# .env
HF_API_TOKEN=hf_your_hugging_face_token_here
SI_API_KEY=your_smithsonian_key_here
```

**Important**: Never commit `.env` to version control!

### 5. Verify Installation

```bash
python tests/verify_system.py
```

Expected output:

```
✅ CoraVision initialized
✅ CoraMemory initialized
✅ HF_API_TOKEN found
✅ System ready
```

---

## First Run

### Option A: Full UI (Testing)

```bash
# Terminal 1: Start API
python api.py
# Wait for: "Uvicorn running on http://0.0.0.0:8000"

# Terminal 2: Start UI
python ui.py
# Wait for: "Running on local URL: http://127.0.0.1:7861"

# Open browser to http://127.0.0.1:7861
```

### Option B: Etymology API (Integration)

```bash
python etymology_api.py
# API ready at http://localhost:8000
```

---

## Populate Archive (Optional but Recommended)

### Load Museum Artifacts

```bash
# Load Roman artifacts from Met Museum
python loaders/met_loader.py

# Load from Smithsonian
python loaders/smithsonian_loader.py
```

**What this does:**

- Downloads historical images from museum APIs
- Generates CLIP embeddings
- Indexes into ChromaDB (`./archive_db`)
- Enables RAG fallback for generation failures

**Time**: ~2-3 minutes per loader

### Custom Loading

Create your own loader script:

```python
from met_loader import MetLoader

loader = MetLoader()
loader.search_and_index("Viking weapons", limit=5)
loader.search_and_index("Medieval manuscripts", limit=5)
```

---

## Troubleshooting

### Issue: `ModuleNotFoundError`

**Solution**: Ensure the virtual environment is activated and dependencies are installed

```bash
pip install -r requirements.txt
```

### Issue: `HF_API_TOKEN not found`

**Solution**: Check that the `.env` file exists in the project root and contains the correct token

### Issue: Port 8000 already in use

**Solution**: Find and kill the existing process

```bash
# Windows
netstat -ano | findstr :8000
taskkill /PID <PID> /F

# Linux/Mac
lsof -ti:8000 | xargs kill -9
```

### Issue: API returns 402 Payment Required

**Solution**: This is expected with the HF free tier. The RAG fallback will activate:

1. Ensure the archive is populated (`python loaders/met_loader.py`)
2. System will automatically serve museum artifacts
3. No action needed from you

### Issue: ChromaDB errors

**Solution**: Delete and recreate the database

```bash
rm -rf archive_db
python
>>> from cora_memory import CoraMemory
>>> mem = CoraMemory()  # Creates fresh DB
```

### Issue: CUDA out of memory

**Solution**: Vision models run on CPU by default. If you enabled GPU, switch back to CPU:

```python
# In cora_vision.py, ensure:
device = "cpu"  # Not "cuda"
```

---

## Directory Structure After Setup

```
cora/
├── .env                    # Your API keys (DO NOT COMMIT)
├── .gitignore
├── requirements.txt
│
├── venv/                   # Virtual environment (if created)
│
├── api.py
├── etymology_api.py
├── ui.py
│
├── cora_curator.py
├── cora_engine.py
├── cora_memory.py
├── cora_vision.py
│
├── loaders/
│   ├── smithsonian_loader.py
│   └── met_loader.py
│
├── scripts/
│   └── load_roman_artifacts.py
│
├── tests/
│   ├── test_etymology_api.py
│   ├── verify_system.py
│   └── ...
│
├── archive_db/             # ChromaDB storage (auto-created)
│   └── chroma.sqlite3
│
├── archive_images/         # Downloaded museum artifacts
│   ├── met_12345_abc.jpg
│   └── si_67890_def.jpg
│
└── docs/
    ├── README.md
    ├── ARCHITECTURE.md
    ├── SETUP.md (this file)
    └── README_ETYMOLOGY_API.md
```

---

## Next Steps

1. **Test Generation**: Try the UI → "Generate" tab → Enter "Roman soldier"
2. **Test Archive**: UI → "Archive" tab → Search "romans"
3. **Test API**: Run `python tests/test_etymology_api.py`
4. **Integrate**: See `docs/README_ETYMOLOGY_API.md` for etymology app integration

---

## Environment Variables Reference

| Variable | Required | Purpose | Example |
|----------|----------|---------|---------|
| `HF_API_TOKEN` | Yes | Hugging Face API access | `hf_abcd...xyz` |
| `SI_API_KEY` | Optional* | Smithsonian data ingestion | `abc123...` |
| `PORT` | No | Override API port (default 8000) | `8080` |

*Required only for museum data ingestion, not for generation.

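
The variables listed above can be checked before starting the servers. The following is a minimal sketch, not the project's actual loading code: `load_settings` is a hypothetical helper, and it assumes the values from `.env` are already present in the process environment (how the project reads `.env` is not shown in this guide).

```python
import os


def load_settings(env=None):
    """Collect the three documented variables from a mapping.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env

    # HF_API_TOKEN is the only variable marked as required.
    token = env.get("HF_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_API_TOKEN not found; add it to .env or export it")

    # SI_API_KEY is only needed for museum data ingestion.
    si_key = env.get("SI_API_KEY", "").strip() or None

    # PORT falls back to the documented default of 8000.
    port = int(env.get("PORT", "8000"))

    return {"hf_api_token": token, "si_api_key": si_key, "port": port}
```

Calling `load_settings()` with no arguments reads the real environment; calling it with a dictionary such as `{"HF_API_TOKEN": "hf_example", "PORT": "8080"}` returns the parsed settings with `port` set to `8080` and `si_api_key` set to `None`.
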
---

## Updating

```bash
# Pull latest changes (if using Git)
git pull

# Update dependencies
pip install -r requirements.txt --upgrade

# Restart servers
```

---

## Uninstall

```bash
# Deactivate virtual environment
deactivate

# Remove project directory
rm -rf c:\Users\Administrador\cora

# Or just delete the virtual environment and archive database
rm -rf venv archive_db
```
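
The `rm -rf` commands above need a Unix-style shell. For a Windows command prompt, the partial cleanup can be sketched in Python instead. This is an illustrative helper, not part of the project: `remove_generated_dirs` is a hypothetical name, and the directory names are taken from the directory structure shown earlier in this guide.

```python
import shutil
from pathlib import Path


def remove_generated_dirs(project_root, names=("venv", "archive_db")):
    """Delete the generated directories under project_root.

    Returns the names that were actually removed, in order.
    Directories that do not exist are skipped silently.
    """
    root = Path(project_root)
    removed = []
    for name in names:
        target = root / name
        if target.is_dir():
            shutil.rmtree(target)  # recursive delete, like `rm -rf`
            removed.append(name)
    return removed
```

Run it from a Python prompt in the project root with `remove_generated_dirs(".")`, after deactivating the virtual environment so that no files inside `venv` are in use.
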