# Setup Guide

## Prerequisites

### Required Software

- **Python**: 3.8 or higher
- **pip**: Latest version (`python -m pip install --upgrade pip`)
- **Git**: For version control (optional)
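The Python requirement above can be checked before installing anything. The following is a minimal sketch; the helper name `python_is_supported` is made up for illustration and is not part of the project.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this guide


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


# Report on the interpreter that is running this script
print("Python OK" if python_is_supported() else "Python 3.8 or higher is required")
```

Run it with the same `python` command you plan to use for the virtual environment, since several interpreters may be installed side by side.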
### Required API Keys

#### 1. Hugging Face API Token

**Purpose**: Image generation via SDXL-Lightning

**Get your token:**

1. Create account at [huggingface.co](https://huggingface.co)
2. Go to Settings → Access Tokens
3. Create new token with "Read" permissions
4. Copy the token (starts with `hf_...`)

#### 2. Smithsonian API Key

**Purpose**: Museum artifact ingestion

**Get your key:**

1. Visit [Smithsonian Open Access](https://api.si.edu/openaccess)
2. Request API key (free, instant approval)
3. Copy the API key from email

---
## Installation Steps

### 1. Clone or Download Project

```bash
# Change into the project directory (adjust the path for your machine)
cd c:\Users\Administrador\cora
```

### 2. Create Virtual Environment (Recommended)

```bash
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Linux/Mac)
source venv/bin/activate
```

### 3. Install Dependencies

```bash
pip install -r requirements.txt
```

**Expected install time**: 5-10 minutes (includes PyTorch)

### 4. Configure Environment Variables

Create a `.env` file in the project root:

```bash
# .env
HF_API_TOKEN=hf_your_hugging_face_token_here
SI_API_KEY=your_smithsonian_key_here
```

**Important**: Never commit `.env` to version control!
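Before running the verification step, it can help to confirm that the `.env` file parses and contains the required token. The sketch below uses only the standard library and is an assumption-laden pre-flight check, not the project's own loading code (which is not shown in this guide); the helper names `parse_env_text`, `missing_keys`, and `check_env_file` are hypothetical.

```python
from pathlib import Path

# HF_API_TOKEN is required; SI_API_KEY is only needed for museum ingestion
REQUIRED_KEYS = ("HF_API_TOKEN",)


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def check_env_file(path=".env"):
    """Return the missing required keys for the .env file at `path`."""
    env_path = Path(path)
    if not env_path.is_file():
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env_text(env_path.read_text(encoding="utf-8")))
```

Calling `check_env_file()` from the project root returns an empty list when `HF_API_TOKEN` is set, and `["HF_API_TOKEN"]` when the file or the token is missing.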
### 5. Verify Installation

```bash
python tests/verify_system.py
```

Expected output:

```
✅ CoraVision initialized
✅ CoraMemory initialized
✅ HF_API_TOKEN found
✅ System ready
```

---
## First Run

### Option A: Full UI (Testing)

```bash
# Terminal 1: Start API
python api.py
# Wait for: "Uvicorn running on http://0.0.0.0:8000"

# Terminal 2: Start UI
python ui.py
# Wait for: "Running on local URL: http://127.0.0.1:7861"

# Open browser to http://127.0.0.1:7861
```

### Option B: Etymology API (Integration)

```bash
python etymology_api.py
# API ready at http://localhost:8000
```
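Instead of watching the terminal for the "running" messages, a script can poll the port until the server accepts connections. This is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, and the timeout values are arbitrary.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 8000)` returns `True` once the API from Option A or Option B is listening. It only confirms that the port is open, not that the application behind it is healthy.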
---

## Populate Archive (Optional but Recommended)

### Load Museum Artifacts

```bash
# Load Roman artifacts from Met Museum
python loaders/met_loader.py

# Load from Smithsonian
python loaders/smithsonian_loader.py
```

**What this does:**

- Downloads historical images from museum APIs
- Generates CLIP embeddings
- Indexes into ChromaDB (`./archive_db`)
- Enables RAG fallback for generation failures

**Time**: ~2-3 minutes per loader
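To make the "embed, index, retrieve" idea concrete, here is a toy illustration of nearest-neighbour lookup by cosine similarity. It is not the project's code and does not use CLIP or ChromaDB: `ToyArchive` is a hypothetical in-memory stand-in, and the three-element vectors are made-up placeholders for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyArchive:
    """In-memory stand-in for a vector index (hypothetical, not ChromaDB)."""

    def __init__(self):
        self._items = []  # list of (artifact_id, embedding) pairs

    def add(self, artifact_id, embedding):
        self._items.append((artifact_id, list(embedding)))

    def query(self, embedding, top_k=1):
        """Return the top_k artifact ids ranked by cosine similarity."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(embedding, item[1]),
            reverse=True,
        )
        return [artifact_id for artifact_id, _ in ranked[:top_k]]


archive = ToyArchive()
archive.add("met_1", [1.0, 0.0, 0.0])  # placeholder embedding
archive.add("si_2", [0.0, 1.0, 0.0])   # placeholder embedding
print(archive.query([0.9, 0.1, 0.0]))  # closest item to the query vector
```

A real vector database does the same ranking over much larger collections, with persistence and approximate search; the sketch only shows why a populated archive can answer a query without calling the generation API.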
### Custom Loading

Create your own loader script (save it next to `met_loader.py` in `loaders/` so the import resolves):

```python
from met_loader import MetLoader

loader = MetLoader()
loader.search_and_index("Viking weapons", limit=5)
loader.search_and_index("Medieval manuscripts", limit=5)
```

---
## Troubleshooting

### Issue: `ModuleNotFoundError`

**Solution**: Ensure virtual environment is activated and dependencies installed

```bash
pip install -r requirements.txt
```

### Issue: `HF_API_TOKEN not found`

**Solution**: Check `.env` file exists in project root with correct token

### Issue: Port 8000 already in use

**Solution**: Find and kill existing process

```bash
# Windows
netstat -ano | findstr :8000
taskkill /PID <PID> /F

# Linux/Mac
lsof -ti:8000 | xargs kill -9
```
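A platform-independent way to check for this conflict before starting the server is to try binding the port from Python. This is a minimal sketch; `port_is_free` is a hypothetical helper, and it checks only the loopback address by default.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port
            return False
    return True
```

`port_is_free(8000)` returning `False` indicates that a process is already listening, in which case the commands above identify and stop it. The probe socket is closed immediately, so the check itself does not hold the port.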
### Issue: API returns 402 Payment Required

**Solution**: This is expected on the Hugging Face free tier. The RAG fallback will activate:

1. Ensure archive is populated (`python loaders/met_loader.py`)
2. System will automatically serve museum artifacts
3. No action needed from you
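The fallback behaviour described in these steps can be sketched as follows. This is an illustration of the control flow only, not the project's implementation: `PaymentRequiredError`, `generate_with_fallback`, and the `generate` and `search_archive` callables are all hypothetical names.

```python
class PaymentRequiredError(Exception):
    """Hypothetical error standing in for an HTTP 402 response."""


def generate_with_fallback(prompt, generate, search_archive):
    """Try generation first; on a 402-style failure, serve an archive result."""
    try:
        return {"source": "generated", "image": generate(prompt)}
    except PaymentRequiredError:
        matches = search_archive(prompt)
        if not matches:
            raise  # nothing indexed yet, so the fallback cannot help
        return {"source": "archive", "image": matches[0]}
```

The re-raise in the empty case shows why step 1 matters: with no artifacts indexed, there is nothing for the fallback to serve.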
### Issue: ChromaDB errors

**Solution**: Delete and recreate database

```bash
rm -rf archive_db
# Instantiating CoraMemory creates a fresh DB
python -c "from cora_memory import CoraMemory; CoraMemory()"
```
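`rm -rf` is not available in the Windows Command Prompt, so the deletion step can also be done from Python on any platform. The following is a minimal sketch; `remove_archive_db` is a hypothetical helper, and the default path assumes the script runs from the project root.

```python
import shutil
from pathlib import Path


def remove_archive_db(path="archive_db"):
    """Delete the database directory if present; return True if it was removed."""
    db_path = Path(path)
    if not db_path.is_dir():
        return False
    shutil.rmtree(db_path)
    return True
```

Stop the API and UI before calling it, since open database files may prevent deletion on Windows. All indexed artifacts are lost, so the loaders need to be run again afterwards.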
### Issue: CUDA out of memory

**Solution**: Vision models run on CPU by default. If you enabled GPU:

```python
# In cora_vision.py, ensure:
device = "cpu"  # Not "cuda"
```

---
## Directory Structure After Setup

```
cora/
├── .env                    # Your API keys (DO NOT COMMIT)
├── .gitignore
├── requirements.txt
│
├── venv/                   # Virtual environment (if created)
│
├── api.py
├── etymology_api.py
├── ui.py
│
├── cora_curator.py
├── cora_engine.py
├── cora_memory.py
├── cora_vision.py
│
├── loaders/
│   ├── smithsonian_loader.py
│   └── met_loader.py
│
├── scripts/
│   └── load_roman_artifacts.py
│
├── tests/
│   ├── test_etymology_api.py
│   ├── verify_system.py
│   └── ...
│
├── archive_db/             # ChromaDB storage (auto-created)
│   └── chroma.sqlite3
│
├── archive_images/         # Downloaded museum artifacts
│   ├── met_12345_abc.jpg
│   └── si_67890_def.jpg
│
└── docs/
    ├── README.md
    ├── ARCHITECTURE.md
    ├── SETUP.md (this file)
    └── README_ETYMOLOGY_API.md
```

---
## Next Steps

1. **Test Generation**: Try the UI → "Generate" tab → Enter "Roman soldier"
2. **Test Archive**: UI → "Archive" tab → Search "romans"
3. **Test API**: Run `python tests/test_etymology_api.py`
4. **Integrate**: See `docs/README_ETYMOLOGY_API.md` for etymology app integration

---
## Environment Variables Reference

| Variable | Required | Purpose | Example |
|----------|----------|---------|---------|
| `HF_API_TOKEN` | Yes | Hugging Face API access | `hf_abcd...xyz` |
| `SI_API_KEY` | Optional* | Smithsonian data ingestion | `abc123...` |
| `PORT` | No | Override API port (default 8000) | `8080` |

*Required only for museum data ingestion, not for generation.
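The `PORT` override with its default of 8000 could be resolved along these lines. How `api.py` actually reads the variable is not shown in this guide, so treat this as a sketch under that assumption; `resolve_port` is a hypothetical helper.

```python
import os


def resolve_port(environ=None, default=8000):
    """Return the port from PORT, or the default when it is unset or empty."""
    if environ is None:
        environ = os.environ
    raw = environ.get("PORT", "").strip()
    if not raw:
        return default
    port = int(raw)  # raises ValueError for non-numeric values
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

Failing loudly on a malformed value is a deliberate choice here: silently falling back to 8000 would hide a typo in the configuration.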
---

## Updating

```bash
# Pull latest changes (if using Git)
git pull

# Update dependencies
pip install -r requirements.txt --upgrade

# Restart servers
```

---
## Uninstall

```bash
# Deactivate virtual environment
deactivate

# Remove project directory (Windows Command Prompt)
rmdir /s /q c:\Users\Administrador\cora

# Remove project directory (Linux/Mac; substitute your own path)
rm -rf /path/to/cora

# Or just delete venv and the archive database
rm -rf venv archive_db
```