# Copilot Instructions for Jacek AI

This file provides guidance for GitHub Copilot when working with the Jacek AI codebase, a bilingual (Polish/English) accessibility chatbot using RAG with LanceDB and OpenAI GPT-4.

## Build, Test, and Run Commands

### Running the Application

```bash
# Local development - starts Gradio UI at http://127.0.0.1:7860
python app.py

# Run all startup tests before deployment
python test_startup.py
```

### Environment Setup

```bash
# Install dependencies
pip install -r requirements.txt

# Configure environment (required before first run)
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

### Database Management

```bash
# Compact LanceDB (removes version history, reduces file count)
python compact_database.py

# Check document count
python -c "import lancedb; db = lancedb.connect('./lancedb'); print(len(db.open_table('a11y_expert')))"
```

### Testing

```bash
# Run full test suite (imports, config, vector store, embeddings, agent)
python test_startup.py

# All tests must pass before deploying to Hugging Face Spaces
```

## Architecture Overview

### Core Components

**Agent System** (`agent/`)
- `a11y_agent.py`: Main `A11yExpertAgent` class with streaming responses via OpenAI
- `prompts.py`: Language-specific system prompts (Polish/English) with **strict language enforcement**
- `tools.py`: RAG tools for knowledge base search (top-5 semantic results)

**Vector Store** (`database/`)
- `vector_store_client.py`: LanceDB client with lazy loading and automatic reconnection
- Database path: `./lancedb/a11y_expert.lance` (tracked with Git LFS)
- **READ-ONLY in production** (Hugging Face Spaces environment)

**Embeddings** (`models/`)
- `embeddings.py`: OpenAI embeddings client with disk caching (`./cache/embeddings`) and retry logic
- Model: `text-embedding-3-large` (3072 dimensions)
- Singleton pattern: use `get_embeddings_client()` for a shared instance

**UI** (`app.py`)
- Gradio ChatInterface with a two-column layout (chat + notes from
`notes.md`)
- **Lazy agent initialization**: the agent loads on the first user query, not at startup
- Streaming responses for better UX

**Configuration** (`config.py`)
- Pydantic settings with environment variable support
- All config loaded from the `.env` file (never hardcode secrets)
- Required: `OPENAI_API_KEY` (OpenAI API key for LLM and embeddings)

### Data Flow (RAG Pipeline)

1. User asks a question in the Gradio UI
2. Language detected from the query using `langdetect` (Polish or English)
3. Query embedded using the OpenAI embeddings API (with cache lookup)
4. Vector search in LanceDB, filtered by language: `where="language = 'pl'"` or `'en'`
5. Top 5 results formatted as context
6. Context + query + language-specific system prompt sent to GPT-4
7. Response streamed back to the UI token by token

### Key Design Patterns

- **Lazy Initialization**: Agent and database connections initialize on first use, not at startup (faster deployment)
- **Singleton Pattern**: `get_embeddings_client()` returns a shared instance across the app
- **Language Detection**: Auto-detects the query language and adjusts both the prompt and the vector search filter
- **Stateless Agent**: No internal conversation history (Gradio handles history in the UI)
- **Conversation Context**: Last 4 messages kept in context for follow-up questions

## Key Conventions

### Language Handling - CRITICAL

The agent has **strict language enforcement** in system prompts:

- Polish queries get `SYSTEM_PROMPT_PL` with "CRITICAL: Answer ONLY in Polish"
- English queries get `SYSTEM_PROMPT_EN` with "CRITICAL: Answer ONLY in English"
- System prompts explicitly instruct the LLM to translate sources if needed
- Vector search is language-filtered: `where="language = 'pl'"` or `where="language = 'en'"`

**When modifying prompts**: Never remove or weaken the language enforcement instructions - they prevent language mixing, which confuses users.
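The language-routing convention above (steps 2 and 4 of the pipeline) can be sketched as follows. This is an illustrative sketch, not the actual code in `agent/`: the helper name `route_language` and the truncated prompt strings are assumptions, and the real agent obtains the language code from `langdetect.detect()`.

```python
# Hypothetical sketch of language routing; the real prompts live in
# agent/prompts.py and the real detector is langdetect.detect().

SYSTEM_PROMPT_PL = "CRITICAL: Answer ONLY in Polish. ..."   # placeholder text
SYSTEM_PROMPT_EN = "CRITICAL: Answer ONLY in English. ..."  # placeholder text

def route_language(lang_code: str) -> tuple[str, str]:
    """Map a detected language code to (system prompt, LanceDB filter).

    Anything that is not Polish falls back to English, keeping the
    prompt and the vector search filter consistent with each other.
    """
    if lang_code == "pl":
        return SYSTEM_PROMPT_PL, "language = 'pl'"
    return SYSTEM_PROMPT_EN, "language = 'en'"

prompt, where_clause = route_language("pl")
print(where_clause)  # language = 'pl'
```

The key property worth preserving in any refactor: the prompt and the `where` filter are always selected together, so a query can never get a Polish prompt with English search results.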
### LanceDB Database - READ-ONLY in Production

- Database at `./lancedb/` is tracked with Git LFS (not generated at runtime)
- In Hugging Face Spaces, the database is read-only (the filesystem is immutable)
- For local development: use `VectorStoreClient.add_documents()` to add data
- After local changes: run `compact_database.py` to reduce the file count before committing
- Schema: `text`, `vector`, `source`, `language`, `doc_type`, `created_at`, `updated_at`

### Configuration Loading

All settings in `config.py` are loaded from environment variables:

```python
from config import get_settings

settings = get_settings()  # Singleton, cached
print(settings.llm_model)  # default: gpt-4o-mini
```

Never access environment variables directly - always use `get_settings()`.

### Hugging Face Spaces Deployment

**Critical deployment requirements**:

1. `demo.queue()` must be called explicitly (see `app.py:238-243`)
2. Do **NOT** use `atexit.register()` for cleanup (causes premature shutdown)
3. LanceDB must be committed with Git LFS (the database is read-only on HF)
4. API key stored as an HF Spaces Secret: `OPENAI_API_KEY`
5. The `if __name__ == "__main__"` block handles both local and HF deployments

**Testing before deployment**:

```bash
python test_startup.py  # All tests must pass
```

### Logging

Use loguru for all logging (already configured):

```python
from loguru import logger

logger.info("Starting process...")
logger.success("✅ Completed successfully")
logger.error(f"❌ Failed: {error}")
```

Set `LOG_LEVEL=DEBUG` in `.env` for verbose output during development.
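When adding documents locally, every row must match the schema listed in the LanceDB section above. A minimal sketch of building such a row is shown below; the helper name `make_row` and the example `doc_type` value are illustrative assumptions, and the real insert path is `VectorStoreClient.add_documents()`.

```python
from datetime import datetime, timezone

def make_row(text: str, vector: list[float], source: str,
             language: str, doc_type: str = "article") -> dict:
    """Build one record matching the LanceDB schema described above.

    `make_row` is a hypothetical helper; `vector` should be a
    text-embedding-3-large embedding (3072 dimensions).
    """
    now = datetime.now(timezone.utc).isoformat()
    return {
        "text": text,
        "vector": vector,
        "source": source,
        "language": language,
        "doc_type": doc_type,
        "created_at": now,
        "updated_at": now,
    }

# Example row for a Polish document (zero vector used as a stand-in)
row = make_row("Tekst o dostępności", [0.0] * 3072, "wcag_notes.md", "pl")
```

Keeping the `language` field accurate matters more than it looks: the runtime `where="language = '...'"` filter silently drops any row whose language tag is wrong.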
### Error Handling

- Always close resources in agent/client classes (implement a `close()` method)
- Use try/except with specific exception types
- Log the full traceback for debugging: `logger.error(traceback.format_exc())`
- For user-facing errors, provide clear Polish/English messages depending on the detected language

## Project Structure

```
JacekAI/
├── agent/                      # Core agent logic
│   ├── a11y_agent.py           # Main agent with RAG
│   ├── prompts.py              # Language-specific prompts (PL/EN)
│   └── tools.py                # Knowledge base search tools
├── database/
│   └── vector_store_client.py  # LanceDB client
├── models/
│   └── embeddings.py           # OpenAI embeddings with caching
├── lancedb/                    # Vector database (Git LFS)
│   └── a11y_expert.lance/
├── cache/                      # Embeddings cache (gitignored)
├── app.py                      # Gradio UI with lazy initialization
├── config.py                   # Pydantic settings (environment variables)
├── test_startup.py             # Deployment readiness tests
├── compact_database.py         # Database compaction utility
├── requirements.txt            # Python dependencies
├── .env.example                # Environment template
└── notes.md                    # Optional notes displayed in UI sidebar
```

## Important Implementation Notes

### When Adding New Features to Agent

1. Modifying prompts → edit `agent/prompts.py`
2. Adding new tools → add a function to `agent/tools.py`
3. Changing RAG logic → modify `agent/a11y_agent.py`
4. Test locally with `python app.py` and interact through the UI

### When Updating Dependencies

1. Edit `requirements.txt`
2. Run `pip install -r requirements.txt`
3. Test with `python test_startup.py`
4. Commit changes and test in HF Spaces

### When Debugging

- Set `LOG_LEVEL=DEBUG` in `.env` for verbose logging
- Agent initialization happens on the first query (check logs for "A11yExpertAgent initialized")
- The embeddings cache is at `./cache/embeddings` (create the directory if missing)
- Vector search logs show the context retrieved from the database

## Common Pitfalls

1. **DO NOT** modify the database in production (LanceDB is read-only on HF Spaces)
2.
**DO NOT** use `atexit.register()` in `app.py` (breaks HF Spaces deployment)
3. **DO NOT** weaken language enforcement in prompts (causes confusing mixed-language responses)
4. **DO NOT** access `os.environ` directly - always use `get_settings()`
5. **DO NOT** initialize the agent at module level - use the lazy initialization pattern
6. **DO NOT** forget to call `demo.queue()` before `demo.launch()` in Gradio

## Environment Variables

Required in the `.env` file:

- `OPENAI_API_KEY` - OpenAI API key for LLM and embeddings - **REQUIRED**

Optional (with defaults):

- `LLM_MODEL` - Language model (default: `gpt-4o-mini`)
- `LLM_BASE_URL` - API endpoint (default: GitHub Models endpoint)
- `EMBEDDING_MODEL` - Embedding model (default: `text-embedding-3-large`)
- `LANCEDB_URI` - Database path (default: `./lancedb`)
- `LANCEDB_TABLE` - Table name (default: `a11y_expert`)
- `LOG_LEVEL` - Logging verbosity (default: `INFO`)
- `SERVER_HOST` - Gradio host (default: `127.0.0.1`; use `0.0.0.0` for HF)
- `SERVER_PORT` - Gradio port (default: `7860`)

## Related Documentation

- `CLAUDE.md` - Detailed guidance for Claude Code (includes architectural details)
- `README.md` - User-facing documentation with setup instructions
- `HF_SPACES_GUIDE.md` - Hugging Face Spaces deployment guide
- `QUICK_REFERENCE.md` - Quick reference for common tasks
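As a reference, the environment variables listed above could appear in a local `.env` roughly as below. This is an illustrative sketch using the documented defaults, not the canonical template (see `.env.example`); the API key value is a placeholder, and `LLM_BASE_URL` is left out because its default endpoint is defined in `config.py`.

```bash
# .env - illustrative sketch; copy .env.example for the real template
OPENAI_API_KEY=sk-...                    # REQUIRED - never commit this value
LLM_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-large
LANCEDB_URI=./lancedb
LANCEDB_TABLE=a11y_expert
LOG_LEVEL=INFO                           # set to DEBUG for verbose local runs
SERVER_HOST=127.0.0.1                    # use 0.0.0.0 on Hugging Face Spaces
SERVER_PORT=7860
```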