

Copilot Instructions for Jacek AI

This file provides guidance for GitHub Copilot when working with the Jacek AI codebase - a bilingual (Polish/English) accessibility chatbot using RAG with LanceDB and OpenAI models (gpt-4o-mini by default).

Build, Test, and Run Commands

Running the Application

# Local development - starts Gradio UI at http://127.0.0.1:7860
python app.py

# Run all startup tests before deployment
python test_startup.py

Environment Setup

# Install dependencies
pip install -r requirements.txt

# Configure environment (required before first run)
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

Database Management

# Compact LanceDB (removes version history, reduces file count)
python compact_database.py

# Check document count
python -c "import lancedb; db = lancedb.connect('./lancedb'); print(len(db.open_table('a11y_expert')))"

Testing

# Run full test suite (imports, config, vector store, embeddings, agent)
python test_startup.py

# All tests must pass before deploying to Hugging Face Spaces

Architecture Overview

Core Components

Agent System (agent/)

  • a11y_agent.py: Main A11yExpertAgent class with streaming responses via OpenAI
  • prompts.py: Language-specific system prompts (Polish/English) with strict language enforcement
  • tools.py: RAG tools for knowledge base search (top-5 semantic results)
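The "top-5 semantic results" step can be sketched as a small formatting helper. This is a hypothetical illustration, not the actual code in agent/tools.py; the field names ("text", "source") follow the schema listed later in this file.

```python
# Hypothetical sketch of the context-formatting step in agent/tools.py:
# join the top-5 search hits into one context block for the prompt.

def format_context(hits: list[dict], limit: int = 5) -> str:
    """Join the top-N search hits into a single context string."""
    parts = []
    for hit in hits[:limit]:
        parts.append(f"[{hit['source']}]\n{hit['text']}")
    return "\n\n".join(parts)
```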

Vector Store (database/)

  • vector_store_client.py: LanceDB client with lazy loading and automatic reconnection
  • Database path: ./lancedb/a11y_expert.lance (tracked with Git LFS)
  • READ-ONLY in production (Hugging Face Spaces environment)
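The lazy-loading pattern used by the client can be sketched as follows. The real VectorStoreClient calls lancedb.connect(); here the connector is injected so the pattern runs without the lancedb package.

```python
# Minimal sketch of lazy loading with reconnection support.
class LazyVectorStore:
    def __init__(self, uri: str, connect):
        self._uri = uri
        self._connect = connect   # e.g. lancedb.connect in the real client
        self._db = None

    @property
    def db(self):
        # Connect on first access; subsequent accesses reuse the handle.
        if self._db is None:
            self._db = self._connect(self._uri)
        return self._db

    def reset(self) -> None:
        """Drop the handle so the next access reconnects."""
        self._db = None
```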

Embeddings (models/)

  • embeddings.py: OpenAI embeddings client with disk caching (./cache/embeddings) and retry logic
  • Model: text-embedding-3-large (3072 dimensions)
  • Singleton pattern: use get_embeddings_client() for shared instance
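The singleton-plus-cache pattern can be sketched like this. The real client calls the OpenAI embeddings API and persists vectors under ./cache/embeddings; here the embedding computation is stubbed so the sketch runs stand-alone.

```python
import hashlib
from functools import lru_cache

class EmbeddingsClient:
    def __init__(self):
        self._cache: dict[str, list[float]] = {}  # real client persists to disk

    def embed(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode()).hexdigest()
        if key not in self._cache:          # cache lookup before the API call
            self._cache[key] = self._compute(text)
        return self._cache[key]

    def _compute(self, text: str) -> list[float]:
        return [float(len(text))]           # stand-in for the OpenAI API call

@lru_cache(maxsize=1)
def get_embeddings_client() -> EmbeddingsClient:
    """Shared instance across the app (singleton)."""
    return EmbeddingsClient()
```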

UI (app.py)

  • Gradio ChatInterface with two-column layout (chat + notes from notes.md)
  • Lazy agent initialization - agent loads on first user query, not at startup
  • Streaming responses for better UX

Configuration (config.py)

  • Pydantic settings with environment variable support
  • All config loaded from .env file (never hardcode secrets)
  • Required: OPENAI_API_KEY (OpenAI API key for LLM and embeddings)

Data Flow (RAG Pipeline)

  1. User asks question in Gradio UI
  2. Language detected from query using langdetect (Polish or English)
  3. Query embedded using OpenAI embeddings API (with cache lookup)
  4. Vector search in LanceDB (filtered by language: where="language = 'pl'" or 'en')
  5. Top 5 results formatted as context
  6. Context + query + language-specific system prompt sent to the LLM (gpt-4o-mini by default)
  7. Response streamed back to UI token-by-token
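The seven steps above can be sketched end-to-end as follows, with each external dependency stubbed (langdetect, the embeddings API, and LanceDB search are stand-ins, and streaming is omitted).

```python
def detect_language(query: str) -> str:
    # stand-in for langdetect; the real code returns 'pl' or 'en'
    return "pl" if any(ch in "ąćęłńóśźż" for ch in query.lower()) else "en"

def answer(query: str, search, llm) -> str:
    lang = detect_language(query)                      # step 2
    where = f"language = '{lang}'"                     # step 4 filter
    hits = search(query, where=where, limit=5)         # steps 3-5
    context = "\n\n".join(h["text"] for h in hits)
    prompt = f"[{lang} system prompt]\n{context}\n\nQ: {query}"  # step 6
    return llm(prompt)                                 # step 7 (streamed in the real app)
```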

Key Design Patterns

  • Lazy Initialization: Agent and database connections initialize on first use, not at startup (faster deployment)
  • Singleton Pattern: get_embeddings_client() returns shared instance across the app
  • Language Detection: Auto-detects query language and adjusts both prompt and vector search filter
  • Stateless Agent: No internal conversation history - Gradio manages history in the UI
  • Conversation Context: The last 4 messages from the UI history are passed back to the agent for follow-up questions

Key Conventions

Language Handling - CRITICAL

The agent has strict language enforcement in system prompts:

  • Polish queries get SYSTEM_PROMPT_PL with "CRITICAL: Answer ONLY in Polish"
  • English queries get SYSTEM_PROMPT_EN with "CRITICAL: Answer ONLY in English"
  • System prompts explicitly instruct the LLM to translate sources if needed
  • Vector search is language-filtered: where="language = 'pl'" or where="language = 'en'"

When modifying prompts: Never remove or weaken the language enforcement instructions - they prevent language mixing which confuses users.
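The prompt/filter selection convention above can be sketched as a single branch. The prompt strings are placeholders, and the fallback to English for any non-Polish language is an assumption, not confirmed by this file.

```python
SYSTEM_PROMPT_PL = "CRITICAL: Answer ONLY in Polish. ..."   # placeholder text
SYSTEM_PROMPT_EN = "CRITICAL: Answer ONLY in English. ..."  # placeholder text

def select_prompt_and_filter(lang: str) -> tuple[str, str]:
    """Pick the system prompt and LanceDB where-clause for a detected language."""
    if lang == "pl":
        return SYSTEM_PROMPT_PL, "language = 'pl'"
    return SYSTEM_PROMPT_EN, "language = 'en'"  # assumed fallback
```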

LanceDB Database - READ-ONLY in Production

  • Database at ./lancedb/ is tracked with Git LFS (not generated at runtime)
  • In Hugging Face Spaces: database is read-only (filesystem is immutable)
  • For local development: use VectorStoreClient.add_documents() to add data
  • After local changes: run compact_database.py to reduce file count before committing
  • Schema: text, vector, source, language, doc_type, created_at, updated_at
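A record matching the schema above can serve as a template when calling VectorStoreClient.add_documents() locally. The helper name and timestamp format are assumptions; 3072 is the text-embedding-3-large dimensionality noted earlier.

```python
from datetime import datetime, timezone

REQUIRED_FIELDS = {"text", "vector", "source", "language",
                   "doc_type", "created_at", "updated_at"}

def make_document(text, vector, source, language, doc_type):
    """Build one record with the schema fields listed above."""
    now = datetime.now(timezone.utc).isoformat()
    return {"text": text, "vector": vector, "source": source,
            "language": language, "doc_type": doc_type,
            "created_at": now, "updated_at": now}
```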

Configuration Loading

All settings in config.py are loaded from environment variables:

from config import get_settings

settings = get_settings()  # Singleton, cached
print(settings.llm_model)  # gpt-4o-mini (default)

Never access environment variables directly - always use get_settings().
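The cached-singleton accessor looks roughly like the following simplified sketch; the real config.py uses pydantic settings with more fields, but the pattern is the same.

```python
import os
from dataclasses import dataclass
from functools import lru_cache

@dataclass(frozen=True)
class Settings:
    llm_model: str
    log_level: str

@lru_cache(maxsize=1)
def get_settings() -> Settings:
    # Read env vars once; later calls return the cached instance.
    return Settings(
        llm_model=os.environ.get("LLM_MODEL", "gpt-4o-mini"),
        log_level=os.environ.get("LOG_LEVEL", "INFO"),
    )
```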

Hugging Face Spaces Deployment

Critical deployment requirements:

  1. demo.queue() must be called explicitly (see app.py:238-243)
  2. Do NOT use atexit.register() for cleanup (causes premature shutdown)
  3. LanceDB must be committed with Git LFS (database is read-only in HF)
  4. API key stored as HF Spaces Secret: OPENAI_API_KEY
  5. The if __name__ == "__main__" block handles both local and HF deployments

Testing before deployment:

python test_startup.py  # All tests must pass
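The `if __name__ == "__main__"` block can distinguish local runs from Spaces by checking an environment variable. SPACE_ID is the variable Hugging Face sets inside a Space; treat its exact name as an assumption.

```python
import os

def server_host() -> str:
    """0.0.0.0 inside a Space; SERVER_HOST (default 127.0.0.1) locally."""
    if os.environ.get("SPACE_ID"):           # running inside a Space
        return "0.0.0.0"
    return os.environ.get("SERVER_HOST", "127.0.0.1")
```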

Logging

Use loguru for all logging (already configured):

from loguru import logger

logger.info("Starting process...")
logger.success("✅ Completed successfully")
logger.error(f"❌ Failed: {error}")

Set LOG_LEVEL=DEBUG in .env for verbose output during development.

Error Handling

  • Always close resources in agent/client classes (implement close() method)
  • Use try/except with specific exception types
  • Log full traceback for debugging: logger.error(traceback.format_exc())
  • For user-facing errors, provide clear Polish/English messages depending on detected language
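The conventions above can be combined into one wrapper: log the full traceback, then return a user-facing message in the detected language. This is a sketch with hypothetical message strings; the real app uses loguru rather than print.

```python
import traceback

USER_ERRORS = {
    "pl": "Przepraszam, wystąpił błąd. Spróbuj ponownie.",
    "en": "Sorry, something went wrong. Please try again.",
}

def safe_answer(query: str, agent_call, lang: str = "en") -> str:
    try:
        return agent_call(query)
    except (ValueError, RuntimeError):          # specific exception types
        print(traceback.format_exc())           # logger.error(...) in the app
        return USER_ERRORS.get(lang, USER_ERRORS["en"])
```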

Project Structure

JacekAI/
├── agent/                   # Core agent logic
│   ├── a11y_agent.py       # Main agent with RAG
│   ├── prompts.py          # Language-specific prompts (PL/EN)
│   └── tools.py            # Knowledge base search tools
├── database/
│   └── vector_store_client.py  # LanceDB client
├── models/
│   └── embeddings.py       # OpenAI embeddings with caching
├── lancedb/                # Vector database (Git LFS)
│   └── a11y_expert.lance/
├── cache/                  # Embeddings cache (gitignored)
├── app.py                  # Gradio UI with lazy initialization
├── config.py               # Pydantic settings (environment variables)
├── test_startup.py         # Deployment readiness tests
├── compact_database.py     # Database compaction utility
├── requirements.txt        # Python dependencies
├── .env.example            # Environment template
└── notes.md               # Optional notes displayed in UI sidebar

Important Implementation Notes

When Adding New Features to Agent

  1. Modifying prompts → Edit agent/prompts.py
  2. Adding new tools → Add function to agent/tools.py
  3. Changing RAG logic → Modify agent/a11y_agent.py
  4. Test locally with python app.py and interact through UI

When Updating Dependencies

  1. Edit requirements.txt
  2. Run pip install -r requirements.txt
  3. Test with python test_startup.py
  4. Commit changes and test in HF Spaces

When Debugging

  • Set LOG_LEVEL=DEBUG in .env for verbose logging
  • Agent initialization happens on first query (check logs for "A11yExpertAgent initialized")
  • Embeddings cache is at ./cache/embeddings (create directory if missing)
  • Vector search logs show retrieved context from database

Common Pitfalls

  1. DO NOT modify the database in production (LanceDB is read-only on HF Spaces)
  2. DO NOT use atexit.register() in app.py (breaks HF Spaces deployment)
  3. DO NOT weaken language enforcement in prompts (causes confusing mixed-language responses)
  4. DO NOT access os.environ directly - always use get_settings()
  5. DO NOT initialize agent at module level - use lazy initialization pattern
  6. DO NOT forget to call demo.queue() before demo.launch() in Gradio

Environment Variables

Required in .env file:

  • OPENAI_API_KEY - OpenAI API key for LLM and embeddings - REQUIRED

Optional (with defaults):

  • LLM_MODEL - Language model (default: gpt-4o-mini)
  • LLM_BASE_URL - API endpoint (default: GitHub Models endpoint)
  • EMBEDDING_MODEL - Embedding model (default: text-embedding-3-large)
  • LANCEDB_URI - Database path (default: ./lancedb)
  • LANCEDB_TABLE - Table name (default: a11y_expert)
  • LOG_LEVEL - Logging verbosity (default: INFO)
  • SERVER_HOST - Gradio host (default: 127.0.0.1, use 0.0.0.0 for HF)
  • SERVER_PORT - Gradio port (default: 7860)

Related Documentation

  • CLAUDE.md - Detailed guidance for Claude Code (includes architectural details)
  • README.md - User-facing documentation with setup instructions
  • HF_SPACES_GUIDE.md - Hugging Face Spaces deployment guide
  • QUICK_REFERENCE.md - Quick reference for common tasks