Jacek Zadrożny committed
Commit f2986d3 · Parent: deaaf9d

Revert to OPENAI_API_KEY and switch to gpt-4o-mini
Changes:
- Changed configuration from GITHUB_TOKEN back to OPENAI_API_KEY
- Switched LLM model from gpt-4o to gpt-4o-mini (15x cheaper, faster)
- Updated all code references in config.py, agent, models, and tests
- Added .github/copilot-instructions.md for better AI assistance
- Updated .gitignore to exclude node_modules and npm files

.env.example CHANGED
@@ -1,8 +1,8 @@
-# GitHub Token Configuration (Required)
-GITHUB_TOKEN=your_token_here
+# OpenAI API Configuration (Required)
+OPENAI_API_KEY=your_api_key_here
 
 # LLM Configuration
-LLM_MODEL=gpt-4o
+LLM_MODEL=gpt-4o-mini
 LLM_BASE_URL=https://api.openai.com/v1
 
 # Embeddings Configuration
.github/copilot-instructions.md ADDED
@@ -0,0 +1,232 @@
# Copilot Instructions for Jacek AI

This file provides guidance for GitHub Copilot when working with the Jacek AI codebase - a bilingual (Polish/English) accessibility chatbot using RAG with LanceDB and OpenAI `gpt-4o-mini`.

## Build, Test, and Run Commands

### Running the Application
```bash
# Local development - starts Gradio UI at http://127.0.0.1:7860
python app.py

# Run all startup tests before deployment
python test_startup.py
```

### Environment Setup
```bash
# Install dependencies
pip install -r requirements.txt

# Configure environment (required before first run)
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

### Database Management
```bash
# Compact LanceDB (removes version history, reduces file count)
python compact_database.py

# Check document count
python -c "import lancedb; db = lancedb.connect('./lancedb'); print(len(db.open_table('a11y_expert')))"
```

### Testing
```bash
# Run full test suite (imports, config, vector store, embeddings, agent)
python test_startup.py

# All tests must pass before deploying to Hugging Face Spaces
```

## Architecture Overview

### Core Components

**Agent System** (`agent/`)
- `a11y_agent.py`: Main `A11yExpertAgent` class with streaming responses via OpenAI
- `prompts.py`: Language-specific system prompts (Polish/English) with **strict language enforcement**
- `tools.py`: RAG tools for knowledge base search (top-5 semantic results)

**Vector Store** (`database/`)
- `vector_store_client.py`: LanceDB client with lazy loading and automatic reconnection
- Database path: `./lancedb/a11y_expert.lance` (tracked with Git LFS)
- **READ-ONLY in production** (Hugging Face Spaces environment)

**Embeddings** (`models/`)
- `embeddings.py`: OpenAI embeddings client with disk caching (`./cache/embeddings`) and retry logic
- Model: `text-embedding-3-large` (3072 dimensions)
- Singleton pattern: use `get_embeddings_client()` for shared instance

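The disk cache described above can be sketched roughly as follows. This is an illustrative stand-in, not the actual `EmbeddingsClient` API: `cached_embedding`, the `compute` callback, and the JSON-per-file layout are assumptions for the example.

```python
import hashlib
import json
import pathlib

def cached_embedding(text: str, compute, cache_dir: pathlib.Path) -> list[float]:
    """Return an embedding for `text`, reusing a file on disk if present.

    `compute` stands in for the OpenAI embeddings API call; in the real
    client the cache lives under ./cache/embeddings.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())  # cache hit: no API call
    vector = compute(text)                   # cache miss: compute and store
    path.write_text(json.dumps(vector))
    return vector
```

The key point is that repeated queries never pay for a second embeddings call.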
**UI** (`app.py`)
- Gradio ChatInterface with two-column layout (chat + notes from `notes.md`)
- **Lazy agent initialization** - agent loads on first user query, not at startup
- Streaming responses for better UX

**Configuration** (`config.py`)
- Pydantic settings with environment variable support
- All config loaded from `.env` file (never hardcode secrets)
- Required: `OPENAI_API_KEY` (OpenAI API key for LLM and embeddings)

### Data Flow (RAG Pipeline)

1. User asks question in Gradio UI
2. Language detected from query using `langdetect` (Polish or English)
3. Query embedded using OpenAI embeddings API (with cache lookup)
4. Vector search in LanceDB (filtered by language: `where="language = 'pl'"` or `'en'`)
5. Top 5 results formatted as context
6. Context + query + language-specific system prompt sent to the LLM (`gpt-4o-mini`)
7. Response streamed back to UI token-by-token

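The steps above can be sketched end-to-end with toy stand-ins. Everything here is illustrative: the function names, the diacritics-based language check (standing in for `langdetect`), the dot-product search (standing in for LanceDB), and the paraphrased system prompts are assumptions for the sketch, not the real implementation.

```python
def detect_language(query: str) -> str:
    # Stand-in for langdetect: Polish diacritics as a crude signal.
    return "pl" if any(c in "ąćęłńóśźż" for c in query.lower()) else "en"

def embed(text: str) -> list[float]:
    # Stand-in for the OpenAI embeddings call (with cache lookup).
    return [float(ord(c) % 7) for c in text[:8]]

def vector_search(vector, language, docs, top_k=5):
    # Stand-in for LanceDB search with a `where="language = '...'"` filter.
    candidates = [d for d in docs if d["language"] == language]
    scored = sorted(
        candidates,
        key=lambda d: sum(a * b for a, b in zip(embed(d["text"]), vector)),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, results, language):
    # Top results become context; prompt text paraphrases the real prompts.
    context = "\n".join(d["text"] for d in results)
    system = ("Answer ONLY in Polish." if language == "pl"
              else "Answer ONLY in English.")
    return f"{system}\n\nContext:\n{context}\n\nQuestion: {query}"

docs = [
    {"text": "Use alt text for images.", "language": "en"},
    {"text": "Używaj tekstu alternatywnego dla obrazów.", "language": "pl"},
]
query = "Jak poprawić dostępność obrazów?"
lang = detect_language(query)
prompt = build_prompt(query, vector_search(embed(query), lang, docs), lang)
```

Note how the detected language drives both the search filter and the system prompt, so a Polish query never sees English-only context.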
### Key Design Patterns

- **Lazy Initialization**: Agent and database connections initialize on first use, not at startup (faster deployment)
- **Singleton Pattern**: `get_embeddings_client()` returns a shared instance across the app
- **Language Detection**: Auto-detects the query language and adjusts both the prompt and the vector search filter
- **Stateless Agent**: No internal conversation history (Gradio handles history in the UI)
- **Conversation Context**: Last 4 messages kept in context for follow-up questions

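The first two patterns can be sketched in a few lines; the names here are illustrative (the real code lives in `app.py`, `config.py`, and `models/embeddings.py`):

```python
from functools import lru_cache

class Agent:
    """Toy stand-in for A11yExpertAgent (pretend __init__ is expensive)."""
    def __init__(self) -> None:
        self.ready = True

_agent = None

def get_agent() -> Agent:
    # Lazy initialization: nothing is built at import time; the first
    # call pays the startup cost, later calls reuse the instance.
    global _agent
    if _agent is None:
        _agent = Agent()
    return _agent

@lru_cache(maxsize=1)
def get_shared_settings() -> dict:
    # Singleton via lru_cache - the same idea behind get_settings()
    # and get_embeddings_client().
    return {"llm_model": "gpt-4o-mini"}
```

Because module import builds nothing, the Gradio app starts fast and the first query absorbs the initialization cost.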
## Key Conventions

### Language Handling - CRITICAL

The agent has **strict language enforcement** in system prompts:
- Polish queries get `SYSTEM_PROMPT_PL` with "CRITICAL: Answer ONLY in Polish"
- English queries get `SYSTEM_PROMPT_EN` with "CRITICAL: Answer ONLY in English"
- System prompts explicitly instruct the LLM to translate sources if needed
- Vector search is language-filtered: `where="language = 'pl'"` or `where="language = 'en'"`

**When modifying prompts**: Never remove or weaken the language enforcement instructions - they prevent language mixing, which confuses users.

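A minimal sketch of keeping the prompt and the search filter in lockstep; the constant values paraphrase the real prompts in `agent/prompts.py`, and `prompt_and_filter` is an illustrative helper, not the actual API:

```python
SYSTEM_PROMPT_PL = "CRITICAL: Answer ONLY in Polish. Translate sources if needed."
SYSTEM_PROMPT_EN = "CRITICAL: Answer ONLY in English. Translate sources if needed."

def prompt_and_filter(language: str) -> tuple[str, str]:
    """Pick the system prompt and the LanceDB where-clause for one language."""
    if language == "pl":
        return SYSTEM_PROMPT_PL, "language = 'pl'"
    return SYSTEM_PROMPT_EN, "language = 'en'"
```

Deriving both values from a single language decision makes it harder for the prompt and the filter to drift apart.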
### LanceDB Database - READ-ONLY in Production

- Database at `./lancedb/` is tracked with Git LFS (not generated at runtime)
- In Hugging Face Spaces: the database is read-only (the filesystem is immutable)
- For local development: use `VectorStoreClient.add_documents()` to add data
- After local changes: run `compact_database.py` to reduce the file count before committing
- Schema: `text`, `vector`, `source`, `language`, `doc_type`, `created_at`, `updated_at`

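One hedged way to keep write paths local-only is to gate them on the `SPACE_ID` environment variable that Hugging Face Spaces sets (the same signal `config.py` checks); `guard_writable` is an illustrative helper, not part of the codebase:

```python
import os

def guard_writable() -> None:
    """Raise before any add_documents()/compaction call in production.

    HF Spaces sets SPACE_ID and the filesystem is immutable, so any
    LanceDB write there would fail anyway - fail fast with a clear message.
    """
    if os.getenv("SPACE_ID"):
        raise RuntimeError("LanceDB is READ-ONLY in production (HF Spaces).")
```

Calling this at the top of local-only scripts turns a confusing filesystem error into an explicit one.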
### Configuration Loading

All settings in `config.py` are loaded from environment variables:
```python
from config import get_settings

settings = get_settings()  # Singleton, cached
print(settings.llm_model)  # gpt-4o-mini (default)
```

Never access environment variables directly - always use `get_settings()`.

### Hugging Face Spaces Deployment

**Critical deployment requirements**:
1. `demo.queue()` must be called explicitly (see `app.py:238-243`)
2. Do **NOT** use `atexit.register()` for cleanup (causes premature shutdown)
3. LanceDB must be committed with Git LFS (database is read-only in HF)
4. API key stored as HF Spaces Secret: `OPENAI_API_KEY`
5. The `if __name__ == "__main__"` block handles both local and HF deployments

**Testing before deployment**:
```bash
python test_startup.py  # All tests must pass
```

### Logging

Use loguru for all logging (already configured):
```python
from loguru import logger

logger.info("Starting process...")
logger.success("✅ Completed successfully")
logger.error(f"❌ Failed: {error}")
```

Set `LOG_LEVEL=DEBUG` in `.env` for verbose output during development.

### Error Handling

- Always close resources in agent/client classes (implement a `close()` method)
- Use try/except with specific exception types
- Log the full traceback for debugging: `logger.error(traceback.format_exc())`
- For user-facing errors, provide clear Polish/English messages depending on the detected language

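Putting those points together, a search wrapper might look like this sketch; stdlib `logging` stands in for loguru here, and `safe_search` plus the exact messages are illustrative assumptions:

```python
import logging
import traceback

logger = logging.getLogger("jacek_ai")  # loguru's logger in the real code

def safe_search(search, query: str, language: str):
    """Run a knowledge-base search; on failure, log the full traceback
    and return a user-facing message in the detected language."""
    try:
        return search(query)
    except ConnectionError:  # catch a specific type, not bare Exception
        logger.error(traceback.format_exc())
        return ("Przepraszam, wystąpił błąd. Spróbuj ponownie."
                if language == "pl"
                else "Sorry, something went wrong. Please try again.")
```

The traceback goes to the log for debugging while the user sees a short message in their own language.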
## Project Structure

```
JacekAI/
├── agent/                      # Core agent logic
│   ├── a11y_agent.py           # Main agent with RAG
│   ├── prompts.py              # Language-specific prompts (PL/EN)
│   └── tools.py                # Knowledge base search tools
├── database/
│   └── vector_store_client.py  # LanceDB client
├── models/
│   └── embeddings.py           # OpenAI embeddings with caching
├── lancedb/                    # Vector database (Git LFS)
│   └── a11y_expert.lance/
├── cache/                      # Embeddings cache (gitignored)
├── app.py                      # Gradio UI with lazy initialization
├── config.py                   # Pydantic settings (environment variables)
├── test_startup.py             # Deployment readiness tests
├── compact_database.py         # Database compaction utility
├── requirements.txt            # Python dependencies
├── .env.example                # Environment template
└── notes.md                    # Optional notes displayed in UI sidebar
```

## Important Implementation Notes

### When Adding New Features to the Agent

1. Modifying prompts → edit `agent/prompts.py`
2. Adding new tools → add a function to `agent/tools.py`
3. Changing RAG logic → modify `agent/a11y_agent.py`
4. Test locally with `python app.py` and interact through the UI

### When Updating Dependencies

1. Edit `requirements.txt`
2. Run `pip install -r requirements.txt`
3. Test with `python test_startup.py`
4. Commit changes and test in HF Spaces

### When Debugging

- Set `LOG_LEVEL=DEBUG` in `.env` for verbose logging
- Agent initialization happens on the first query (check logs for "A11yExpertAgent initialized")
- Embeddings cache is at `./cache/embeddings` (create the directory if missing)
- Vector search logs show the context retrieved from the database

## Common Pitfalls

1. **DO NOT** modify the database in production (LanceDB is read-only on HF Spaces)
2. **DO NOT** use `atexit.register()` in `app.py` (breaks HF Spaces deployment)
3. **DO NOT** weaken language enforcement in prompts (causes confusing mixed-language responses)
4. **DO NOT** access `os.environ` directly - always use `get_settings()`
5. **DO NOT** initialize the agent at module level - use the lazy initialization pattern
6. **DO NOT** forget to call `demo.queue()` before `demo.launch()` in Gradio

## Environment Variables

Required in `.env` file:
- `OPENAI_API_KEY` - OpenAI API key for LLM and embeddings - **REQUIRED**

Optional (with defaults):
- `LLM_MODEL` - Language model (default: `gpt-4o-mini`)
- `LLM_BASE_URL` - API endpoint (set to `https://api.openai.com/v1` in `.env.example`)
- `EMBEDDING_MODEL` - Embedding model (default: `text-embedding-3-large`)
- `LANCEDB_URI` - Database path (default: `./lancedb`)
- `LANCEDB_TABLE` - Table name (default: `a11y_expert`)
- `LOG_LEVEL` - Logging verbosity (default: `INFO`)
- `SERVER_HOST` - Gradio host (default: `127.0.0.1`, use `0.0.0.0` for HF)
- `SERVER_PORT` - Gradio port (default: `7860`)

## Related Documentation

- `CLAUDE.md` - Detailed guidance for Claude Code (includes architectural details)
- `README.md` - User-facing documentation with setup instructions
- `HF_SPACES_GUIDE.md` - Hugging Face Spaces deployment guide
- `QUICK_REFERENCE.md` - Quick reference for common tasks
.gitignore CHANGED
@@ -44,6 +44,10 @@ qa_dataset.jsonl
 .env
 .env.local
 
+# Node.js (GitHub Copilot CLI)
+node_modules/
+package-lock.json
+
 # OS
 .DS_Store
 Thumbs.db
agent/a11y_agent.py CHANGED
@@ -255,13 +255,14 @@ def create_agent(language: Optional[str] = None) -> A11yExpertAgent:
     # Create vector store with lazy connection (no DB access yet)
     logger.info("Initializing vector store client...")
     vector_store = VectorStoreClient(uri=settings.lancedb_uri)
 
-    github_token = settings.github_token
+    api_key = settings.openai_api_key
 
     logger.info("Initializing OpenAI client...")
-    client_args = {"api_key": github_token}
+    client_args = {"api_key": api_key}
     if settings.llm_base_url:
         client_args["base_url"] = settings.llm_base_url
+
     llm_client = OpenAI(**client_args)
 
     logger.info("Creating A11yExpertAgent instance...")
config.py CHANGED
@@ -16,11 +16,11 @@ class Settings(BaseSettings):
     """
     Application settings loaded from environment variables or .env file.
 
-    All settings have sensible defaults except for the GitHub token,
-    which must be provided via the GITHUB_TOKEN environment variable.
+    All settings have sensible defaults except for the OpenAI API key,
+    which must be provided via the OPENAI_API_KEY environment variable.
 
     Attributes:
-        github_token: GitHub token - used as API key for OpenAI-compatible endpoints (required)
+        openai_api_key: OpenAI API key (required)
         llm_model: Language model to use for chat completions
         llm_base_url: Base URL for OpenAI API (supports GitHub Models)
         embedding_model: Model to use for text embeddings
@@ -39,15 +39,15 @@ class Settings(BaseSettings):
     """
 
     # API Configuration (required)
-    github_token: str = Field(
+    openai_api_key: str = Field(
         default="",
-        description="GitHub token - required for LLM and embeddings",
-        validation_alias="GITHUB_TOKEN"
+        description="OpenAI API key - required for LLM and embeddings",
+        validation_alias="OPENAI_API_KEY"
     )
 
     # LLM Configuration
     llm_model: str = Field(
-        default="gpt-4o",
+        default="gpt-4o-mini",
         description="Language model for chat completions"
     )
     llm_base_url: Optional[str] = Field(
@@ -107,17 +107,17 @@ class Settings(BaseSettings):
         description="Public URL for social media sharing"
     )
 
-    @field_validator("github_token")
+    @field_validator("openai_api_key")
     @classmethod
-    def validate_github_token(cls, v):
-        """Ensure GitHub token is provided and not empty."""
+    def validate_api_key(cls, v):
+        """Ensure API key is provided and not empty."""
         v = v or ""
         v = v.strip()
         if not v:
             import os
             if not os.getenv("SPACE_ID"):
                 raise ValueError(
-                    "GITHUB_TOKEN is required. "
+                    "OPENAI_API_KEY is required. "
                     "Set it in your .env file or environment variables."
                 )
         return v
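The validator's behaviour can be reproduced in isolation like this - a simplified stand-alone sketch of the same logic, not the Pydantic class itself:

```python
import os
from typing import Optional

def validate_api_key(v: Optional[str]) -> str:
    """Mirror of the field_validator in config.py: an empty key fails fast
    locally, but is tolerated on HF Spaces (SPACE_ID set), where the key
    arrives as a Spaces Secret instead."""
    v = (v or "").strip()
    if not v and not os.getenv("SPACE_ID"):
        raise ValueError(
            "OPENAI_API_KEY is required. "
            "Set it in your .env file or environment variables."
        )
    return v
```

The `SPACE_ID` escape hatch lets `test_startup.py` and HF Spaces builds construct `Settings` before the secret is injected.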
models/embeddings.py CHANGED
@@ -60,7 +60,7 @@ class EmbeddingsClient:
 
         logger.info(f"Initializing EmbeddingsClient with {self.settings.llm_base_url}")
         self.client = OpenAI(
-            api_key=self.settings.github_token,
+            api_key=self.settings.openai_api_key,
             base_url=self.settings.llm_base_url
         )
 
test_startup.py CHANGED
@@ -32,7 +32,7 @@ def test_config():
     from config import get_settings
     import os
 
-    os.environ.setdefault("GITHUB_TOKEN", "test-token-for-validation")
+    os.environ.setdefault("OPENAI_API_KEY", "test-key-for-validation")
 
     settings = get_settings()
     logger.info(f"LLM Model: {settings.llm_model}")
@@ -52,7 +52,7 @@ def test_vector_store():
     from database.vector_store_client import VectorStoreClient
     import os
 
-    os.environ.setdefault("GITHUB_TOKEN", "test-token-for-validation")
+    os.environ.setdefault("OPENAI_API_KEY", "test-key-for-validation")
 
     settings = get_settings()
     client = VectorStoreClient(uri=settings.lancedb_uri)
@@ -90,7 +90,7 @@ def test_agent():
     from agent.a11y_agent import create_agent
     import os
 
-    os.environ.setdefault("GITHUB_TOKEN", "test-token-for-validation")
+    os.environ.setdefault("OPENAI_API_KEY", "test-key-for-validation")
 
     agent = create_agent()
     logger.info(f"Agent language: {agent.language}")