Jacek Zadrożny committed
Commit f2986d3 · Parent: deaaf9d

Revert to OPENAI_API_KEY and switch to gpt-4o-mini
Changes:
- Changed configuration from GITHUB_TOKEN back to OPENAI_API_KEY
- Switched LLM model from gpt-4o to gpt-4o-mini (15x cheaper, faster)
- Updated all code references in config.py, agent, models, and tests
- Added .github/copilot-instructions.md for better AI assistance
- Updated .gitignore to exclude node_modules and npm files

.env.example CHANGED
@@ -1,8 +1,8 @@
-# GitHub Token Configuration (Required)
-GITHUB_TOKEN=your_token_here
+# OpenAI API Configuration (Required)
+OPENAI_API_KEY=your_api_key_here
 
 # LLM Configuration
-LLM_MODEL=gpt-4o
+LLM_MODEL=gpt-4o-mini
 LLM_BASE_URL=https://api.openai.com/v1
 
 # Embeddings Configuration
.github/copilot-instructions.md ADDED
@@ -0,0 +1,232 @@
# Copilot Instructions for Jacek AI

This file provides guidance for GitHub Copilot when working with the Jacek AI codebase - a bilingual (Polish/English) accessibility chatbot using RAG with LanceDB and OpenAI `gpt-4o-mini`.

## Build, Test, and Run Commands

### Running the Application
```bash
# Local development - starts Gradio UI at http://127.0.0.1:7860
python app.py

# Run all startup tests before deployment
python test_startup.py
```

### Environment Setup
```bash
# Install dependencies
pip install -r requirements.txt

# Configure environment (required before first run)
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

### Database Management
```bash
# Compact LanceDB (removes version history, reduces file count)
python compact_database.py

# Check document count
python -c "import lancedb; db = lancedb.connect('./lancedb'); print(len(db.open_table('a11y_expert')))"
```

### Testing
```bash
# Run full test suite (imports, config, vector store, embeddings, agent)
python test_startup.py

# All tests must pass before deploying to Hugging Face Spaces
```

## Architecture Overview

### Core Components

**Agent System** (`agent/`)
- `a11y_agent.py`: Main `A11yExpertAgent` class with streaming responses via OpenAI
- `prompts.py`: Language-specific system prompts (Polish/English) with **strict language enforcement**
- `tools.py`: RAG tools for knowledge base search (top-5 semantic results)

**Vector Store** (`database/`)
- `vector_store_client.py`: LanceDB client with lazy loading and automatic reconnection
- Database path: `./lancedb/a11y_expert.lance` (tracked with Git LFS)
- **READ-ONLY in production** (Hugging Face Spaces environment)

**Embeddings** (`models/`)
- `embeddings.py`: OpenAI embeddings client with disk caching (`./cache/embeddings`) and retry logic
- Model: `text-embedding-3-large` (3072 dimensions)
- Singleton pattern: use `get_embeddings_client()` for shared instance

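The disk cache described above can be sketched roughly as follows. This is an illustrative stand-in, not the actual `EmbeddingsClient` API: `cached_embedding`, the `compute` callback, and the JSON-per-file layout are assumptions for the example.

```python
import hashlib
import json
import pathlib

def cached_embedding(text: str, compute, cache_dir: pathlib.Path) -> list[float]:
    """Return an embedding for `text`, reusing a file on disk if present.

    `compute` stands in for the OpenAI embeddings API call; in the real
    client the cache lives under ./cache/embeddings.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())  # cache hit: no API call
    vector = compute(text)                   # cache miss: compute and store
    path.write_text(json.dumps(vector))
    return vector
```

The key point is that repeated queries never pay for a second embeddings call.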
**UI** (`app.py`)
- Gradio ChatInterface with two-column layout (chat + notes from `notes.md`)
- **Lazy agent initialization** - agent loads on first user query, not at startup
- Streaming responses for better UX

**Configuration** (`config.py`)
- Pydantic settings with environment variable support
- All config loaded from `.env` file (never hardcode secrets)
- Required: `OPENAI_API_KEY` (OpenAI API key for LLM and embeddings)

### Data Flow (RAG Pipeline)

1. User asks question in Gradio UI
2. Language detected from query using `langdetect` (Polish or English)
3. Query embedded using OpenAI embeddings API (with cache lookup)
4. Vector search in LanceDB (filtered by language: `where="language = 'pl'"` or `'en'`)
5. Top 5 results formatted as context
6. Context + query + language-specific system prompt sent to the LLM (`gpt-4o-mini`)
7. Response streamed back to UI token-by-token

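The steps above can be sketched end-to-end with toy stand-ins. Everything here is illustrative: the function names, the diacritics-based language check (standing in for `langdetect`), the dot-product search (standing in for LanceDB), and the paraphrased system prompts are assumptions for the sketch, not the real implementation.

```python
def detect_language(query: str) -> str:
    # Stand-in for langdetect: Polish diacritics as a crude signal.
    return "pl" if any(c in "ąćęłńóśźż" for c in query.lower()) else "en"

def embed(text: str) -> list[float]:
    # Stand-in for the OpenAI embeddings call (with cache lookup).
    return [float(ord(c) % 7) for c in text[:8]]

def vector_search(vector, language, docs, top_k=5):
    # Stand-in for LanceDB search with a `where="language = '...'"` filter.
    candidates = [d for d in docs if d["language"] == language]
    scored = sorted(
        candidates,
        key=lambda d: sum(a * b for a, b in zip(embed(d["text"]), vector)),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, results, language):
    # Top results become context; prompt text paraphrases the real prompts.
    context = "\n".join(d["text"] for d in results)
    system = ("Answer ONLY in Polish." if language == "pl"
              else "Answer ONLY in English.")
    return f"{system}\n\nContext:\n{context}\n\nQuestion: {query}"

docs = [
    {"text": "Use alt text for images.", "language": "en"},
    {"text": "Używaj tekstu alternatywnego dla obrazów.", "language": "pl"},
]
query = "Jak poprawić dostępność obrazów?"
lang = detect_language(query)
prompt = build_prompt(query, vector_search(embed(query), lang, docs), lang)
```

Note how the detected language drives both the search filter and the system prompt, so a Polish query never sees English-only context.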
### Key Design Patterns

- **Lazy Initialization**: Agent and database connections initialize on first use, not at startup (faster deployment)
- **Singleton Pattern**: `get_embeddings_client()` returns a shared instance across the app
- **Language Detection**: Auto-detects the query language and adjusts both the prompt and the vector search filter
- **Stateless Agent**: No internal conversation history (Gradio handles history in the UI)
- **Conversation Context**: Last 4 messages kept in context for follow-up questions

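The first two patterns can be sketched in a few lines; the names here are illustrative (the real code lives in `app.py`, `config.py`, and `models/embeddings.py`):

```python
from functools import lru_cache

class Agent:
    """Toy stand-in for A11yExpertAgent (pretend __init__ is expensive)."""
    def __init__(self) -> None:
        self.ready = True

_agent = None

def get_agent() -> Agent:
    # Lazy initialization: nothing is built at import time; the first
    # call pays the startup cost, later calls reuse the instance.
    global _agent
    if _agent is None:
        _agent = Agent()
    return _agent

@lru_cache(maxsize=1)
def get_shared_settings() -> dict:
    # Singleton via lru_cache - the same idea behind get_settings()
    # and get_embeddings_client().
    return {"llm_model": "gpt-4o-mini"}
```

Because module import builds nothing, the Gradio app starts fast and the first query absorbs the initialization cost.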
## Key Conventions

### Language Handling - CRITICAL

The agent has **strict language enforcement** in system prompts:
- Polish queries get `SYSTEM_PROMPT_PL` with "CRITICAL: Answer ONLY in Polish"
- English queries get `SYSTEM_PROMPT_EN` with "CRITICAL: Answer ONLY in English"
- System prompts explicitly instruct the LLM to translate sources if needed
- Vector search is language-filtered: `where="language = 'pl'"` or `where="language = 'en'"`

**When modifying prompts**: Never remove or weaken the language enforcement instructions - they prevent language mixing, which confuses users.

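A minimal sketch of keeping the prompt and the search filter in lockstep; the constant values paraphrase the real prompts in `agent/prompts.py`, and `prompt_and_filter` is an illustrative helper, not the actual API:

```python
SYSTEM_PROMPT_PL = "CRITICAL: Answer ONLY in Polish. Translate sources if needed."
SYSTEM_PROMPT_EN = "CRITICAL: Answer ONLY in English. Translate sources if needed."

def prompt_and_filter(language: str) -> tuple[str, str]:
    """Pick the system prompt and the LanceDB where-clause for one language."""
    if language == "pl":
        return SYSTEM_PROMPT_PL, "language = 'pl'"
    return SYSTEM_PROMPT_EN, "language = 'en'"
```

Deriving both values from a single language decision makes it harder for the prompt and the filter to drift apart.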
### LanceDB Database - READ-ONLY in Production

- Database at `./lancedb/` is tracked with Git LFS (not generated at runtime)
- In Hugging Face Spaces: the database is read-only (the filesystem is immutable)
- For local development: use `VectorStoreClient.add_documents()` to add data
- After local changes: run `compact_database.py` to reduce the file count before committing
- Schema: `text`, `vector`, `source`, `language`, `doc_type`, `created_at`, `updated_at`

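One hedged way to keep write paths local-only is to gate them on the `SPACE_ID` environment variable that Hugging Face Spaces sets (the same signal `config.py` checks); `guard_writable` is an illustrative helper, not part of the codebase:

```python
import os

def guard_writable() -> None:
    """Raise before any add_documents()/compaction call in production.

    HF Spaces sets SPACE_ID and the filesystem is immutable, so any
    LanceDB write there would fail anyway - fail fast with a clear message.
    """
    if os.getenv("SPACE_ID"):
        raise RuntimeError("LanceDB is READ-ONLY in production (HF Spaces).")
```

Calling this at the top of local-only scripts turns a confusing filesystem error into an explicit one.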
### Configuration Loading

All settings in `config.py` are loaded from environment variables:
```python
from config import get_settings

settings = get_settings()  # Singleton, cached
print(settings.llm_model)  # gpt-4o-mini (default)
```

Never access environment variables directly - always use `get_settings()`.

### Hugging Face Spaces Deployment

**Critical deployment requirements**:
1. `demo.queue()` must be called explicitly (see `app.py:238-243`)
2. Do **NOT** use `atexit.register()` for cleanup (causes premature shutdown)
3. LanceDB must be committed with Git LFS (database is read-only in HF)
4. API key stored as HF Spaces Secret: `OPENAI_API_KEY`
5. The `if __name__ == "__main__"` block handles both local and HF deployments

**Testing before deployment**:
```bash
python test_startup.py  # All tests must pass
```

### Logging

Use loguru for all logging (already configured):
```python
from loguru import logger

logger.info("Starting process...")
logger.success("✅ Completed successfully")
logger.error(f"❌ Failed: {error}")
```

Set `LOG_LEVEL=DEBUG` in `.env` for verbose output during development.

### Error Handling

- Always close resources in agent/client classes (implement a `close()` method)
- Use try/except with specific exception types
- Log the full traceback for debugging: `logger.error(traceback.format_exc())`
- For user-facing errors, provide clear Polish/English messages depending on the detected language

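Putting those points together, a search wrapper might look like this sketch; stdlib `logging` stands in for loguru here, and `safe_search` plus the exact messages are illustrative assumptions:

```python
import logging
import traceback

logger = logging.getLogger("jacek_ai")  # loguru's logger in the real code

def safe_search(search, query: str, language: str):
    """Run a knowledge-base search; on failure, log the full traceback
    and return a user-facing message in the detected language."""
    try:
        return search(query)
    except ConnectionError:  # catch a specific type, not bare Exception
        logger.error(traceback.format_exc())
        return ("Przepraszam, wystąpił błąd. Spróbuj ponownie."
                if language == "pl"
                else "Sorry, something went wrong. Please try again.")
```

The traceback goes to the log for debugging while the user sees a short message in their own language.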
## Project Structure

```
JacekAI/
├── agent/                      # Core agent logic
│   ├── a11y_agent.py           # Main agent with RAG
│   ├── prompts.py              # Language-specific prompts (PL/EN)
│   └── tools.py                # Knowledge base search tools
├── database/
│   └── vector_store_client.py  # LanceDB client
├── models/
│   └── embeddings.py           # OpenAI embeddings with caching
├── lancedb/                    # Vector database (Git LFS)
│   └── a11y_expert.lance/
├── cache/                      # Embeddings cache (gitignored)
├── app.py                      # Gradio UI with lazy initialization
├── config.py                   # Pydantic settings (environment variables)
├── test_startup.py             # Deployment readiness tests
├── compact_database.py         # Database compaction utility
├── requirements.txt            # Python dependencies
├── .env.example                # Environment template
└── notes.md                    # Optional notes displayed in UI sidebar
```

## Important Implementation Notes

### When Adding New Features to the Agent

1. Modifying prompts → edit `agent/prompts.py`
2. Adding new tools → add a function to `agent/tools.py`
3. Changing RAG logic → modify `agent/a11y_agent.py`
4. Test locally with `python app.py` and interact through the UI

### When Updating Dependencies

1. Edit `requirements.txt`
2. Run `pip install -r requirements.txt`
3. Test with `python test_startup.py`
4. Commit changes and test in HF Spaces

### When Debugging

- Set `LOG_LEVEL=DEBUG` in `.env` for verbose logging
- Agent initialization happens on the first query (check logs for "A11yExpertAgent initialized")
- Embeddings cache is at `./cache/embeddings` (create the directory if missing)
- Vector search logs show the context retrieved from the database

## Common Pitfalls

1. **DO NOT** modify the database in production (LanceDB is read-only on HF Spaces)
2. **DO NOT** use `atexit.register()` in `app.py` (breaks HF Spaces deployment)
3. **DO NOT** weaken language enforcement in prompts (causes confusing mixed-language responses)
4. **DO NOT** access `os.environ` directly - always use `get_settings()`
5. **DO NOT** initialize the agent at module level - use the lazy initialization pattern
6. **DO NOT** forget to call `demo.queue()` before `demo.launch()` in Gradio

## Environment Variables

Required in `.env` file:
- `OPENAI_API_KEY` - OpenAI API key for LLM and embeddings - **REQUIRED**

Optional (with defaults):
- `LLM_MODEL` - Language model (default: `gpt-4o-mini`)
- `LLM_BASE_URL` - API endpoint (set to `https://api.openai.com/v1` in `.env.example`)
- `EMBEDDING_MODEL` - Embedding model (default: `text-embedding-3-large`)
- `LANCEDB_URI` - Database path (default: `./lancedb`)
- `LANCEDB_TABLE` - Table name (default: `a11y_expert`)
- `LOG_LEVEL` - Logging verbosity (default: `INFO`)
- `SERVER_HOST` - Gradio host (default: `127.0.0.1`, use `0.0.0.0` for HF)
- `SERVER_PORT` - Gradio port (default: `7860`)

## Related Documentation

- `CLAUDE.md` - Detailed guidance for Claude Code (includes architectural details)
- `README.md` - User-facing documentation with setup instructions
- `HF_SPACES_GUIDE.md` - Hugging Face Spaces deployment guide
- `QUICK_REFERENCE.md` - Quick reference for common tasks
.gitignore CHANGED
@@ -44,6 +44,10 @@ qa_dataset.jsonl
 .env
 .env.local
 
+# Node.js (GitHub Copilot CLI)
+node_modules/
+package-lock.json
+
 # OS
 .DS_Store
 Thumbs.db
agent/a11y_agent.py CHANGED
@@ -255,13 +255,14 @@ def create_agent(language: Optional[str] = None) -> A11yExpertAgent:
     # Create vector store with lazy connection (no DB access yet)
     logger.info("Initializing vector store client...")
     vector_store = VectorStoreClient(uri=settings.lancedb_uri)
 
-    github_token = settings.github_token
+    api_key = settings.openai_api_key
 
     logger.info("Initializing OpenAI client...")
-    client_args = {"api_key": github_token}
+    client_args = {"api_key": api_key}
     if settings.llm_base_url:
         client_args["base_url"] = settings.llm_base_url
+
     llm_client = OpenAI(**client_args)
 
     logger.info("Creating A11yExpertAgent instance...")
config.py CHANGED
@@ -16,11 +16,11 @@ class Settings(BaseSettings):
     """
     Application settings loaded from environment variables or .env file.
 
-    All settings have sensible defaults except for the GitHub token,
-    which must be provided via the GITHUB_TOKEN environment variable.
+    All settings have sensible defaults except for the OpenAI API key,
+    which must be provided via the OPENAI_API_KEY environment variable.
 
     Attributes:
-        github_token: GitHub token - used as API key for OpenAI-compatible endpoints (required)
+        openai_api_key: OpenAI API key (required)
         llm_model: Language model to use for chat completions
         llm_base_url: Base URL for OpenAI API (supports GitHub Models)
         embedding_model: Model to use for text embeddings
@@ -39,15 +39,15 @@ class Settings(BaseSettings):
     """
 
     # API Configuration (required)
-    github_token: str = Field(
+    openai_api_key: str = Field(
         default="",
-        description="GitHub token - required for LLM and embeddings",
-        validation_alias="GITHUB_TOKEN"
+        description="OpenAI API key - required for LLM and embeddings",
+        validation_alias="OPENAI_API_KEY"
     )
 
     # LLM Configuration
     llm_model: str = Field(
-        default="gpt-4o",
+        default="gpt-4o-mini",
         description="Language model for chat completions"
     )
     llm_base_url: Optional[str] = Field(
@@ -107,17 +107,17 @@ class Settings(BaseSettings):
         description="Public URL for social media sharing"
     )
 
-    @field_validator("github_token")
+    @field_validator("openai_api_key")
     @classmethod
-    def validate_github_token(cls, v):
-        """Ensure GitHub token is provided and not empty."""
+    def validate_api_key(cls, v):
+        """Ensure API key is provided and not empty."""
         v = v or ""
         v = v.strip()
         if not v:
             import os
             if not os.getenv("SPACE_ID"):
                 raise ValueError(
-                    "GITHUB_TOKEN is required. "
+                    "OPENAI_API_KEY is required. "
                     "Set it in your .env file or environment variables."
                 )
         return v
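The validator's behaviour can be reproduced in isolation like this - a simplified stand-alone sketch of the same logic, not the Pydantic class itself:

```python
import os
from typing import Optional

def validate_api_key(v: Optional[str]) -> str:
    """Mirror of the field_validator in config.py: an empty key fails fast
    locally, but is tolerated on HF Spaces (SPACE_ID set), where the key
    arrives as a Spaces Secret instead."""
    v = (v or "").strip()
    if not v and not os.getenv("SPACE_ID"):
        raise ValueError(
            "OPENAI_API_KEY is required. "
            "Set it in your .env file or environment variables."
        )
    return v
```

The `SPACE_ID` escape hatch lets `test_startup.py` and HF Spaces builds construct `Settings` before the secret is injected.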
models/embeddings.py CHANGED
@@ -60,7 +60,7 @@ class EmbeddingsClient:
 
         logger.info(f"Initializing EmbeddingsClient with {self.settings.llm_base_url}")
         self.client = OpenAI(
-            api_key=self.settings.github_token,
+            api_key=self.settings.openai_api_key,
             base_url=self.settings.llm_base_url
         )
 
test_startup.py CHANGED
@@ -32,7 +32,7 @@ def test_config():
     from config import get_settings
     import os
 
-    os.environ.setdefault("GITHUB_TOKEN", "test-token-for-validation")
+    os.environ.setdefault("OPENAI_API_KEY", "test-key-for-validation")
 
     settings = get_settings()
     logger.info(f"LLM Model: {settings.llm_model}")
@@ -52,7 +52,7 @@ def test_vector_store():
     from database.vector_store_client import VectorStoreClient
     import os
 
-    os.environ.setdefault("GITHUB_TOKEN", "test-token-for-validation")
+    os.environ.setdefault("OPENAI_API_KEY", "test-key-for-validation")
 
     settings = get_settings()
     client = VectorStoreClient(uri=settings.lancedb_uri)
@@ -90,7 +90,7 @@ def test_agent():
     from agent.a11y_agent import create_agent
     import os
 
-    os.environ.setdefault("GITHUB_TOKEN", "test-token-for-validation")
+    os.environ.setdefault("OPENAI_API_KEY", "test-key-for-validation")
 
     agent = create_agent()
     logger.info(f"Agent language: {agent.language}")