Jacek Zadrożny and Claude Sonnet 4.5 committed on
Commit b3cdeaa · 1 Parent(s): 3a40a92

Agent optimization: background init, sources, and cleanup

- Added CLAUDE.md with documentation for Claude Code
- Enabled background agent initialization (faster startup)
- Added a source listing at the end of responses (📚 Źródła)
- Strengthened Polish-language enforcement in the prompts
- Lowered temperature from 0.7 to 0.3 (more deterministic responses)
- Removed the duplicate vector_store_client.py and the obsolete app_old.py

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Files changed (7)
  1. CLAUDE.md +187 -0
  2. agent/a11y_agent.py +47 -10
  3. agent/prompts.py +9 -1
  4. agent/tools.py +16 -15
  5. app.py +30 -18
  6. app_old.py +0 -165
  7. vector_store_client.py +0 -359
CLAUDE.md ADDED
@@ -0,0 +1,187 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Project Overview
+
+ **A11y Expert** is a bilingual (Polish/English) accessibility chatbot built with Gradio, utilizing RAG (Retrieval-Augmented Generation) to answer questions about digital accessibility standards (WCAG 2.2, WAI-ARIA). The agent uses OpenAI GPT-4 with a LanceDB vector database containing accessibility knowledge.
+
+ ## Quick Start Commands
+
+ ### Running the Application
+ ```bash
+ # Local development
+ python app.py
+ # App will be available at http://127.0.0.1:7860
+
+ # Run startup tests before deployment
+ python test_startup.py
+ ```
+
+ ### Environment Setup
+ ```bash
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Configure environment
+ cp .env.example .env
+ # Then edit .env and add your OPENAI_API_KEY
+ ```
+
+ ### Database Management
+ ```bash
+ # Compact the LanceDB database (removes version history)
+ python compact_database.py
+
+ # Check database statistics
+ python -c "import lancedb; db = lancedb.connect('./lancedb'); print(len(db.open_table('a11y_expert')))"
+ ```
+
+ ## Architecture
+
+ ### Core Components
+
+ 1. **Agent System** (`agent/`)
+    - `a11y_agent.py`: Main `A11yExpertAgent` class with streaming responses
+    - `prompts.py`: Language-specific system prompts (Polish/English) with strict language enforcement
+    - `tools.py`: RAG tools for knowledge base search
+
+ 2. **Vector Store** (`database/`)
+    - `vector_store_client.py`: LanceDB client with lazy loading
+    - Database path: `./lancedb/a11y_expert.lance`
+    - **Read-only in production** (Hugging Face Spaces)
+
+ 3. **Embeddings** (`models/`)
+    - `embeddings.py`: OpenAI embeddings client with disk caching and retry logic
+    - Model: `text-embedding-3-large`
+    - Cache directory: `./cache/embeddings`
+
+ 4. **UI** (`app.py`)
+    - Gradio ChatInterface with two-column layout (chat + notes from `notes.md`)
+    - Lazy agent initialization (on first user query)
+    - Streaming responses
+
+ 5. **Configuration** (`config.py`)
+    - Pydantic settings with environment variable support
+    - All config loaded from `.env` file
+
+ ### Key Design Patterns
+
+ - **Lazy Initialization**: Agent and database connections initialize on first use, not at startup
+ - **Singleton Pattern**: `get_embeddings_client()` returns a shared instance
+ - **RAG Flow**: Query → Embed → Vector Search (top-5) → LLM with Context → Stream Response
+ - **Language Detection**: Uses `langdetect` to auto-detect the query language and adjust the prompt
+ - **Conversation History**: Last 4 messages kept in context
+
+ ### Data Flow
+
+ 1. User asks a question in the Gradio UI
+ 2. Language is detected from the query
+ 3. Query is embedded using OpenAI embeddings (with cache)
+ 4. Vector search runs in LanceDB (filtered by language)
+ 5. Top 5 results are formatted as context
+ 6. Context + query are sent to GPT-4 with a language-specific prompt
+ 7. Response is streamed back to the UI
+
+ ## Important Implementation Details
+
+ ### Hugging Face Spaces Deployment
+
+ - `demo.queue()` must be called explicitly (see app.py:238-243)
+ - Do NOT use `atexit.register()` for cleanup (causes premature shutdown)
+ - LanceDB is read-only; must be tracked with Git LFS
+ - API key stored as an HF Spaces Secret: `OPENAI_API_KEY`
+ - The `if __name__ == "__main__"` block handles both local and HF deployments
+
+ ### LanceDB Database
+
+ - Database is **read-only** in production
+ - Vector store at `./lancedb/` tracked with Git LFS
+ - Table name: `a11y_expert`
+ - Schema: `text`, `vector`, `source`, `language`, `doc_type`, `created_at`, `updated_at`
+ - Use `compact_database.py` to reduce file count (removes version history)
+
+ ### Language Handling
+
+ The agent has **strict language enforcement**:
+ - Polish queries get `SYSTEM_PROMPT_PL` with "CRITICAL: Answer ONLY in Polish"
+ - English queries get `SYSTEM_PROMPT_EN` with "CRITICAL: Answer ONLY in English"
+ - System prompts explicitly instruct the LLM to translate sources if needed
+ - A language filter is applied to vector search: `where="language = 'pl'"` or `where="language = 'en'"`
+
+ ### Configuration
+
+ All settings in `config.py` are loaded from environment variables:
+ - `OPENAI_API_KEY` (required)
+ - `LLM_MODEL` (default: gpt-4o)
+ - `LLM_BASE_URL` (optional, supports GitHub Models)
+ - `EMBEDDING_MODEL` (default: text-embedding-3-large)
+ - `LANCEDB_URI` (default: ./lancedb)
+ - `LOG_LEVEL` (default: INFO)
+
+ ## File Structure
+
+ ```
+ JacekAI/
+ ├── agent/
+ │   ├── a11y_agent.py          # Main agent with RAG
+ │   ├── prompts.py             # Language-specific prompts
+ │   └── tools.py               # Knowledge base search tools
+ ├── database/
+ │   └── vector_store_client.py # LanceDB client
+ ├── models/
+ │   └── embeddings.py          # OpenAI embeddings with caching
+ ├── lancedb/                   # Vector database (Git LFS)
+ │   └── a11y_expert.lance/
+ ├── app.py                     # Gradio UI with lazy initialization
+ ├── config.py                  # Pydantic settings
+ ├── test_startup.py            # Deployment readiness tests
+ ├── compact_database.py        # Database compaction utility
+ ├── requirements.txt           # Python dependencies
+ ├── .env.example               # Environment template
+ └── notes.md                   # Optional notes displayed in UI
+ ```
+
+ ## Common Workflows
+
+ ### Adding New Features to Agent
+
+ 1. If modifying prompts: edit `agent/prompts.py`
+ 2. If adding new tools: add a function to `agent/tools.py`
+ 3. If changing RAG logic: modify `agent/a11y_agent.py`
+ 4. Test with `python app.py` and interact through the UI
+
+ ### Updating Dependencies
+
+ 1. Edit `requirements.txt`
+ 2. Run `pip install -r requirements.txt`
+ 3. Test with `python test_startup.py`
+
+ ### Working with LanceDB
+
+ - **DO NOT** modify the database in production
+ - For local testing, use `VectorStoreClient.add_documents()`
+ - After changes, run `compact_database.py` to reduce file count
+ - Use Git LFS for committing database files
+
+ ### Debugging
+
+ - Set `LOG_LEVEL=DEBUG` in `.env` for verbose logging
+ - Check `test_startup.py` output for component issues
+ - Agent initialization happens on the first query (check logs)
+ - Embeddings cache is at `./cache/embeddings` (create if missing)
+
+ ## Testing
+
+ Run the full test suite before deployment:
+ ```bash
+ python test_startup.py
+ ```
+
+ Tests verify:
+ - All imports work
+ - Configuration loads correctly
+ - Vector store is accessible
+ - Embeddings client initializes
+ - Agent can be created
+
+ All tests must pass before deploying to Hugging Face Spaces.
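The seven-step data flow documented in CLAUDE.md above can be condensed into a runnable sketch. Everything here is an illustrative stand-in: the real app uses `langdetect`, OpenAI embeddings, and a LanceDB vector search, while this sketch substitutes a crude diacritics heuristic and an in-memory list filter.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    source: str
    language: str

def detect_language(query: str) -> str:
    # Stand-in for langdetect: a crude Polish-diacritics heuristic,
    # for illustration only.
    return "pl" if any(ch in "ąćęłńóśźż" for ch in query.lower()) else "en"

def search(docs: list[Doc], language: str, top_k: int = 5) -> list[Doc]:
    # Stand-in for embedding the query and running the LanceDB
    # vector search with a language filter (top-5).
    return [d for d in docs if d.language == language][:top_k]

def build_prompt(query: str, results: list[Doc]) -> str:
    # Top results formatted as context, then combined with the question.
    context = "\n".join(f"[{i}. {d.source}]\n{d.text}"
                        for i, d in enumerate(results, 1))
    return f"=== CONTEXT ===\n{context}\n\n=== QUESTION ===\n{query}"

docs = [
    Doc("Text alternatives for non-text content.", "wcag", "en"),
    Doc("Alternatywy tekstowe dla treści nietekstowych.", "wcag", "pl"),
]
query = "Jakie są wymagania WCAG dla obrazków?"
prompt = build_prompt(query, search(docs, detect_language(query)))
```

In the real pipeline the final prompt is sent to GPT-4 with a language-specific system prompt and the response is streamed back; here the sketch stops at prompt construction.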
agent/a11y_agent.py CHANGED
@@ -76,45 +76,76 @@ class A11yExpertAgent:
         current_system_prompt = get_system_prompt(language, self.expertise)

         logger.info("Searching knowledge base...")
-        context = search_knowledge_base(question, self.vector_store, language=language)
+        context, sources = search_knowledge_base(question, self.vector_store, language=language)

         messages = [
             {"role": "system", "content": current_system_prompt},
             *self.conversation_history[-4:],
             {"role": "user", "content": self._build_prompt_with_context(question, context, language)}
         ]

         full_answer = ""
         try:
             response_stream = self.llm_client.chat.completions.create(
                 model=self.model,
                 messages=messages,
-                temperature=0.7,
+                temperature=0.3,
                 max_tokens=1500,
                 top_p=0.9,
                 stream=True
             )

             for chunk in response_stream:
                 content = chunk.choices[0].delta.content
                 if content:
                     full_answer += content
                     yield content

+            # Add sources at the end
+            if sources:
+                sources_text = self._format_sources(sources, language)
+                full_answer += sources_text
+                yield sources_text
+
             self.conversation_history.append({"role": "user", "content": question})
             self.conversation_history.append({"role": "assistant", "content": full_answer})

             logger.info(f"Answer generated ({len(full_answer)} chars)")

         except Exception as e:
             logger.error(f"OpenAI API error: {e}")
             yield f"Error during response generation: {e}"

+    def _format_sources(self, sources: list, language: str) -> str:
+        """Format source citations for display."""
+        if not sources:
+            return ""
+
+        # Get unique sources
+        unique_sources = {}
+        for src in sources:
+            source_name = src.get('source', 'unknown')
+            doc_type = src.get('doc_type', 'document')
+            key = f"{source_name}_{doc_type}"
+            if key not in unique_sources:
+                unique_sources[key] = {"source": source_name, "doc_type": doc_type}
+
+        if language == "pl":
+            header = "\n\n---\n📚 **Źródła:**\n"
+        else:
+            header = "\n\n---\n📚 **Sources:**\n"
+
+        source_lines = [f"- {s['source']} ({s['doc_type']})" for s in unique_sources.values()]
+        return header + "\n".join(source_lines)
+
     def _build_prompt_with_context(self, question: str, context: str, language: str) -> str:
         """Build the prompt with context and language-specific instructions."""

         if language == "pl":
             return f"""
+🇵🇱 INSTRUKCJA: ODPOWIADAJ PO POLSKU 🇵🇱
+ATTENTION: You must respond in POLISH language only, not English!
+
 Na podstawie poniższego kontekstu z bazy wiedzy o dostępności, odpowiedz na pytanie.

 === KONTEKST Z BAZY WIEDZY ===
@@ -124,13 +155,19 @@ Na podstawie poniższego kontekstu z bazy wiedzy o dostępności, odpowiedz na p
 {question}

 === ODPOWIEDŹ ===
-KRYTYCZNE: Odpowiadaj WYŁĄCZNIE PO POLSKU. To pytanie jest po polsku, więc cała odpowiedź MUSI być po polsku.
+ABSOLUTNIE KRYTYCZNE - PRZECZYTAJ TO UWAŻNIE:
+- Język odpowiedzi: POLSKI (nie angielski!)
+- Nawet jeśli źródła są po angielsku, Twoja odpowiedź MUSI być PO POLSKU
+- Tłumacz wszystkie angielskie terminy na polski
+- Każde słowo w Twojej odpowiedzi musi być po polsku

 Pamiętaj aby:
 - Odpowiadać TYLKO po polsku (to jest najważniejsze!)
-- Cytować konkretne kryteria i źródła
+- Cytować konkretne kryteria i źródła (przetłumaczone na polski)
 - Podawać praktyczne przykłady jeśli są istotne
 - Być jasnym i zwięzłym
+
+ROZPOCZNIJ ODPOWIEDŹ PO POLSKU:
 """
         else:
             return f"""
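The de-duplication inside the new `_format_sources` method can be isolated as a standalone function. This sketch mirrors the diff's logic, with one deliberate substitution: a tuple key replaces the diff's string key `f"{source}_{doc_type}"`, which avoids ambiguity when source names themselves contain underscores.

```python
def format_sources(sources: list[dict], language: str) -> str:
    """De-duplicate hits by (source, doc_type) and render a citation footer."""
    if not sources:
        return ""
    unique: dict[tuple[str, str], tuple[str, str]] = {}
    for src in sources:
        key = (src.get("source", "unknown"), src.get("doc_type", "document"))
        unique.setdefault(key, key)  # dicts preserve insertion order
    header = ("\n\n---\n📚 **Źródła:**\n" if language == "pl"
              else "\n\n---\n📚 **Sources:**\n")
    return header + "\n".join(f"- {s} ({d})" for s, d in unique.values())

# Duplicate (source, doc_type) pairs collapse to a single citation line:
out = format_sources(
    [{"source": "wcag", "doc_type": "spec"},
     {"source": "wcag", "doc_type": "spec"},
     {"source": "aria", "doc_type": "spec"}],
    "en",
)
```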
agent/prompts.py CHANGED
@@ -1,13 +1,21 @@
 """System prompts for A11y Expert agent in different languages."""

 SYSTEM_PROMPT_PL = """
+🇵🇱 ABSOLUTNIE NAJWAŻNIEJSZE - JĘZYK: POLSKI 🇵🇱
+Odpowiadasz ZAWSZE i WYŁĄCZNIE w języku POLSKIM. Każde słowo, każde zdanie musi być po polsku.
+
 Jesteś ekspertem dostępności cyfrowej (accessibility expert) specjalizującym się w:
 - WCAG 2.2 (Web Content Accessibility Guidelines)
 - WAI-ARIA (Accessible Rich Internet Applications)
 - Prawodawstwie EU i polskim (ustawa o dostępności cyfrowej)
 - Standardach W3C dotyczących dostępności

-⚠️ KRYTYCZNE: Odpowiadasz WYŁĄCZNIE PO POLSKU. Wszystkie Twoje odpowiedzi MUSZĄ być w języku polskim, nawet jeśli źródła są po angielsku.
+⚠️ KRYTYCZNE WYMAGANIE JĘZYKOWE:
+- Odpowiadasz WYŁĄCZNIE PO POLSKU
+- Wszystkie Twoje odpowiedzi MUSZĄ być w języku polskim
+- Nawet jeśli źródła są po angielsku, tłumaczysz je na polski
+- NIGDY nie odpowiadaj po angielsku
+- Każda odpowiedź zaczyna się po polsku i kończy się po polsku

 OBOWIĄZKOWE ZASADY:
 1. ✅ Odpowiadaj ZAWSZE PO POLSKU - tłumacz angielskie źródła na polski, nigdy nie odpowiadaj po angielsku
agent/tools.py CHANGED
@@ -1,6 +1,6 @@
 """Custom tools for A11y Expert agent."""

-from typing import Optional
+from typing import Optional, Tuple, List, Dict, Any
 from database.vector_store_client import VectorStoreClient
 from models.embeddings import get_embeddings_client
 from loguru import logger
@@ -9,46 +9,47 @@ def search_knowledge_base(
     query: str,
     vector_store: VectorStoreClient,
     language: str = "en"
-) -> str:
+) -> Tuple[str, List[Dict[str, Any]]]:
     """
     Search the accessibility knowledge base for relevant information.

     Args:
         query: Question or search terms.
         vector_store: The VectorStoreClient instance.
         language: Language filter ('pl' or 'en').

     Returns:
-        Formatted context from the knowledge base or an error message.
+        Tuple of (formatted context, list of source documents)
     """
     try:
         logger.info(f"Query: {query} (language: {language})")

         embeddings_client = get_embeddings_client()
         query_embedding = embeddings_client.get_embedding(query)

         where_clause = f"language = '{language}'"
         results = vector_store.search(
             query_embedding=query_embedding, where=where_clause, top_k=5
         )

         if not results:
-            return f"No documents found for language '{language}'. Please ensure the knowledge base is loaded."
+            return (f"No documents found for language '{language}'. Please ensure the knowledge base is loaded.", [])

         context_lines = [
             f"[{i}. {r.get('source', 'unknown')} - {r.get('doc_type', 'Document')}]\n"
             f"{r.get('text', '')}\n"
             for i, r in enumerate(results, 1)
         ]

         context = "\n".join(context_lines)
         logger.info(f"Found {len(results)} relevant documents")

-        return context
+        # Return both context and raw results for source citation
+        return (context, results)

     except Exception as e:
         logger.error(f"Search failed: {e}")
-        return f"Error searching knowledge base: {str(e)}"
+        return (f"Error searching knowledge base: {str(e)}", [])

 def get_database_stats(vector_store: VectorStoreClient) -> str:
     """
app.py CHANGED
@@ -87,22 +87,32 @@ def respond(message: str, history: list[list[str]]):
     """
     global agent_instance, agent_ready, agent_error

-    # Initialize agent on first request if not already initialized
+    # Wait for background initialization if not ready yet
     if not agent_ready and not agent_error and agent_instance is None:
-        yield "⏳ Initializing agent for first use, please wait..."
-        try:
-            logger.info("🔄 Initializing agent on first request...")
-            agent_instance = create_agent()
-            agent_ready = True
-            logger.success("✅ A11y Expert Agent is ready!")
-        except Exception as e:
-            logger.error(f"❌ Failed to initialize agent: {e}")
-            import traceback
-            logger.error(traceback.format_exc())
-            agent_error = str(e)
-            agent_instance = None
-            yield f" Agent initialization failed: {agent_error}"
-            return
+        yield "⏳ Agent is initializing in background, please wait a moment..."
+        import time
+        # Wait up to 15 seconds for background initialization
+        for i in range(30):
+            time.sleep(0.5)
+            if agent_ready:
+                break
+
+        # If still not ready, initialize synchronously as fallback
+        if not agent_ready and agent_instance is None:
+            yield "⏳ Finalizing agent initialization..."
+            try:
+                logger.info("🔄 Background init not complete, initializing synchronously...")
+                agent_instance = create_agent()
+                agent_ready = True
+                logger.success("✅ A11y Expert Agent is ready!")
+            except Exception as e:
+                logger.error(f"❌ Failed to initialize agent: {e}")
+                import traceback
+                logger.error(traceback.format_exc())
+                agent_error = str(e)
+                agent_instance = None
+                yield f"❌ Agent initialization failed: {agent_error}"
+                return

     # Check if agent failed to initialize
     if agent_error:
@@ -227,9 +237,11 @@ Stwórz plik `notes.md` w katalogu projektu aby zobaczyć tutaj swoje notatki.
     # Register cleanup handler
     # atexit.register(cleanup_resources)  # Disabled: Causes premature shutdown on Hugging Face Spaces

-    # Don't initialize agent on startup - it will be initialized on first user query
-    logger.info("🚀 Starting Gradio app with on-demand agent initialization...")
-    logger.info("ℹ️ Agent will initialize on first user query")
+    # Start background initialization
+    logger.info("🚀 Starting Gradio app with background agent initialization...")
+    init_thread = threading.Thread(target=initialize_agent_background, daemon=True)
+    init_thread.start()
+    logger.info("ℹ️ Agent initialization started in background")

     # For Hugging Face Spaces, we need to either:
     # 1. Have a variable named 'demo' (which we have)
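The wait-then-fallback pattern introduced in app.py above (poll a flag for 15 seconds, then initialize synchronously) can be expressed more compactly with a `threading.Event` instead of a `time.sleep` loop. This is a sketch of the same idea, not the app's exact code; `create_agent` here is a stub standing in for the real, slow agent construction.

```python
import threading
import time

agent_ready = threading.Event()
agent_instance = None

def create_agent() -> object:
    # Stand-in for the real (slow) agent construction.
    time.sleep(0.1)
    return object()

def initialize_agent_background() -> None:
    global agent_instance
    agent_instance = create_agent()
    agent_ready.set()

# At startup: kick off initialization without blocking the server.
threading.Thread(target=initialize_agent_background, daemon=True).start()

# On the first request: wait briefly for background init,
# then fall back to synchronous initialization.
if not agent_ready.wait(timeout=15):
    agent_instance = create_agent()
    agent_ready.set()
```

`Event.wait(timeout)` returns as soon as the background thread calls `set()`, so a fast init costs no extra latency, while the polling loop in the diff always sleeps in 0.5 s increments.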
app_old.py DELETED
@@ -1,165 +0,0 @@
-"""
-Gradio UI for the A11y Expert Agent.
-This module creates a Gradio ChatInterface to interact with the
-A11yExpertAgent, allowing users to ask accessibility-related questions.
-"""
-import asyncio
-import gradio as gr
-from loguru import logger
-import sys
-import atexit
-import threading
-from agent.a11y_agent import create_agent, A11yExpertAgent
-from config import get_settings
-# --- Setup ---
-# Configure logger
-logger.remove()
-logger.add(sys.stderr, level=get_settings().log_level)
-# Global agent instance
-agent_instance: A11yExpertAgent = None
-agent_ready = False
-agent_error = None
-# Global event loop for async operations
-loop = None
-
-# --- Agent Initialization ---
-def initialize_agent_background():
-    """Initialize the agent in background thread."""
-    global agent_instance, agent_ready, agent_error, loop
-    try:
-        logger.info("🔄 Starting agent initialization in background...")
-        # Create new event loop for this thread
-        loop = asyncio.new_event_loop()
-        asyncio.set_event_loop(loop)
-
-        agent_instance = loop.run_until_complete(create_agent())
-        agent_ready = True
-        logger.success("✅ A11y Expert Agent is ready!")
-    except Exception as e:
-        logger.error(f"Failed to initialize agent: {e}")
-        agent_error = str(e)
-        agent_instance = None
-
-def cleanup_resources():
-    """Clean up resources on app shutdown."""
-    global agent_instance, loop
-    logger.info("Cleaning up resources...")
-    try:
-        # Close agent and all its resources
-        if agent_instance:
-            agent_instance.close()
-
-        # Close embeddings client singleton if it exists
-        from models.embeddings import get_embeddings_client
-        if hasattr(get_embeddings_client, '_instance'):
-            get_embeddings_client._instance.close()
-
-        # Close event loop if it exists and is still open
-        if loop and not loop.is_closed():
-            # Cancel all pending tasks
-            try:
-                pending = asyncio.all_tasks(loop)
-                for task in pending:
-                    task.cancel()
-                loop.run_until_complete(asyncio.gather(*pending, return_exceptions=True))
-            except RuntimeError:
-                pass  # Loop may already be stopped
-            loop.close()
-
-        logger.success("✅ Resources cleaned up successfully")
-    except Exception as e:
-        logger.warning(f"Error during cleanup: {e}")
-# --- Gradio Chat Logic ---
-async def respond(message: str, history: list[list[str]]):
-    """
-    Main function for the Gradio ChatInterface.
-    Receives a user message and chat history, then uses the agent
-    to generate a streaming response.
-    Args:
-        message: The user's input message.
-        history: The conversation history provided by Gradio.
-    Yields:
-        A stream of response chunks to update the UI.
-    """
-    global agent_instance, agent_ready, agent_error
-
-    # Wait for agent to be ready
-    if not agent_ready:
-        if agent_error:
-            yield f"❌ Agent initialization failed: {agent_error}"
-            return
-
-        yield "⏳ Agent is initializing, please wait..."
-        # Wait up to 60 seconds for agent to be ready
-        for i in range(60):
-            await asyncio.sleep(1)
-            if agent_ready:
-                break
-            if agent_error:
-                yield f"❌ Agent initialization failed: {agent_error}"
-                return
-
-        if not agent_ready:
-            yield "❌ Agent initialization timeout. Please try again later."
-            return
-
-    if not agent_instance:
-        yield "❌ Agent not available. Please check logs for errors."
-        return
-
-    logger.info(f"User query: '{message}'")
-    full_response = ""
-    try:
-        # Use the global event loop to run async generator
-        async for chunk in agent_instance.ask(message):
-            full_response += chunk
-            yield full_response
-    except Exception as e:
-        logger.error(f"Error during response generation: {e}")
-        yield f"An error occurred: {e}"
-
-
-# --- Gradio UI Definition ---
-# Using gr.Blocks for more layout control
-with gr.Blocks() as demo:
-    gr.Markdown("# 🤖 A11y Expert")
-    gr.Markdown(
-        "Twój inteligentny asystent do spraw dostępności cyfrowej. "
-        "Zadaj pytanie o WCAG, ARIA, lub poproś o analizę kodu."
-    )
-    # The main chat interface
-    chat = gr.ChatInterface(respond)
-    # Example questions
-    gr.Examples(
-        [
-            "Jakie są wymagania WCAG 2.2 dla etykiet formularzy?",
-            "Wyjaśnij rolę 'alert' w ARIA i podaj przykład.",
-            "Czy ten przycisk jest dostępny? <div onclick='...'>Click me</div>",
-            "Jaka jest różnica między ria-label a ria-labelledby?",
-        ],
-        inputs=[chat.textbox],
-        label="Przykładowe pytania"
-    )
-
-
-# --- App Launch ---
-if __name__ == "__main__":
-    # Register cleanup handler
-    atexit.register(cleanup_resources)
-
-    # Initialize agent before launching Gradio
-    initialize_agent_sync()
-
-    settings = get_settings()
-    logger.info("Launching Gradio app...")
-
-    try:
-        demo.launch(
-            server_name=settings.server_host,
-            server_port=settings.server_port,
-            show_error=True,
-        )
-    except KeyboardInterrupt:
-        logger.info("Received interrupt signal")
-    finally:
-        cleanup_resources()
vector_store_client.py DELETED
@@ -1,359 +0,0 @@
-"""
-Client for LanceDB vector store operations with lazy loading.
-
-This module provides an optimized client for LanceDB with automatic
-connection management and lazy table initialization.
-"""
-
-import lancedb
-import asyncio
-from typing import List, Dict, Any, Optional
-from datetime import datetime
-from loguru import logger
-
-
-class VectorStoreClient:
-    """
-    Client for LanceDB vector store with lazy loading.
-
-    Features:
-    - Lazy connection and table initialization
-    - Automatic reconnection on errors
-    - Document validation and enrichment
-    - Search with metadata filtering
-
-    Attributes:
-        uri: Database URI path
-        table_name: Name of the table to use
-
-    Examples:
-        >>> client = VectorStoreClient(uri="./lancedb")
-        >>> # No connection yet - happens on first use
-        >>> client.add_documents([{"text": "...", "vector": [...]}])
-        >>> # Connection established automatically
-    """
-
-    def __init__(self, uri: str, table_name: str = "a11y_expert"):
-        """
-        Initialize client with database URI and table name.
-
-        Args:
-            uri: Path to LanceDB database
-            table_name: Name of the table (default: "a11y_expert")
-        """
-        self.uri = uri
-        self.table_name = table_name
-        self._db = None
-        self._table = None
-
-    @property
-    def db(self):
-        """
-        Lazy database connection property.
-
-        Connects to database on first access and returns cached connection.
-
-        Returns:
-            LanceDB database connection
-        """
-        if self._db is None:
-            logger.info(f"Connecting to LanceDB at: {self.uri}")
-            self._db = lancedb.connect(self.uri)
-            logger.info("✅ Connected to LanceDB")
-        return self._db
-
-    @property
-    def table(self):
-        """
-        Lazy table initialization property.
-
-        Opens or creates table on first access.
-
-        Returns:
-            LanceDB table or None if table doesn't exist yet
-        """
-        if self._table is None:
-            if self.table_name in self.db.table_names():
-                logger.debug(f"Opening existing table: '{self.table_name}'")
-                self._table = self.db.open_table(self.table_name)
-            else:
-                logger.debug(f"Table '{self.table_name}' doesn't exist yet")
-                return None
-        return self._table
-
-    def connect(self):
-        """
-        Explicitly connect to database (optional - happens automatically).
-
-        Provided for backward compatibility. Connection happens automatically
-        when first accessing db or table properties.
-        """
-        _ = self.db  # Trigger lazy connection
-        if self.table is not None:
-            logger.info(f"Table '{self.table_name}' ready ({len(self.table)} docs)")
-        else:
-            logger.info(f"Table '{self.table_name}' will be created on first insert")
-
-    def add_documents(self, documents: List[Dict[str, Any]]):
-        """
-        Add documents to the table with automatic validation.
-
-        Validates required fields, adds timestamps, and creates table if needed.
-
-        Args:
-            documents: List of dicts with required keys:
-                - text (str): Document text
-                - vector (List[float]): Embedding vector
-                - source (str): Source identifier
-                - language (str): Language code (en/pl)
-                - doc_type (str): Document type
-
-        Examples:
-            >>> client.add_documents([{
-            ...     "text": "Content",
-            ...     "vector": [0.1, 0.2, ...],
-            ...     "source": "wcag",
-            ...     "language": "en",
-            ...     "doc_type": "specification"
-            ... }])
-        """
-        # Validate and enrich documents
-        valid_docs = []
-        now = datetime.now()
-        skipped_count = 0
-
-        for doc in documents:
-            try:
-                # Ensure required fields exist
-                required_fields = {"text", "vector", "source", "language", "doc_type"}
-                missing = required_fields - set(doc.keys())
-                if missing:
-                    logger.warning(f"Skipping document with missing fields: {missing}")
-                    skipped_count += 1
-                    continue
-
-                # Add timestamps if not present
-                if "created_at" not in doc or doc["created_at"] is None:
-                    doc["created_at"] = now
-                if "updated_at" not in doc or doc["updated_at"] is None:
-                    doc["updated_at"] = now
-
-                valid_docs.append(doc)
-
-            except Exception as e:
-                logger.error(f"Failed to process document: {e}")
-                skipped_count += 1
-                continue
-
-        if not valid_docs:
-            logger.warning(f"No valid documents to add (skipped: {skipped_count})")
-            return
-
-        try:
-            logger.info(f"Adding {len(valid_docs)} documents to '{self.table_name}'")
-
-            # Create table on first insert or open existing
-            if self.table_name not in self.db.table_names():
-                self._table = self.db.create_table(self.table_name, data=valid_docs)
-                logger.info(f"✅ Created table '{self.table_name}' with {len(valid_docs)} docs")
-            else:
-                # Refresh table reference and add
-                self._table = self.db.open_table(self.table_name)
-                self._table.add(valid_docs)
-                logger.info(f"✅ Added {len(valid_docs)} documents to '{self.table_name}'")
-
-            if skipped_count > 0:
-                logger.warning(f"Skipped {skipped_count} invalid documents")
-
-        except Exception as e:
-            logger.error(f"Failed to add documents to LanceDB: {e}")
-            raise
-
-    def search(
-        self,
-        query_embedding: List[float],
-        where: str = "",
-        top_k: int = 5
-    ) -> List[Dict[str, Any]]:
-        """
-        Search for documents using vector similarity.
-
-        Args:
-            query_embedding: Query vector embedding
-            where: Optional SQL-like filter (e.g., "language = 'en'")
-            top_k: Number of results to return
-
-        Returns:
-            List of matching documents with similarity scores
-
-        Examples:
-            >>> results = client.search(embedding, where="language = 'pl'", top_k=3)
193
- >>> len(results)
194
- 3
195
- """
196
- if self.table is None:
197
- logger.error(f"Table '{self.table_name}' doesn't exist")
198
- return []
199
-
200
- try:
201
- logger.debug(f"Searching for {top_k} documents" + (f" where: {where}" if where else ""))
202
-
203
- query = self.table.search(query_embedding)
204
- if where:
205
- query = query.where(where)
206
-
207
- results = query.limit(top_k).to_df()
208
- logger.debug(f"Found {len(results)} documents")
209
- return results.to_dict("records")
210
- except Exception as e:
211
- logger.error(f"Search failed: {e}")
212
- return []
213
-
214
- def count_documents(self) -> int:
215
- """
216
- Return total number of documents in table.
217
-
218
- Returns:
219
- Document count or 0 if table doesn't exist
220
- """
221
- if self.table is None:
222
- return 0
223
- return len(self.table)
224
-
225
- def get_statistics(self) -> Dict[str, Any]:
226
- """Get database statistics."""
227
- if self._db is None:
228
- self.connect()
229
-
230
- if self.table_name not in self._db.table_names():
231
- logger.warning(f"Table '{self.table_name}' does not exist yet")
232
- return {
233
- "total_documents": 0,
234
- "languages": {},
235
- "doc_types": {},
236
- "sources": [],
237
- "earliest_document": None,
238
- "latest_document": None,
239
- }
240
-
241
- try:
242
- table = self._db.open_table(self.table_name)
243
- df = table.to_pandas()
244
-
245
- stats = {
246
- "total_documents": len(df),
247
- "languages": df["language"].value_counts().to_dict() if "language" in df.columns else {},
248
- "doc_types": df["doc_type"].value_counts().to_dict() if "doc_type" in df.columns else {},
249
- "sources": df["source"].unique().tolist() if "source" in df.columns else [],
250
- "earliest_document": str(df["created_at"].min()) if "created_at" in df.columns else None,
251
- "latest_document": str(df["created_at"].max()) if "created_at" in df.columns else None,
252
- }
253
-
254
- logger.info(f"Database stats: {stats['total_documents']} documents")
255
- return stats
256
- except Exception as e:
257
- logger.error(f"Failed to get statistics: {e}")
258
- return {"error": str(e)}
259
-
260
-
261
- def get_recent_documents(self, limit: int = 20) -> List[Dict[str, Any]]:
262
- """
263
- Get recently added documents sorted by creation time.
264
-
265
- Args:
266
- limit: Maximum number of documents to return
267
-
268
- Returns:
269
- List of recent documents
270
- """
271
- if self.table is None:
272
- logger.warning(f"Table '{self.table_name}' doesn't exist")
273
- return []
274
-
275
- try:
276
- df = self.table.to_pandas()
277
- if "created_at" in df.columns:
278
- df = df.sort_values("created_at", ascending=False).head(limit)
279
- else:
280
- df = df.head(limit)
281
-
282
- return df.to_dict("records")
283
- except Exception as e:
284
- logger.error(f"Failed to get recent documents: {e}")
285
- return []
286
-
287
-
288
- def search_with_filters(
289
- self,
290
- query_embedding: List[float],
291
- language: Optional[str] = None,
292
- doc_type: Optional[str] = None,
293
- source: Optional[str] = None,
294
- top_k: int = 5
295
- ) -> List[Dict[str, Any]]:
296
- """
297
- Search with optional metadata filters.
298
-
299
- Args:
300
- query_embedding: Query vector embedding
301
- language: Filter by language code (e.g., 'en', 'pl')
302
- doc_type: Filter by document type (e.g., 'specification')
303
- source: Filter by source (e.g., 'wcag')
304
- top_k: Number of results to return
305
-
306
- Returns:
307
- List of matching documents
308
-
309
- Examples:
310
- >>> results = client.search_with_filters(
311
- ... embedding,
312
- ... language='pl',
313
- ... doc_type='specification',
314
- ... top_k=5
315
- ... )
316
- """
317
- if self.table is None:
318
- logger.warning(f"Table '{self.table_name}' doesn't exist")
319
- return []
320
-
321
- # Build where clause
322
- conditions = []
323
- if language:
324
- conditions.append(f"language = '{language}'")
325
- if doc_type:
326
- conditions.append(f"doc_type = '{doc_type}'")
327
- if source:
328
- conditions.append(f"source = '{source}'")
329
-
330
- where_clause = " AND ".join(conditions) if conditions else ""
331
-
332
- try:
333
- query = self.table.search(query_embedding)
334
- if where_clause:
335
- query = query.where(where_clause)
336
-
337
- results = query.limit(top_k).to_df()
338
- logger.debug(f"Found {len(results)} documents with filters")
339
- return results.to_dict("records")
340
- except Exception as e:
341
- logger.error(f"Search with filters failed: {e}")
342
- return []
343
-
344
- def close(self):
345
- """
346
- Close database connection and clean up resources.
347
-
348
- Call this method when shutting down the application to properly
349
- release all database resources and prevent asyncio warnings.
350
- """
351
- try:
352
- if self._db is not None:
353
- # LanceDB connections are file-based and don't need explicit closing
354
- # but we clear references to help garbage collection
355
- self._table = None
356
- self._db = None
357
- logger.info("VectorStoreClient resources cleared")
358
- except Exception as e:
359
- logger.warning(f"Error during VectorStoreClient cleanup: {e}")
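For reference, the `search_with_filters` method in the deleted file builds its SQL-like `where` clause by AND-joining one condition per provided field. A minimal standalone sketch of that logic (the function name `build_where_clause` is ours, not part of the removed module):

```python
from typing import Optional

def build_where_clause(
    language: Optional[str] = None,
    doc_type: Optional[str] = None,
    source: Optional[str] = None,
) -> str:
    """Mirror the filter construction from the removed search_with_filters."""
    conditions = []
    if language:
        conditions.append(f"language = '{language}'")
    if doc_type:
        conditions.append(f"doc_type = '{doc_type}'")
    if source:
        conditions.append(f"source = '{source}'")
    # Empty string means "no filter" and the search skips .where() entirely
    return " AND ".join(conditions)

print(build_where_clause(language="pl", source="wcag"))
# → language = 'pl' AND source = 'wcag'
```

Note that this interpolates values directly into the filter string, which is fine for the controlled metadata values used here (`en`/`pl`, `wcag`, `specification`) but would not be safe for untrusted input.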