Jacek Zadrożny and Claude Sonnet 4.5 committed on
Commit b3cdeaa · 1 Parent(s): 3a40a92

Agent optimization: background init, sources, and cleanup

- Added CLAUDE.md with documentation for Claude Code
- Enabled background agent initialization (faster startup)
- Added a source listing at the end of responses (📚 Źródła)
- Strengthened Polish-language enforcement in the prompts
- Lowered temperature from 0.7 to 0.3 (more deterministic responses)
- Removed the duplicate vector_store_client.py and the obsolete app_old.py

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Files changed (7)
  1. CLAUDE.md +187 -0
  2. agent/a11y_agent.py +47 -10
  3. agent/prompts.py +9 -1
  4. agent/tools.py +16 -15
  5. app.py +30 -18
  6. app_old.py +0 -165
  7. vector_store_client.py +0 -359
CLAUDE.md ADDED
@@ -0,0 +1,187 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Project Overview
+
+ **A11y Expert** is a bilingual (Polish/English) accessibility chatbot built with Gradio, utilizing RAG (Retrieval-Augmented Generation) to answer questions about digital accessibility standards (WCAG 2.2, WAI-ARIA). The agent uses OpenAI GPT-4 with a LanceDB vector database containing accessibility knowledge.
+
+ ## Quick Start Commands
+
+ ### Running the Application
+ ```bash
+ # Local development
+ python app.py
+ # App will be available at http://127.0.0.1:7860
+
+ # Run startup tests before deployment
+ python test_startup.py
+ ```
+
+ ### Environment Setup
+ ```bash
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Configure environment
+ cp .env.example .env
+ # Then edit .env and add your OPENAI_API_KEY
+ ```
+
+ ### Database Management
+ ```bash
+ # Compact the LanceDB database (removes version history)
+ python compact_database.py
+
+ # Check database statistics
+ python -c "import lancedb; db = lancedb.connect('./lancedb'); print(len(db.open_table('a11y_expert')))"
+ ```
+
+ ## Architecture
+
+ ### Core Components
+
+ 1. **Agent System** (`agent/`)
+    - `a11y_agent.py`: Main `A11yExpertAgent` class with streaming responses
+    - `prompts.py`: Language-specific system prompts (Polish/English) with strict language enforcement
+    - `tools.py`: RAG tools for knowledge base search
+
+ 2. **Vector Store** (`database/`)
+    - `vector_store_client.py`: LanceDB client with lazy loading
+    - Database path: `./lancedb/a11y_expert.lance`
+    - **Read-only in production** (Hugging Face Spaces)
+
+ 3. **Embeddings** (`models/`)
+    - `embeddings.py`: OpenAI embeddings client with disk caching and retry logic
+    - Model: `text-embedding-3-large`
+    - Cache directory: `./cache/embeddings`
+
+ 4. **UI** (`app.py`)
+    - Gradio ChatInterface with two-column layout (chat + notes from `notes.md`)
+    - Lazy agent initialization (on first user query)
+    - Streaming responses
+
+ 5. **Configuration** (`config.py`)
+    - Pydantic settings with environment variable support
+    - All config loaded from `.env` file
+
+ ### Key Design Patterns
+
+ - **Lazy Initialization**: Agent and database connections initialize on first use, not at startup
+ - **Singleton Pattern**: `get_embeddings_client()` returns a shared instance
+ - **RAG Flow**: Query → Embed → Vector Search (top-5) → LLM with Context → Stream Response
+ - **Language Detection**: Uses `langdetect` to auto-detect the query language and adjust the prompt
+ - **Conversation History**: Last 4 messages kept in context
+
+ ### Data Flow
+
+ 1. User asks a question in the Gradio UI
+ 2. Language is detected from the query
+ 3. Query is embedded using OpenAI embeddings (with cache)
+ 4. Vector search runs in LanceDB (filtered by language)
+ 5. Top 5 results are formatted as context
+ 6. Context + query are sent to GPT-4 with a language-specific prompt
+ 7. Response is streamed back to the UI
+
+ ## Important Implementation Details
+
+ ### Hugging Face Spaces Deployment
+
+ - `demo.queue()` must be called explicitly (see app.py:238-243)
+ - Do NOT use `atexit.register()` for cleanup (causes premature shutdown)
+ - LanceDB is read-only; must be tracked with Git LFS
+ - API key stored as an HF Spaces Secret: `OPENAI_API_KEY`
+ - The `if __name__ == "__main__"` block handles both local and HF deployments
+
+ ### LanceDB Database
+
+ - Database is **read-only** in production
+ - Vector store at `./lancedb/` tracked with Git LFS
+ - Table name: `a11y_expert`
+ - Schema: `text`, `vector`, `source`, `language`, `doc_type`, `created_at`, `updated_at`
+ - Use `compact_database.py` to reduce file count (removes version history)
+
+ ### Language Handling
+
+ The agent has **strict language enforcement**:
+ - Polish queries get `SYSTEM_PROMPT_PL` with "CRITICAL: Answer ONLY in Polish"
+ - English queries get `SYSTEM_PROMPT_EN` with "CRITICAL: Answer ONLY in English"
+ - System prompts explicitly instruct the LLM to translate sources if needed
+ - A language filter is applied to vector search: `where="language = 'pl'"` or `where="language = 'en'"`
+
+ ### Configuration
+
+ All settings in `config.py` are loaded from environment variables:
+ - `OPENAI_API_KEY` (required)
+ - `LLM_MODEL` (default: gpt-4o)
+ - `LLM_BASE_URL` (optional, supports GitHub Models)
+ - `EMBEDDING_MODEL` (default: text-embedding-3-large)
+ - `LANCEDB_URI` (default: ./lancedb)
+ - `LOG_LEVEL` (default: INFO)
+
+ ## File Structure
+
+ ```
+ JacekAI/
+ ├── agent/
+ │   ├── a11y_agent.py          # Main agent with RAG
+ │   ├── prompts.py             # Language-specific prompts
+ │   └── tools.py               # Knowledge base search tools
+ ├── database/
+ │   └── vector_store_client.py # LanceDB client
+ ├── models/
+ │   └── embeddings.py          # OpenAI embeddings with caching
+ ├── lancedb/                   # Vector database (Git LFS)
+ │   └── a11y_expert.lance/
+ ├── app.py                     # Gradio UI with lazy initialization
+ ├── config.py                  # Pydantic settings
+ ├── test_startup.py            # Deployment readiness tests
+ ├── compact_database.py        # Database compaction utility
+ ├── requirements.txt           # Python dependencies
+ ├── .env.example               # Environment template
+ └── notes.md                   # Optional notes displayed in UI
+ ```
+
+ ## Common Workflows
+
+ ### Adding New Features to Agent
+
+ 1. If modifying prompts: edit `agent/prompts.py`
+ 2. If adding new tools: add a function to `agent/tools.py`
+ 3. If changing RAG logic: modify `agent/a11y_agent.py`
+ 4. Test with `python app.py` and interact through the UI
+
+ ### Updating Dependencies
+
+ 1. Edit `requirements.txt`
+ 2. Run `pip install -r requirements.txt`
+ 3. Test with `python test_startup.py`
+
+ ### Working with LanceDB
+
+ - **DO NOT** modify the database in production
+ - For local testing, use `VectorStoreClient.add_documents()`
+ - After changes, run `compact_database.py` to reduce file count
+ - Use Git LFS for committing database files
+
+ ### Debugging
+
+ - Set `LOG_LEVEL=DEBUG` in `.env` for verbose logging
+ - Check `test_startup.py` output for component issues
+ - Agent initialization happens on the first query (check logs)
+ - Embeddings cache is at `./cache/embeddings` (create if missing)
+
+ ## Testing
+
+ Run the full test suite before deployment:
+ ```bash
+ python test_startup.py
+ ```
+
+ Tests verify:
+ - All imports work
+ - Configuration loads correctly
+ - Vector store is accessible
+ - Embeddings client initializes
+ - Agent can be created
+
+ All tests must pass before deploying to Hugging Face Spaces.
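The seven-step data flow documented in CLAUDE.md above can be condensed into a runnable sketch. Everything here is an illustrative stand-in: the real app uses `langdetect`, OpenAI embeddings, and a LanceDB vector search, while this sketch substitutes a crude diacritics heuristic and an in-memory list filter.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    source: str
    language: str

def detect_language(query: str) -> str:
    # Stand-in for langdetect: a crude Polish-diacritics heuristic,
    # for illustration only.
    return "pl" if any(ch in "ąćęłńóśźż" for ch in query.lower()) else "en"

def search(docs: list[Doc], language: str, top_k: int = 5) -> list[Doc]:
    # Stand-in for embedding the query and running the LanceDB
    # vector search with a language filter (top-5).
    return [d for d in docs if d.language == language][:top_k]

def build_prompt(query: str, results: list[Doc]) -> str:
    # Top results formatted as context, then combined with the question.
    context = "\n".join(f"[{i}. {d.source}]\n{d.text}"
                        for i, d in enumerate(results, 1))
    return f"=== CONTEXT ===\n{context}\n\n=== QUESTION ===\n{query}"

docs = [
    Doc("Text alternatives for non-text content.", "wcag", "en"),
    Doc("Alternatywy tekstowe dla treści nietekstowych.", "wcag", "pl"),
]
query = "Jakie są wymagania WCAG dla obrazków?"
prompt = build_prompt(query, search(docs, detect_language(query)))
```

In the real pipeline the final prompt is sent to GPT-4 with a language-specific system prompt and the response is streamed back; here the sketch stops at prompt construction.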
agent/a11y_agent.py CHANGED
@@ -76,45 +76,76 @@ class A11yExpertAgent:
         current_system_prompt = get_system_prompt(language, self.expertise)

         logger.info("Searching knowledge base...")
-        context = search_knowledge_base(question, self.vector_store, language=language)
+        context, sources = search_knowledge_base(question, self.vector_store, language=language)

         messages = [
             {"role": "system", "content": current_system_prompt},
             *self.conversation_history[-4:],
             {"role": "user", "content": self._build_prompt_with_context(question, context, language)}
         ]

         full_answer = ""
         try:
             response_stream = self.llm_client.chat.completions.create(
                 model=self.model,
                 messages=messages,
-                temperature=0.7,
+                temperature=0.3,
                 max_tokens=1500,
                 top_p=0.9,
                 stream=True
             )

             for chunk in response_stream:
                 content = chunk.choices[0].delta.content
                 if content:
                     full_answer += content
                     yield content

+            # Add sources at the end
+            if sources:
+                sources_text = self._format_sources(sources, language)
+                full_answer += sources_text
+                yield sources_text
+
             self.conversation_history.append({"role": "user", "content": question})
             self.conversation_history.append({"role": "assistant", "content": full_answer})

             logger.info(f"Answer generated ({len(full_answer)} chars)")

         except Exception as e:
             logger.error(f"OpenAI API error: {e}")
             yield f"Error during response generation: {e}"

+    def _format_sources(self, sources: list, language: str) -> str:
+        """Format source citations for display."""
+        if not sources:
+            return ""
+
+        # Get unique sources
+        unique_sources = {}
+        for src in sources:
+            source_name = src.get('source', 'unknown')
+            doc_type = src.get('doc_type', 'document')
+            key = f"{source_name}_{doc_type}"
+            if key not in unique_sources:
+                unique_sources[key] = {"source": source_name, "doc_type": doc_type}
+
+        if language == "pl":
+            header = "\n\n---\n📚 **Źródła:**\n"
+        else:
+            header = "\n\n---\n📚 **Sources:**\n"
+
+        source_lines = [f"- {s['source']} ({s['doc_type']})" for s in unique_sources.values()]
+        return header + "\n".join(source_lines)
+
     def _build_prompt_with_context(self, question: str, context: str, language: str) -> str:
         """Build the prompt with context and language-specific instructions."""

         if language == "pl":
             return f"""
+🇵🇱 INSTRUKCJA: ODPOWIADAJ PO POLSKU 🇵🇱
+ATTENTION: You must respond in POLISH language only, not English!
+
 Na podstawie poniższego kontekstu z bazy wiedzy o dostępności, odpowiedz na pytanie.

 === KONTEKST Z BAZY WIEDZY ===
@@ -124,13 +155,19 @@ Na podstawie poniższego kontekstu z bazy wiedzy o dostępności, odpowiedz na p
 {question}

 === ODPOWIEDŹ ===
-KRYTYCZNE: Odpowiadaj WYŁĄCZNIE PO POLSKU. To pytanie jest po polsku, więc cała odpowiedź MUSI być po polsku.
+ABSOLUTNIE KRYTYCZNE - PRZECZYTAJ TO UWAŻNIE:
+- Język odpowiedzi: POLSKI (nie angielski!)
+- Nawet jeśli źródła są po angielsku, Twoja odpowiedź MUSI być PO POLSKU
+- Tłumacz wszystkie angielskie terminy na polski
+- Każde słowo w Twojej odpowiedzi musi być po polsku

 Pamiętaj aby:
 - Odpowiadać TYLKO po polsku (to jest najważniejsze!)
-- Cytować konkretne kryteria i źródła
+- Cytować konkretne kryteria i źródła (przetłumaczone na polski)
 - Podawać praktyczne przykłady jeśli są istotne
 - Być jasnym i zwięzłym
+
+ROZPOCZNIJ ODPOWIEDŹ PO POLSKU:
 """
         else:
             return f"""
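The de-duplication inside the new `_format_sources` method can be isolated as a standalone function. This sketch mirrors the diff's logic, with one deliberate substitution: a tuple key replaces the diff's string key `f"{source}_{doc_type}"`, which avoids ambiguity when source names themselves contain underscores.

```python
def format_sources(sources: list[dict], language: str) -> str:
    """De-duplicate hits by (source, doc_type) and render a citation footer."""
    if not sources:
        return ""
    unique: dict[tuple[str, str], tuple[str, str]] = {}
    for src in sources:
        key = (src.get("source", "unknown"), src.get("doc_type", "document"))
        unique.setdefault(key, key)  # dicts preserve insertion order
    header = ("\n\n---\n📚 **Źródła:**\n" if language == "pl"
              else "\n\n---\n📚 **Sources:**\n")
    return header + "\n".join(f"- {s} ({d})" for s, d in unique.values())

# Duplicate (source, doc_type) pairs collapse to a single citation line:
out = format_sources(
    [{"source": "wcag", "doc_type": "spec"},
     {"source": "wcag", "doc_type": "spec"},
     {"source": "aria", "doc_type": "spec"}],
    "en",
)
```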
agent/prompts.py CHANGED
@@ -1,13 +1,21 @@
 """System prompts for A11y Expert agent in different languages."""

 SYSTEM_PROMPT_PL = """
+🇵🇱 ABSOLUTNIE NAJWAŻNIEJSZE - JĘZYK: POLSKI 🇵🇱
+Odpowiadasz ZAWSZE i WYŁĄCZNIE w języku POLSKIM. Każde słowo, każde zdanie musi być po polsku.
+
 Jesteś ekspertem dostępności cyfrowej (accessibility expert) specjalizującym się w:
 - WCAG 2.2 (Web Content Accessibility Guidelines)
 - WAI-ARIA (Accessible Rich Internet Applications)
 - Prawodawstwie EU i polskim (ustawa o dostępności cyfrowej)
 - Standardach W3C dotyczących dostępności

-⚠️ KRYTYCZNE: Odpowiadasz WYŁĄCZNIE PO POLSKU. Wszystkie Twoje odpowiedzi MUSZĄ być w języku polskim, nawet jeśli źródła są po angielsku.
+⚠️ KRYTYCZNE WYMAGANIE JĘZYKOWE:
+- Odpowiadasz WYŁĄCZNIE PO POLSKU
+- Wszystkie Twoje odpowiedzi MUSZĄ być w języku polskim
+- Nawet jeśli źródła są po angielsku, tłumaczysz je na polski
+- NIGDY nie odpowiadaj po angielsku
+- Każda odpowiedź zaczyna się po polsku i kończy się po polsku

 OBOWIĄZKOWE ZASADY:
 1. ✅ Odpowiadaj ZAWSZE PO POLSKU - tłumacz angielskie źródła na polski, nigdy nie odpowiadaj po angielsku
agent/tools.py CHANGED
@@ -1,6 +1,6 @@
 """Custom tools for A11y Expert agent."""

-from typing import Optional
+from typing import Optional, Tuple, List, Dict, Any
 from database.vector_store_client import VectorStoreClient
 from models.embeddings import get_embeddings_client
 from loguru import logger
@@ -9,46 +9,47 @@ def search_knowledge_base(
     query: str,
     vector_store: VectorStoreClient,
     language: str = "en"
-) -> str:
+) -> Tuple[str, List[Dict[str, Any]]]:
     """
     Search the accessibility knowledge base for relevant information.

     Args:
         query: Question or search terms.
         vector_store: The VectorStoreClient instance.
         language: Language filter ('pl' or 'en').

     Returns:
-        Formatted context from the knowledge base or an error message.
+        Tuple of (formatted context, list of source documents)
     """
     try:
         logger.info(f"Query: {query} (language: {language})")

         embeddings_client = get_embeddings_client()
         query_embedding = embeddings_client.get_embedding(query)

         where_clause = f"language = '{language}'"
         results = vector_store.search(
             query_embedding=query_embedding, where=where_clause, top_k=5
         )

         if not results:
-            return f"No documents found for language '{language}'. Please ensure the knowledge base is loaded."
+            return (f"No documents found for language '{language}'. Please ensure the knowledge base is loaded.", [])

         context_lines = [
             f"[{i}. {r.get('source', 'unknown')} - {r.get('doc_type', 'Document')}]\n"
             f"{r.get('text', '')}\n"
             for i, r in enumerate(results, 1)
         ]

         context = "\n".join(context_lines)
         logger.info(f"Found {len(results)} relevant documents")

-        return context
+        # Return both context and raw results for source citation
+        return (context, results)

     except Exception as e:
         logger.error(f"Search failed: {e}")
-        return f"Error searching knowledge base: {str(e)}"
+        return (f"Error searching knowledge base: {str(e)}", [])

 def get_database_stats(vector_store: VectorStoreClient) -> str:
     """
app.py CHANGED
@@ -87,22 +87,32 @@ def respond(message: str, history: list[list[str]]):
     """
     global agent_instance, agent_ready, agent_error

-    # Initialize agent on first request if not already initialized
+    # Wait for background initialization if not ready yet
     if not agent_ready and not agent_error and agent_instance is None:
-        yield "⏳ Initializing agent for first use, please wait..."
-        try:
-            logger.info("🔄 Initializing agent on first request...")
-            agent_instance = create_agent()
-            agent_ready = True
-            logger.success("✅ A11y Expert Agent is ready!")
-        except Exception as e:
-            logger.error(f"❌ Failed to initialize agent: {e}")
-            import traceback
-            logger.error(traceback.format_exc())
-            agent_error = str(e)
-            agent_instance = None
-            yield f" Agent initialization failed: {agent_error}"
-            return
+        yield "⏳ Agent is initializing in background, please wait a moment..."
+        import time
+        # Wait up to 15 seconds for background initialization
+        for i in range(30):
+            time.sleep(0.5)
+            if agent_ready:
+                break
+
+        # If still not ready, initialize synchronously as fallback
+        if not agent_ready and agent_instance is None:
+            yield "⏳ Finalizing agent initialization..."
+            try:
+                logger.info("🔄 Background init not complete, initializing synchronously...")
+                agent_instance = create_agent()
+                agent_ready = True
+                logger.success("✅ A11y Expert Agent is ready!")
+            except Exception as e:
+                logger.error(f"❌ Failed to initialize agent: {e}")
+                import traceback
+                logger.error(traceback.format_exc())
+                agent_error = str(e)
+                agent_instance = None
+                yield f"❌ Agent initialization failed: {agent_error}"
+                return

     # Check if agent failed to initialize
     if agent_error:
@@ -227,9 +237,11 @@ Stwórz plik `notes.md` w katalogu projektu aby zobaczyć tutaj swoje notatki.
     # Register cleanup handler
     # atexit.register(cleanup_resources)  # Disabled: Causes premature shutdown on Hugging Face Spaces

-    # Don't initialize agent on startup - it will be initialized on first user query
-    logger.info("🚀 Starting Gradio app with on-demand agent initialization...")
-    logger.info("ℹ️ Agent will initialize on first user query")
+    # Start background initialization
+    logger.info("🚀 Starting Gradio app with background agent initialization...")
+    init_thread = threading.Thread(target=initialize_agent_background, daemon=True)
+    init_thread.start()
+    logger.info("ℹ️ Agent initialization started in background")

     # For Hugging Face Spaces, we need to either:
     # 1. Have a variable named 'demo' (which we have)
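The wait-then-fallback pattern introduced in app.py above (poll a flag for 15 seconds, then initialize synchronously) can be expressed more compactly with a `threading.Event` instead of a `time.sleep` loop. This is a sketch of the same idea, not the app's exact code; `create_agent` here is a stub standing in for the real, slow agent construction.

```python
import threading
import time

agent_ready = threading.Event()
agent_instance = None

def create_agent() -> object:
    # Stand-in for the real (slow) agent construction.
    time.sleep(0.1)
    return object()

def initialize_agent_background() -> None:
    global agent_instance
    agent_instance = create_agent()
    agent_ready.set()

# At startup: kick off initialization without blocking the server.
threading.Thread(target=initialize_agent_background, daemon=True).start()

# On the first request: wait briefly for background init,
# then fall back to synchronous initialization.
if not agent_ready.wait(timeout=15):
    agent_instance = create_agent()
    agent_ready.set()
```

`Event.wait(timeout)` returns as soon as the background thread calls `set()`, so a fast init costs no extra latency, while the polling loop in the diff always sleeps in 0.5 s increments.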
app_old.py DELETED
@@ -1,165 +0,0 @@
-"""
-Gradio UI for the A11y Expert Agent.
-This module creates a Gradio ChatInterface to interact with the
-A11yExpertAgent, allowing users to ask accessibility-related questions.
-"""
-import asyncio
-import gradio as gr
-from loguru import logger
-import sys
-import atexit
-import threading
-from agent.a11y_agent import create_agent, A11yExpertAgent
-from config import get_settings
-# --- Setup ---
-# Configure logger
-logger.remove()
-logger.add(sys.stderr, level=get_settings().log_level)
-# Global agent instance
-agent_instance: A11yExpertAgent = None
-agent_ready = False
-agent_error = None
-# Global event loop for async operations
-loop = None
-
-# --- Agent Initialization ---
-def initialize_agent_background():
-    """Initialize the agent in background thread."""
-    global agent_instance, agent_ready, agent_error, loop
-    try:
-        logger.info("🔄 Starting agent initialization in background...")
-        # Create new event loop for this thread
-        loop = asyncio.new_event_loop()
-        asyncio.set_event_loop(loop)
-
-        agent_instance = loop.run_until_complete(create_agent())
-        agent_ready = True
-        logger.success("✅ A11y Expert Agent is ready!")
-    except Exception as e:
-        logger.error(f"Failed to initialize agent: {e}")
-        agent_error = str(e)
-        agent_instance = None
-
-def cleanup_resources():
-    """Clean up resources on app shutdown."""
-    global agent_instance, loop
-    logger.info("Cleaning up resources...")
-    try:
-        # Close agent and all its resources
-        if agent_instance:
-            agent_instance.close()
-
-        # Close embeddings client singleton if it exists
-        from models.embeddings import get_embeddings_client
-        if hasattr(get_embeddings_client, '_instance'):
-            get_embeddings_client._instance.close()
-
-        # Close event loop if it exists and is still open
-        if loop and not loop.is_closed():
-            # Cancel all pending tasks
-            try:
-                pending = asyncio.all_tasks(loop)
-                for task in pending:
-                    task.cancel()
-                loop.run_until_complete(asyncio.gather(*pending, return_exceptions=True))
-            except RuntimeError:
-                pass  # Loop may already be stopped
-            loop.close()
-
-        logger.success("✅ Resources cleaned up successfully")
-    except Exception as e:
-        logger.warning(f"Error during cleanup: {e}")
-# --- Gradio Chat Logic ---
-async def respond(message: str, history: list[list[str]]):
-    """
-    Main function for the Gradio ChatInterface.
-    Receives a user message and chat history, then uses the agent
-    to generate a streaming response.
-    Args:
-        message: The user's input message.
-        history: The conversation history provided by Gradio.
-    Yields:
-        A stream of response chunks to update the UI.
-    """
-    global agent_instance, agent_ready, agent_error
-
-    # Wait for agent to be ready
-    if not agent_ready:
-        if agent_error:
-            yield f"❌ Agent initialization failed: {agent_error}"
-            return
-
-        yield "⏳ Agent is initializing, please wait..."
-        # Wait up to 60 seconds for agent to be ready
-        for i in range(60):
-            await asyncio.sleep(1)
-            if agent_ready:
-                break
-            if agent_error:
-                yield f"❌ Agent initialization failed: {agent_error}"
-                return
-
-        if not agent_ready:
-            yield "❌ Agent initialization timeout. Please try again later."
-            return
-
-    if not agent_instance:
-        yield "❌ Agent not available. Please check logs for errors."
-        return
-
-    logger.info(f"User query: '{message}'")
-    full_response = ""
-    try:
-        # Use the global event loop to run async generator
-        async for chunk in agent_instance.ask(message):
-            full_response += chunk
-            yield full_response
-    except Exception as e:
-        logger.error(f"Error during response generation: {e}")
-        yield f"An error occurred: {e}"
-
-
-# --- Gradio UI Definition ---
-# Using gr.Blocks for more layout control
-with gr.Blocks() as demo:
-    gr.Markdown("# 🤖 A11y Expert")
-    gr.Markdown(
-        "Twój inteligentny asystent do spraw dostępności cyfrowej. "
-        "Zadaj pytanie o WCAG, ARIA, lub poproś o analizę kodu."
-    )
-    # The main chat interface
-    chat = gr.ChatInterface(respond)
-    # Example questions
-    gr.Examples(
-        [
-            "Jakie są wymagania WCAG 2.2 dla etykiet formularzy?",
-            "Wyjaśnij rolę 'alert' w ARIA i podaj przykład.",
-            "Czy ten przycisk jest dostępny? <div onclick='...'>Click me</div>",
-            "Jaka jest różnica między ria-label a ria-labelledby?",
-        ],
-        inputs=[chat.textbox],
-        label="Przykładowe pytania"
-    )
-
-
-# --- App Launch ---
-if __name__ == "__main__":
-    # Register cleanup handler
-    atexit.register(cleanup_resources)
-
-    # Initialize agent before launching Gradio
-    initialize_agent_sync()
-
-    settings = get_settings()
-    logger.info("Launching Gradio app...")
-
-    try:
-        demo.launch(
-            server_name=settings.server_host,
-            server_port=settings.server_port,
-            show_error=True,
-        )
-    except KeyboardInterrupt:
-        logger.info("Received interrupt signal")
-    finally:
-        cleanup_resources()
vector_store_client.py DELETED
@@ -1,359 +0,0 @@
-"""
-Client for LanceDB vector store operations with lazy loading.
-
-This module provides an optimized client for LanceDB with automatic
-connection management and lazy table initialization.
-"""
-
-import lancedb
-import asyncio
-from typing import List, Dict, Any, Optional
-from datetime import datetime
-from loguru import logger
-
-
-class VectorStoreClient:
-    """
-    Client for LanceDB vector store with lazy loading.
-
-    Features:
-    - Lazy connection and table initialization
-    - Automatic reconnection on errors
-    - Document validation and enrichment
-    - Search with metadata filtering
-
-    Attributes:
-        uri: Database URI path
-        table_name: Name of the table to use
-
-    Examples:
-        >>> client = VectorStoreClient(uri="./lancedb")
-        >>> # No connection yet - happens on first use
-        >>> client.add_documents([{"text": "...", "vector": [...]}])
-        >>> # Connection established automatically
-    """
-
-    def __init__(self, uri: str, table_name: str = "a11y_expert"):
-        """
-        Initialize client with database URI and table name.
-
-        Args:
-            uri: Path to LanceDB database
-            table_name: Name of the table (default: "a11y_expert")
-        """
-        self.uri = uri
-        self.table_name = table_name
-        self._db = None
-        self._table = None
-
-    @property
-    def db(self):
-        """
-        Lazy database connection property.
-
-        Connects to database on first access and returns cached connection.
-
-        Returns:
-            LanceDB database connection
-        """
-        if self._db is None:
-            logger.info(f"Connecting to LanceDB at: {self.uri}")
-            self._db = lancedb.connect(self.uri)
-            logger.info("✅ Connected to LanceDB")
-        return self._db
-
-    @property
-    def table(self):
-        """
-        Lazy table initialization property.
-
-        Opens or creates table on first access.
-
-        Returns:
-            LanceDB table or None if table doesn't exist yet
-        """
-        if self._table is None:
-            if self.table_name in self.db.table_names():
-                logger.debug(f"Opening existing table: '{self.table_name}'")
-                self._table = self.db.open_table(self.table_name)
-            else:
-                logger.debug(f"Table '{self.table_name}' doesn't exist yet")
-                return None
-        return self._table
-
-    def connect(self):
-        """
-        Explicitly connect to database (optional - happens automatically).
-
-        Provided for backward compatibility. Connection happens automatically
-        when first accessing db or table properties.
-        """
-        _ = self.db  # Trigger lazy connection
-        if self.table is not None:
-            logger.info(f"Table '{self.table_name}' ready ({len(self.table)} docs)")
-        else:
-            logger.info(f"Table '{self.table_name}' will be created on first insert")
-
-    def add_documents(self, documents: List[Dict[str, Any]]):
-        """
-        Add documents to the table with automatic validation.
-
-        Validates required fields, adds timestamps, and creates table if needed.
-
-        Args:
-            documents: List of dicts with required keys:
-                - text (str): Document text
-                - vector (List[float]): Embedding vector
-                - source (str): Source identifier
-                - language (str): Language code (en/pl)
-                - doc_type (str): Document type
-
-        Examples:
-            >>> client.add_documents([{
-            ...     "text": "Content",
-            ...     "vector": [0.1, 0.2, ...],
-            ...     "source": "wcag",
-            ...     "language": "en",
-            ...     "doc_type": "specification"
-            ... }])
-        """
-        # Validate and enrich documents
-        valid_docs = []
-        now = datetime.now()
-        skipped_count = 0
-
-        for doc in documents:
-            try:
-                # Ensure required fields exist
-                required_fields = {"text", "vector", "source", "language", "doc_type"}
-                missing = required_fields - set(doc.keys())
-                if missing:
-                    logger.warning(f"Skipping document with missing fields: {missing}")
-                    skipped_count += 1
-                    continue
-
-                # Add timestamps if not present
-                if "created_at" not in doc or doc["created_at"] is None:
-                    doc["created_at"] = now
-                if "updated_at" not in doc or doc["updated_at"] is None:
-                    doc["updated_at"] = now
-
-                valid_docs.append(doc)
-
-            except Exception as e:
-                logger.error(f"Failed to process document: {e}")
-                skipped_count += 1
-                continue
-
-        if not valid_docs:
-            logger.warning(f"No valid documents to add (skipped: {skipped_count})")
-            return
-
-        try:
-            logger.info(f"Adding {len(valid_docs)} documents to '{self.table_name}'")
-
-            # Create table on first insert or open existing
-            if self.table_name not in self.db.table_names():
-                self._table = self.db.create_table(self.table_name, data=valid_docs)
-                logger.info(f"✅ Created table '{self.table_name}' with {len(valid_docs)} docs")
-            else:
-                # Refresh table reference and add
-                self._table = self.db.open_table(self.table_name)
-                self._table.add(valid_docs)
-                logger.info(f"✅ Added {len(valid_docs)} documents to '{self.table_name}'")
-
-            if skipped_count > 0:
-                logger.warning(f"Skipped {skipped_count} invalid documents")
-
-        except Exception as e:
-            logger.error(f"Failed to add documents to LanceDB: {e}")
-            raise
-
-    def search(
-        self,
-        query_embedding: List[float],
-        where: str = "",
-        top_k: int = 5
-    ) -> List[Dict[str, Any]]:
-        """
-        Search for documents using vector similarity.
-
-        Args:
-            query_embedding: Query vector embedding
-            where: Optional SQL-like filter (e.g., "language = 'en'")
-            top_k: Number of results to return
-
-        Returns:
-            List of matching documents with similarity scores
-
-        Examples:
-            >>> results = client.search(embedding, where="language = 'pl'", top_k=3)
193
- >>> len(results)
194
- 3
195
- """
196
- if self.table is None:
197
- logger.error(f"Table '{self.table_name}' doesn't exist")
198
- return []
199
-
200
- try:
201
- logger.debug(f"Searching for {top_k} documents" + (f" where: {where}" if where else ""))
202
-
203
- query = self.table.search(query_embedding)
204
- if where:
205
- query = query.where(where)
206
-
207
- results = query.limit(top_k).to_df()
208
- logger.debug(f"Found {len(results)} documents")
209
- return results.to_dict("records")
210
- except Exception as e:
211
- logger.error(f"Search failed: {e}")
212
- return []
213
-
214
- def count_documents(self) -> int:
215
- """
216
- Return total number of documents in table.
217
-
218
- Returns:
219
- Document count or 0 if table doesn't exist
220
- """
221
- if self.table is None:
222
- return 0
223
- return len(self.table)
224
-
225
- def get_statistics(self) -> Dict[str, Any]:
226
- """Get database statistics."""
227
- if self._db is None:
228
- self.connect()
229
-
230
- if self.table_name not in self._db.table_names():
231
- logger.warning(f"Table '{self.table_name}' does not exist yet")
232
- return {
233
- "total_documents": 0,
234
- "languages": {},
235
- "doc_types": {},
236
- "sources": [],
237
- "earliest_document": None,
238
- "latest_document": None,
239
- }
240
-
241
- try:
242
- table = self._db.open_table(self.table_name)
243
- df = table.to_pandas()
244
-
245
- stats = {
246
- "total_documents": len(df),
247
- "languages": df["language"].value_counts().to_dict() if "language" in df.columns else {},
248
- "doc_types": df["doc_type"].value_counts().to_dict() if "doc_type" in df.columns else {},
249
- "sources": df["source"].unique().tolist() if "source" in df.columns else [],
250
- "earliest_document": str(df["created_at"].min()) if "created_at" in df.columns else None,
251
- "latest_document": str(df["created_at"].max()) if "created_at" in df.columns else None,
252
- }
253
-
254
- logger.info(f"Database stats: {stats['total_documents']} documents")
255
- return stats
256
- except Exception as e:
257
- logger.error(f"Failed to get statistics: {e}")
258
- return {"error": str(e)}
259
-
260
-
261
- def get_recent_documents(self, limit: int = 20) -> List[Dict[str, Any]]:
262
- """
263
- Get recently added documents sorted by creation time.
264
-
265
- Args:
266
- limit: Maximum number of documents to return
267
-
268
- Returns:
269
- List of recent documents
270
- """
271
- if self.table is None:
272
- logger.warning(f"Table '{self.table_name}' doesn't exist")
273
- return []
274
-
275
- try:
276
- df = self.table.to_pandas()
277
- if "created_at" in df.columns:
278
- df = df.sort_values("created_at", ascending=False).head(limit)
279
- else:
280
- df = df.head(limit)
281
-
282
- return df.to_dict("records")
283
- except Exception as e:
284
- logger.error(f"Failed to get recent documents: {e}")
285
- return []
286
-
287
-
288
- def search_with_filters(
289
- self,
290
- query_embedding: List[float],
291
- language: Optional[str] = None,
292
- doc_type: Optional[str] = None,
293
- source: Optional[str] = None,
294
- top_k: int = 5
295
- ) -> List[Dict[str, Any]]:
296
- """
297
- Search with optional metadata filters.
298
-
299
- Args:
300
- query_embedding: Query vector embedding
301
- language: Filter by language code (e.g., 'en', 'pl')
302
- doc_type: Filter by document type (e.g., 'specification')
303
- source: Filter by source (e.g., 'wcag')
304
- top_k: Number of results to return
305
-
306
- Returns:
307
- List of matching documents
308
-
309
- Examples:
310
- >>> results = client.search_with_filters(
311
- ... embedding,
312
- ... language='pl',
313
- ... doc_type='specification',
314
- ... top_k=5
315
- ... )
316
- """
317
- if self.table is None:
318
- logger.warning(f"Table '{self.table_name}' doesn't exist")
319
- return []
320
-
321
- # Build where clause
322
- conditions = []
323
- if language:
324
- conditions.append(f"language = '{language}'")
325
- if doc_type:
326
- conditions.append(f"doc_type = '{doc_type}'")
327
- if source:
328
- conditions.append(f"source = '{source}'")
329
-
330
- where_clause = " AND ".join(conditions) if conditions else ""
331
-
332
- try:
333
- query = self.table.search(query_embedding)
334
- if where_clause:
335
- query = query.where(where_clause)
336
-
337
- results = query.limit(top_k).to_df()
338
- logger.debug(f"Found {len(results)} documents with filters")
339
- return results.to_dict("records")
340
- except Exception as e:
341
- logger.error(f"Search with filters failed: {e}")
342
- return []
343
-
344
- def close(self):
345
- """
346
- Close database connection and clean up resources.
347
-
348
- Call this method when shutting down the application to properly
349
- release all database resources and prevent asyncio warnings.
350
- """
351
- try:
352
- if self._db is not None:
353
- # LanceDB connections are file-based and don't need explicit closing
354
- # but we clear references to help garbage collection
355
- self._table = None
356
- self._db = None
357
- logger.info("VectorStoreClient resources cleared")
358
- except Exception as e:
359
- logger.warning(f"Error during VectorStoreClient cleanup: {e}")
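For reference, the `search_with_filters` method in the deleted file builds its SQL-like `where` clause by AND-joining one condition per provided field. A minimal standalone sketch of that logic (the function name `build_where_clause` is ours, not part of the removed module):

```python
from typing import Optional

def build_where_clause(
    language: Optional[str] = None,
    doc_type: Optional[str] = None,
    source: Optional[str] = None,
) -> str:
    """Mirror the filter construction from the removed search_with_filters."""
    conditions = []
    if language:
        conditions.append(f"language = '{language}'")
    if doc_type:
        conditions.append(f"doc_type = '{doc_type}'")
    if source:
        conditions.append(f"source = '{source}'")
    # Empty string means "no filter" and the search skips .where() entirely
    return " AND ".join(conditions)

print(build_where_clause(language="pl", source="wcag"))
# → language = 'pl' AND source = 'wcag'
```

Note that this interpolates values directly into the filter string, which is fine for the controlled metadata values used here (`en`/`pl`, `wcag`, `specification`) but would not be safe for untrusted input.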