Jacek Zadrożny committed
Commit 5fb63e2 · 1 Parent(s): 27d9eb1

Last-chance update

.gitignore CHANGED
@@ -1 +1,46 @@
-"cache/"
+"# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# Virtual Environment
+venv/
+env/
+ENV/
+
+# Cache and temporary files
+cache/
+*.log
+*.cache
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# Environment variables
+.env
+.env.local
+
+# OS
+.DS_Store
+Thumbs.db"
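The character class in `*.py[cod]` above covers compiled and extension artifacts (`.pyc`, `.pyo`, `.pyd`) in a single entry. As a quick sanity check of such globs, Python's stdlib `fnmatch` uses the same bracket syntax (a sketch only — git evaluates ignore rules with its own matcher):

```python
from fnmatch import fnmatch

# gitignore-style character class: matches .pyc, .pyo, .pyd but not .py
pattern = "*.py[cod]"
for name in ["app.pyc", "app.pyo", "app.pyd", "app.py"]:
    print(name, fnmatch(name, pattern))
# app.pyc True
# app.pyo True
# app.pyd True
# app.py False
```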
README.md CHANGED
@@ -1,13 +1,86 @@
 ---
-title: JacekAI
-emoji: 🤖
-colorFrom: yellow
-colorTo: purple
+title: JacekAI - A11y Expert
+emoji:
+colorFrom: blue
+colorTo: green
 sdk: gradio
-python_version: 3.12
+sdk_version: 6.1.0
+python_version: 3.10
 app_file: app.py
-pinned: false
-short_description: A chatbot specialized in digital accessibility.
+pinned: true
+short_description: An intelligent assistant for digital accessibility (WCAG, ARIA)
 ---
 
-An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
+# 🤖 A11y Expert - Digital Accessibility Assistant
+
+An intelligent AI agent specialized in digital accessibility, using RAG (Retrieval-Augmented Generation) over a knowledge base of WCAG, ARIA, and best practices.
+
+## ✨ Features
+
+- 💬 **Conversation in Polish and English** - automatic language detection
+- 📚 **Knowledge base** - WCAG 2.2, ARIA, and practical examples
+- 🔍 **RAG** - answers grounded in official documentation
+- 🎯 **Specialist answers** - criteria and sources are cited
+- ⚡ **Streaming** - responses are generated fluidly
+
+## 🚀 How to use
+
+1. Type a question about digital accessibility
+2. Ask in Polish or English
+3. Receive a detailed answer with source citations
+
+**Example questions:**
+- "What are the WCAG 2.2 requirements for form labels?"
+- "Explain the ARIA 'alert' role and give an example"
+- "Is this button accessible? `<div onclick='...'>Click me</div>`"
+
+## 🔧 Technologies
+
+- **Gradio** - user interface
+- **OpenAI GPT-4** - language model
+- **LanceDB** - vector database
+- **RAG** - semantic search over the knowledge base
+
+## 📝 Configuration (for developers)
+
+### Environment variables
+```bash
+OPENAI_API_KEY=sk-...   # Required
+SERVER_HOST=0.0.0.0     # For Hugging Face Spaces
+SERVER_PORT=7860        # Gradio port
+LOG_LEVEL=INFO          # Logging level
+```
+
+### Local installation
+```bash
+pip install -r requirements.txt
+cp .env.example .env
+# Set OPENAI_API_KEY in .env
+python app.py
+```
+
+### Pre-deployment test
+```bash
+python test_startup.py
+```
+
+## 📖 Documentation
+
+- [Deployment Guide](./README_DEPLOYMENT.md) - detailed deployment guide
+- [WCAG 2.2](https://www.w3.org/TR/WCAG22/) - official specification
+- [ARIA](https://www.w3.org/TR/wai-aria/) - accessible web components
+
+## 🐛 Fixed issues
+
+✅ asyncio event-loop conflicts
+✅ Missing resource cleanup on shutdown
+✅ Library version conflicts (Pydantic 2.x)
+✅ Graceful shutdown on Hugging Face Spaces
+
+## 📄 License
+
+This project serves educational purposes. The knowledge base comes from public sources (W3C, MDN).
+
+## 👨‍💻 Author
+
+Created with the help of GitHub Copilot CLI
agent/__pycache__/__init__.cpython-312.pyc CHANGED
Binary files a/agent/__pycache__/__init__.cpython-312.pyc and b/agent/__pycache__/__init__.cpython-312.pyc differ
 
agent/a11y_agent.py CHANGED
@@ -42,6 +42,17 @@ class A11yExpertAgent:
 
         logger.info(f"A11yExpertAgent initialized (lang={language}, expertise={expertise})")
 
+    def close(self):
+        """Close agent resources."""
+        try:
+            if self.vector_store:
+                self.vector_store.close()
+            if hasattr(self.llm_client, 'close'):
+                self.llm_client.close()
+            logger.info("A11yExpertAgent resources closed")
+        except Exception as e:
+            logger.warning(f"Error closing A11yExpertAgent: {e}")
+
     async def ask(self, question: str) -> AsyncGenerator[str, None]:
         """
         Ask a question and get a streaming answer with RAG.
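The `close()` added above follows a defensive teardown pattern: guard optional resources with a truthiness or `hasattr` check, and never let cleanup raise. A minimal self-contained sketch of the same pattern (the `Resource`/`Agent` names here are illustrative stand-ins, not the project's classes):

```python
import logging

logger = logging.getLogger(__name__)

class Resource:
    """Toy stand-in for a vector store or LLM client."""
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

class Agent:
    def __init__(self, vector_store: Resource, llm_client: object) -> None:
        self.vector_store = vector_store
        self.llm_client = llm_client

    def close(self) -> None:
        """Close resources; swallow errors so shutdown never fails."""
        try:
            if self.vector_store:
                self.vector_store.close()
            # Some client objects expose no close() at all, hence the guard
            if hasattr(self.llm_client, "close"):
                self.llm_client.close()
        except Exception as e:
            logger.warning(f"Error closing Agent: {e}")

store = Resource()
agent = Agent(store, llm_client=object())  # client without close(): guard skips it
agent.close()
print(store.closed)  # True
```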
app.py CHANGED
@@ -7,6 +7,7 @@ import asyncio
 import gradio as gr
 from loguru import logger
 import sys
+import atexit
 from agent.a11y_agent import create_agent, A11yExpertAgent
 from config import get_settings
 # --- Setup ---
@@ -15,17 +16,60 @@ logger.remove()
 logger.add(sys.stderr, level=get_settings().log_level)
 # Global agent instance
 agent_instance: A11yExpertAgent = None
+# Global event loop for async operations
+loop = None
+
 # --- Agent Initialization ---
-async def initialize_agent():
-    """Initialize the agent asynchronously."""
-    global agent_instance
+def initialize_agent_sync():
+    """Initialize the agent synchronously (wrapper for async init)."""
+    global agent_instance, loop
     try:
         logger.info("Initializing A11y Expert Agent...")
-        agent_instance = await create_agent()
+        # Use existing event loop if available, otherwise create new one
+        try:
+            loop = asyncio.get_event_loop()
+            if loop.is_closed():
+                loop = asyncio.new_event_loop()
+                asyncio.set_event_loop(loop)
+        except RuntimeError:
+            loop = asyncio.new_event_loop()
+            asyncio.set_event_loop(loop)
+
+        agent_instance = loop.run_until_complete(create_agent())
         logger.success("✅ A11y Expert Agent is ready!")
     except Exception as e:
         logger.error(f"Failed to initialize agent: {e}")
         agent_instance = None
+
+def cleanup_resources():
+    """Clean up resources on app shutdown."""
+    global agent_instance, loop
+    logger.info("Cleaning up resources...")
+    try:
+        # Close agent and all its resources
+        if agent_instance:
+            agent_instance.close()
+
+        # Close embeddings client singleton if it exists
+        from models.embeddings import get_embeddings_client
+        if hasattr(get_embeddings_client, '_instance'):
+            get_embeddings_client._instance.close()
+
+        # Close event loop if it exists and is still open
+        if loop and not loop.is_closed():
+            # Cancel all pending tasks
+            try:
+                pending = asyncio.all_tasks(loop)
+                for task in pending:
+                    task.cancel()
+                loop.run_until_complete(asyncio.gather(*pending, return_exceptions=True))
+            except RuntimeError:
+                pass  # Loop may already be stopped
+            loop.close()
+
+        logger.success("✅ Resources cleaned up successfully")
+    except Exception as e:
+        logger.warning(f"Error during cleanup: {e}")
 # --- Gradio Chat Logic ---
 async def respond(message: str, history: list[list[str]]):
     """
@@ -38,7 +82,7 @@ async def respond(message: str, history: list[list[str]]):
     Yields:
         A stream of response chunks to update the UI.
     """
-    global agent_instance
+    global agent_instance, loop
     if not agent_instance:
         yield "Agent not initialized. Please check logs for errors."
         return
@@ -46,7 +90,7 @@ async def respond(message: str, history: list[list[str]]):
     logger.info(f"User query: '{message}'")
     full_response = ""
     try:
-        # Stream the response from the agent
+        # Use the global event loop to run async generator
         async for chunk in agent_instance.ask(message):
             full_response += chunk
             yield full_response
@@ -80,13 +124,22 @@ with gr.Blocks() as demo:
 
 # --- App Launch ---
 if __name__ == "__main__":
-    # Initialize agent synchronously using asyncio.run() before launching Gradio
-    # This avoids event loop conflicts with Gradio's own event loop
-    asyncio.run(initialize_agent())
+    # Register cleanup handler
+    atexit.register(cleanup_resources)
+
+    # Initialize agent before launching Gradio
+    initialize_agent_sync()
 
     settings = get_settings()
     logger.info("Launching Gradio app...")
-    demo.launch(
-        server_name=settings.server_host,
-        server_port=settings.server_port,
-    )
+
+    try:
+        demo.launch(
+            server_name=settings.server_host,
+            server_port=settings.server_port,
+            show_error=True,
+        )
+    except KeyboardInterrupt:
+        logger.info("Received interrupt signal")
+    finally:
+        cleanup_resources()
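The get-or-create logic inside `initialize_agent_sync()` can be isolated into a small helper; the sketch below reproduces that pattern with a stand-in coroutine in place of the real `create_agent()`. The `RuntimeError` fallback matters because `asyncio.get_event_loop()` raises when called outside a running loop in a thread with no loop set (and is deprecated in that situation on newer Python versions):

```python
import asyncio

def get_or_create_loop() -> asyncio.AbstractEventLoop:
    """Reuse the current event loop if usable, otherwise create a fresh one."""
    try:
        loop = asyncio.get_event_loop()
        if loop.is_closed():
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)
    except RuntimeError:  # no event loop in this thread
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    return loop

async def fake_create_agent() -> str:
    # Stand-in for the project's async create_agent() factory
    await asyncio.sleep(0)
    return "agent-ready"

loop = get_or_create_loop()
result = loop.run_until_complete(fake_create_agent())
print(result)  # agent-ready
```

Keeping a reference to the loop (rather than calling `asyncio.run()`, which closes its loop on exit) is what lets the same loop be reused later for cancellation and shutdown in `cleanup_resources()`.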
config.py CHANGED
@@ -6,7 +6,7 @@ All settings can be configured via environment variables or .env file.
 """
 
 from pydantic_settings import BaseSettings, SettingsConfigDict
-from pydantic import Field, validator
+from pydantic import Field, field_validator
 from functools import lru_cache
 import os
 
@@ -101,7 +101,8 @@ class Settings(BaseSettings):
         description="Gradio server port"
     )
 
-    @validator("openai_api_key")
+    @field_validator("openai_api_key")
+    @classmethod
     def validate_api_key(cls, v):
         """Ensure API key is provided and not empty."""
         v = v or ""
@@ -113,7 +114,8 @@ class Settings(BaseSettings):
         )
         return v
 
-    @validator("log_level")
+    @field_validator("log_level")
+    @classmethod
     def validate_log_level(cls, v):
         """Ensure log level is valid."""
         valid_levels = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
@@ -125,12 +127,13 @@ class Settings(BaseSettings):
         )
         return v_upper
 
-    @validator("chunk_overlap")
-    def validate_overlap(cls, v, values):
+    @field_validator("chunk_overlap")
+    @classmethod
+    def validate_overlap(cls, v, info):
         """Ensure chunk overlap is less than chunk size."""
-        if "chunk_size" in values and v >= values["chunk_size"]:
+        if info.data and "chunk_size" in info.data and v >= info.data["chunk_size"]:
             raise ValueError(
-                f"chunk_overlap ({v}) must be less than chunk_size ({values['chunk_size']})"
+                f"chunk_overlap ({v}) must be less than chunk_size ({info.data['chunk_size']})"
             )
         return v
 
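The `validator` → `field_validator` changes above are the standard Pydantic v1 → v2 migration: v2 validators are classmethods and receive previously-validated fields through `info.data` instead of a `values` dict. A minimal sketch against a plain `BaseModel` (assuming Pydantic 2.x is installed; `ChunkingConfig` is an illustrative name, not the project's class):

```python
from pydantic import BaseModel, ValidationError, field_validator

class ChunkingConfig(BaseModel):
    chunk_size: int = 1000
    chunk_overlap: int = 200

    @field_validator("chunk_overlap")
    @classmethod
    def validate_overlap(cls, v, info):
        # Fields validate in declaration order, so chunk_size is in info.data here
        if "chunk_size" in info.data and v >= info.data["chunk_size"]:
            raise ValueError(
                f"chunk_overlap ({v}) must be less than chunk_size ({info.data['chunk_size']})"
            )
        return v

ok = ChunkingConfig(chunk_size=500, chunk_overlap=100)
print(ok.chunk_overlap)  # 100

try:
    ChunkingConfig(chunk_size=500, chunk_overlap=600)
except ValidationError:
    print("rejected")  # rejected
```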
models/embeddings.py CHANGED
@@ -110,6 +110,17 @@ class EmbeddingsClient:
         self.cache = None
         logger.info(f"✅ EmbeddingsClient initialized (model: {self.model}, cache: disabled)")
 
+    def close(self):
+        """Close cache and clean up resources."""
+        try:
+            if self.cache is not None:
+                self.cache.close()
+                logger.info("EmbeddingsClient cache closed")
+            if hasattr(self.client, 'close'):
+                self.client.close()
+        except Exception as e:
+            logger.warning(f"Error closing EmbeddingsClient: {e}")
+
     def _get_cache_key(self, text: str) -> str:
         """
         Generate cache key for text.
requirements.txt CHANGED
@@ -1,9 +1,9 @@
 # Core application libraries
-gradio
-openai
-lancedb
-pydantic-settings
-loguru
-langdetect
-diskcache
-pandas
+gradio>=4.0.0,<5.0.0
+openai>=1.0.0,<2.0.0
+lancedb>=0.3.0,<1.0.0
+pydantic-settings>=2.0.0,<3.0.0
+loguru>=0.7.0,<1.0.0
+langdetect>=1.0.0,<2.0.0
+diskcache>=5.6.0,<6.0.0
+pandas>=2.0.0,<3.0.0
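The pins above are PEP 440 range specifiers: each accepts any release at or above the tested floor but below the next major version, shielding the Space from breaking releases. The effect of a `>=,<` range, sketched with naive tuple comparison (real resolvers use the `packaging` library's full PEP 440 rules, which also handle pre-, post-, and dev releases):

```python
def parse(version: str) -> tuple[int, ...]:
    # Naive parser: fine for plain X.Y.Z releases only
    return tuple(int(part) for part in version.split("."))

def satisfies(version: str, floor: str, ceiling: str) -> bool:
    """True when floor <= version < ceiling, like '>=4.0.0,<5.0.0'."""
    return parse(floor) <= parse(version) < parse(ceiling)

for candidate in ["3.50.2", "4.44.1", "5.0.0"]:
    print(candidate, satisfies(candidate, "4.0.0", "5.0.0"))
# 3.50.2 False
# 4.44.1 True
# 5.0.0 False
```

Note the upper bound is exclusive: `5.0.0` itself is rejected by `<5.0.0`, which is exactly the semantics the requirements file relies on.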
vector_store_client.py CHANGED
@@ -354,6 +354,6 @@ class VectorStoreClient:
             # but we clear references to help garbage collection
             self._table = None
             self._db = None
-            logger.debug("VectorStoreClient resources cleared")
+            logger.info("VectorStoreClient resources cleared")
         except Exception as e:
             logger.warning(f"Error during VectorStoreClient cleanup: {e}")
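The `debug` → `info` switch matters because the app configures its log sink at the INFO level (`LOG_LEVEL=INFO`), so the DEBUG-level cleanup message was silently dropped. The filtering behaviour, shown here with the stdlib `logging` module rather than loguru:

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

log = logging.getLogger("vector_store_demo")
log.addHandler(handler)
log.setLevel(logging.INFO)  # mirrors LOG_LEVEL=INFO

log.debug("VectorStoreClient resources cleared")  # below INFO: filtered out
log.info("VectorStoreClient resources cleared")   # at INFO: emitted

print(stream.getvalue().strip())  # INFO VectorStoreClient resources cleared
```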