Spaces: HipFil98/ELAN_bot


HipFil98 committed on Jul 2, 2025

Commit d7dede5 (verified) · 1 Parent(s): a90a20a

Update README.md


Files changed (1)

README.md +44 -100


@@ -1,6 +1,15 @@
 ---
-pinned: true
 ---
 # ELAN-Bot 🤖
 A virtual assistant designed to help users with the ELAN annotation software. The bot can answer questions about ELAN usage and modify EAF (ELAN Annotation Format) files based on user instructions.
@@ -13,113 +22,48 @@ A virtual assistant designed to help users with the ELAN annotation software. Th
 - **Vector Search**: Uses semantic search to find relevant information from documentation
 - **Powered by Llama 3.3 70B**: Advanced language model for accurate responses
-## Project Structure
-```
-elan-bot/
-├── app.py                          # Main application entry point
-├── requirements.txt                # Python dependencies
-├── README.md                      # Project documentation
-├── .env.example                   # Environment variables example
-├── config/
-│   └── settings.py                # Configuration settings
-├── prompts/
-│   ├── __init__.py
-│   ├── system_prompts.py          # System prompts
-│   ├── user_prompts.py            # User prompts
-│   └── assistant_prompts.py       # Assistant prompts
-├── services/
-│   ├── __init__.py
-│   ├── vector_search.py           # Vector search functionality
-│   ├── llm_service.py             # LLM interaction service
-│   └── elan_assistant.py          # Main assistant coordinator
-├── utils/
-│   ├── __init__.py
-│   └── text_processing.py         # Text processing utilities
-├── ui/
-│   ├── __init__.py
-│   └── gradio_interface.py        # Gradio interface components
-└── data/
-    └── qdrant_data/               # Vector database storage
-```
-## Installation
-1. Clone the repository:
-```bash
-git clone <repository-url>
-cd elan-bot
-```
-2. Create virtual environment (recommended):
-```bash
-python -m venv venv
-source venv/bin/activate  # On Windows: venv\Scripts\activate
-```
-3. Install dependencies:
-```bash
-pip install -r requirements.txt
-```
-4. Set up environment variables:
-```bash
-cp .env.example .env
-# Edit .env file with your Hugging Face token
-```
-5. Ensure you have the Qdrant vector database set up with ELAN documentation in the `data/qdrant_data` directory.
 ## Usage
-Run the application:
-```bash
-python app.py
-```
-The Gradio interface will launch and you can:
-- Ask questions about ELAN: "How can I add a new tier in ELAN?"
-- Modify EAF files: Paste your EAF content with instructions at the beginning
-## Configuration
-Modify `config/settings.py` to adjust:
-- Model settings (encoder, LLM, tokenizer)
-- Vector database configuration
-- Text processing parameters
-- UI settings
-## Components
-### Services
-- **VectorSearchService**: Handles semantic search through ELAN documentation using sentence transformers and Qdrant
-- **LLMService**: Manages interactions with the Llama 3.3 70B model for generating responses and processing XML
-- **ElanAssistant**: Main coordinator that routes requests between question answering and XML modification workflows
-### Utils
-- **TextProcessor**: Utilities for splitting large EAF files into manageable chunks and recombining results
-### UI
-- **GradioInterface**: Handles the Gradio chat interface setup and configuration
-### Configuration
-- **settings.py**: Centralized configuration for all application parameters
-- **prompts/**: Organized prompt templates separated by type (system, user, assistant)
-## Development
-The project follows a clean architecture pattern with separation of concerns:
-- `config/`: Application configuration
-- `prompts/`: All prompt templates organized by type
-- `services/`: Core business logic and external service integrations
-- `utils/`: Utility functions and helpers
-- `ui/`: User interface components
-- `data/`: Data storage (vector database)
-Each module is self-contained with clear interfaces and minimal dependencies.
-## License
-[Add your license information here]

 ---
+title: ELAN-Bot
+emoji: 🤖
+colorFrom: blue
+colorTo: green
+sdk: gradio
+sdk_version: 4.44.0
+app_file: app.py
+pinned: false
+license: mit
 ---
 # ELAN-Bot 🤖
 A virtual assistant designed to help users with the ELAN annotation software. The bot can answer questions about ELAN usage and modify EAF (ELAN Annotation Format) files based on user instructions.
 - **Vector Search**: Uses semantic search to find relevant information from documentation
 - **Powered by Llama 3.3 70B**: Advanced language model for accurate responses
 ## Usage
+Simply interact with the chat interface:
+- **Ask questions**: "How can I add a new tier in ELAN?"
+- **Modify EAF files**: Paste your EAF content with instructions at the beginning like:
+  ```
+  instructions: change the participant name from Eleonora to Gianni
+  <?xml version="1.0" encoding="UTF-8"?>
+  <ANNOTATION_DOCUMENT...>
+  ```
+## Examples
+Try these sample questions:
+- "How can I add a new tier in ELAN?"
+- "¿Cómo puedo exportar anotaciones en formato txt?"
+- "Come posso cercare all'interno delle annotazioni?"
+## Configuration
+The app requires an `HF_TOKEN` environment variable, set in the Hugging Face Spaces settings, to access the Llama model.
+## Technical Details
+- **Backend**: Python with Gradio interface
+- **Vector Search**: Qdrant + SentenceTransformers
+- **LLM**: Meta Llama 3.3 70B Instruct via Hugging Face Inference API
+- **Text Processing**: tiktoken for efficient chunking
+## Project Structure
+```
+elan-bot/
+├── app.py                          # Main application entry point
+├── requirements.txt                # Python dependencies
+├── config/
+│   └── settings.py                # Configuration settings
+├── prompts/                       # Organized prompt templates
+├── services/                      # Core business logic
+├── utils/                         # Utility functions
+├── ui/                           # Gradio interface components
+└── data/                         # Vector database storage
+```
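
The input convention described in the new Usage section (an `instructions:` line followed by the EAF XML) could be separated along these lines. This is only a minimal sketch: the function name `split_instructions` is hypothetical and does not come from the repository, and it assumes the XML always begins with an `<?xml` declaration or an `<ANNOTATION_DOCUMENT` root tag. How the bot itself routes questions versus EAF edits is not shown in this diff.

```python
import re
from typing import Tuple


def split_instructions(message: str) -> Tuple[str, str]:
    """Split a chat message into (instructions, eaf_xml).

    Hypothetical helper: assumes everything before the XML declaration
    (or the ANNOTATION_DOCUMENT root tag) is the instruction text.
    """
    match = re.search(r"<\?xml|<ANNOTATION_DOCUMENT", message)
    if match is None:
        # No XML found: treat the whole message as a plain question.
        return message.strip(), ""
    head = message[: match.start()].strip()
    # Drop an optional leading "instructions:" label, case-insensitively.
    head = re.sub(r"^instructions:\s*", "", head, flags=re.IGNORECASE)
    return head, message[match.start():].strip()


if __name__ == "__main__":
    sample = (
        "instructions: change the participant name from Eleonora to Gianni\n"
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<ANNOTATION_DOCUMENT></ANNOTATION_DOCUMENT>"
    )
    instructions, eaf_xml = split_instructions(sample)
    print(instructions)
    print(eaf_xml[:20])
```

A message with no XML comes back with an empty second element, so a caller could use that to choose between the question-answering path and the file-modification path.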