AstraMind - Stage 1: Core Chat
An AI chat application with dual provider support (OpenRouter & HuggingFace), featuring multi-model selection, session persistence, comprehensive export options, and intelligent response caching.
Features
- Dual Provider Support:
- OpenRouter: GPT-4o-mini, Claude-3.5-Sonnet, Gemini-2.0-Flash, Llama-3.1-8B (requires API key)
- HuggingFace: OpenChat-3.5, Qwen-2.5-8B, Vicuna-13B, Qwen-2.5-32B (free, no API key needed)
- Smart Caching: Hash-based response caching to reduce API costs
- System Message Support: Custom system prompts to guide AI behavior
- Session Persistence: Auto-save and load chat sessions
- Comprehensive Exports: TXT, MD, JSON, CSV, Audio (TTS), and PDF formats
- Modern UI: Clean interface built on Gradio's native theme system
- Token Tracking: Real-time token usage and cost calculation
- Streaming Responses: Real-time streaming for better UX
Installation
- Clone the repository and navigate to the stage-1-basic-chat directory:
cd stage-1-basic-chat
- Install dependencies:
pip install -r requirements.txt
- API Key (Optional):
- By default, the app uses free HuggingFace models (no API key needed)
- For OpenRouter models, enter your API key in the UI when launching
- Get your OpenRouter key from: https://openrouter.ai/keys
Quick Start
Option 1: Using the Run Script (Recommended)
./run_app.sh
Option 2: Manual Start
# Activate virtual environment
source venv/bin/activate
# Run the application
python app.py
The application will launch in your default browser at http://localhost:7860
Note: If the browser doesn't open automatically, manually navigate to http://localhost:7860
Using the Chat Interface
- Choose Provider:
- Leave API key empty: Uses free HuggingFace models (OpenChat, Qwen, Vicuna)
- Enter OpenRouter API key: Access premium models (GPT-4o-mini, Claude, Gemini)
- Click "Initialize" to load the models
- System Message (Optional): Expand the System Message accordion to add custom instructions
- Select Model: Choose from available models in the dropdown
- Start Chatting: Type your message and press Send or Enter
- Monitor Usage: View token count, session duration, and cache statistics in the sidebar
- Theme Control: Use Gradio's built-in theme switcher in the settings (⚙️ icon)
Exporting Conversations
- Filter Data: Use date range and role filters in the export panel
- Choose Format:
- TXT: Plain text with timestamps
- MD: Markdown with formatted code blocks
- JSON: Complete session data with metadata
- CSV: Spreadsheet format for analysis
- Audio: Text-to-speech conversion of assistant messages
- PDF: Professional formatted document
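As an illustration of how one of these exporters might work, here is a minimal Markdown export sketch; the function name and message fields are assumptions for illustration, not the actual `export_utils.py` API:

```python
def export_markdown(messages):
    """Render a chat transcript as Markdown (illustrative sketch).

    Each message is assumed to carry 'role', 'timestamp', and 'content'
    keys, matching the session records described later in this README.
    """
    lines = []
    for msg in messages:
        # Bold role header with the message timestamp, then the body.
        lines.append(f"**{msg['role'].title()}** ({msg['timestamp']})")
        lines.append("")
        lines.append(msg["content"])
        lines.append("")
    return "\n".join(lines)
```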
Project Structure
stage-1-basic-chat/
├── app.py                     # Main Gradio application (run this!)
├── src/
│   ├── backend/
│   │   ├── chat_engine.py     # OpenRouter integration & streaming
│   │   ├── cache.py           # Response caching
│   │   ├── model_registry.py  # Model configurations
│   │   ├── session_manager.py # Session persistence
│   │   └── utils.py           # Helper functions
│   └── frontend/
│       └── gradio_app/
│           ├── export_utils.py      # Export functionality
│           ├── styles.css           # Custom styling
│           └── mermaid_component.py # Mermaid diagram support
├── chat-history/              # Saved chat sessions
├── vectordb/                  # (For future stages)
├── user-projects/             # (For future stages)
├── assets/                    # UI assets
├── .env                       # Environment configuration
├── requirements.txt           # Python dependencies
├── run_app.sh                 # Quick launch script
└── README.md                  # This file
Configuration
Model Configuration
Models are configured in src/backend/model_registry.py with:
- Model ID for OpenRouter
- Context window size
- Cost per 1K tokens
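A minimal sketch of what a registry entry and the cost arithmetic could look like; the field names and prices below are hypothetical, not copied from `model_registry.py` or OpenRouter's actual pricing:

```python
# Illustrative registry entry; keys and prices are assumptions.
MODEL_REGISTRY = {
    "openai/gpt-4o-mini": {
        "context_window": 128_000,       # tokens
        "cost_per_1k_input": 0.00015,    # USD per 1K tokens (hypothetical)
        "cost_per_1k_output": 0.0006,
    },
}

def estimate_cost(model_id, input_tokens, output_tokens):
    """Cost in USD for one request, from per-1K-token prices."""
    cfg = MODEL_REGISTRY[model_id]
    return (input_tokens / 1000) * cfg["cost_per_1k_input"] \
         + (output_tokens / 1000) * cfg["cost_per_1k_output"]
```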
Cache Settings
Cache is configured in src/backend/cache.py:
- Default TTL: 3600 seconds (1 hour)
- Hash-based key generation
- Automatic expired entry cleanup
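The TTL behavior can be sketched as follows; this is an illustrative shape, not the actual `cache.py` implementation:

```python
import time

class ResponseCache:
    """In-memory TTL cache sketch (illustrative only)."""

    def __init__(self, ttl=3600):
        self.ttl = ttl            # seconds an entry stays valid
        self._store = {}          # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (time.time() + self.ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.time() >= expires_at:
            del self._store[key]  # lazy cleanup of expired entries
            return None
        return value
```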
Session Management
Sessions are stored in chat-history/ as JSON files with:
- Complete message history
- Token usage tracking
- Session metadata
- Timestamps for each message
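A session record in roughly this shape round-trips cleanly through JSON for storage in `chat-history/`; the exact field names below are illustrative, not read from `session_manager.py`:

```python
import json
from datetime import datetime, timezone

def build_session(session_id, messages, total_tokens):
    """Assemble a session record matching the description above."""
    return {
        "session_id": session_id,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "total_tokens": total_tokens,
        "messages": messages,  # each with role, content, timestamp
    }

record = build_session(
    "demo",
    [{"role": "user", "content": "hi", "timestamp": "2024-01-01T00:00:00Z"}],
    total_tokens=3,
)
# Serialize and restore, as saving/loading a session file would:
restored = json.loads(json.dumps(record))
```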
Development Notes
API Key Handling
During Stage 1 development, the application supports two modes:
- User-provided API key via UI (recommended)
- Fallback to .env file (for development)
Note: The fallback mechanism will be removed in future stages.
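The two modes amount to a simple precedence rule, sketched below; the environment variable name `OPENROUTER_API_KEY` is an assumption for illustration:

```python
import os

def resolve_api_key(ui_key):
    """Prefer the key typed into the UI; fall back to the environment
    (illustrative sketch of the Stage 1 precedence rule)."""
    if ui_key and ui_key.strip():
        return ui_key.strip()
    # Development fallback, loaded from .env; removed in future stages.
    return os.getenv("OPENROUTER_API_KEY")
```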
Token Counting
Token counting uses tiktoken for accurate estimation. Costs are calculated based on model pricing from OpenRouter.
Caching Strategy
The cache uses a hash of:
- User prompt
- Selected model
- Model parameters
This ensures identical queries return cached responses, reducing API costs.
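Key derivation along these lines would make the lookup deterministic; this is a sketch of the idea, not the actual `cache.py` code:

```python
import hashlib
import json

def cache_key(prompt, model, params):
    """Derive a deterministic cache key from the request.

    sort_keys=True makes equivalent parameter dicts serialize
    identically, so key order in `params` never changes the hash.
    """
    payload = json.dumps(
        {"prompt": prompt, "model": model, "params": params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```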
Troubleshooting
API Key Issues
If you see connection errors:
- Verify your API key is correct in .env or the UI
- Check your OpenRouter account has credits
- Ensure you have network connectivity
Module Import Errors
If you encounter import errors:
pip install -r requirements.txt --upgrade
Export Issues
For PDF exports, ensure reportlab is properly installed:
pip install reportlab --upgrade
For audio exports, you may need system dependencies for pyttsx3:
- macOS: No additional setup needed
- Linux: sudo apt-get install espeak
- Windows: No additional setup needed
Future Stages
- Stage 2: Streamlit Pro UI with authentication
- Stage 3: LinkedIn automation via n8n
- Stage 4: RAG pipeline with web search
- Stage 5: Image generation and vision models
Version
v0.1-core-chat - Initial release with core chat functionality
License
See LICENSE file in the root directory.