AstraMind - Stage 1: Core Chat

A powerful AI chat application with dual provider support (OpenRouter & HuggingFace), featuring multiple models per provider, session persistence, comprehensive export options, and intelligent response caching.

Features

  • Dual Provider Support:
    • OpenRouter: GPT-4o-mini, Claude-3.5-Sonnet, Gemini-2.0-Flash, Llama-3.1-8B (requires API key)
    • HuggingFace: OpenChat-3.5, Qwen-2.5-8B, Vicuna-13B, Qwen-2.5-32B (free, no API key needed)
  • Smart Caching: Hash-based response caching to reduce API costs
  • System Message Support: Custom system prompts to guide AI behavior
  • Session Persistence: Auto-save and load chat sessions
  • Comprehensive Exports: TXT, MD, JSON, CSV, Audio (TTS), and PDF formats
  • Modern UI: Clean Gradio interface with Gradio's built-in theme system
  • Token Tracking: Real-time token usage and cost calculation
  • Streaming Responses: Real-time streaming for better UX

Installation

  1. Clone the repository and navigate to the stage-1-basic-chat directory:
cd stage-1-basic-chat
  2. Install dependencies:
pip install -r requirements.txt
  3. API Key (Optional):
    • By default, the app uses free HuggingFace models (no API key needed)
    • For OpenRouter models, enter your API key in the UI when launching
    • Get your OpenRouter key from: https://openrouter.ai/keys

Quick Start

Option 1: Using the Run Script (Recommended)

./run_app.sh

Option 2: Manual Start

# Activate virtual environment
source venv/bin/activate

# Run the application
python app.py

The application will launch in your default browser at http://localhost:7860

Note: If the browser doesn't open automatically, manually navigate to http://localhost:7860

Using the Chat Interface

  1. Choose Provider:
    • Leave API key empty: Uses free HuggingFace models (OpenChat, Qwen, Vicuna)
    • Enter OpenRouter API key: Access premium models (GPT-4o-mini, Claude, Gemini)
    • Click "Initialize" to load the models
  2. System Message (Optional): Expand the System Message accordion to add custom instructions
  3. Select Model: Choose from available models in the dropdown
  4. Start Chatting: Type your message and press Send or Enter
  5. Monitor Usage: View token count, session duration, and cache statistics in the sidebar
  6. Theme Control: Use Gradio's built-in theme switcher in the settings (βš™οΈ icon)

Exporting Conversations

  1. Filter Data: Use date range and role filters in the export panel
  2. Choose Format:
    • TXT: Plain text with timestamps
    • MD: Markdown with formatted code blocks
    • JSON: Complete session data with metadata
    • CSV: Spreadsheet format for analysis
    • Audio: Text-to-speech conversion of assistant messages
    • PDF: Professional formatted document

Project Structure

stage-1-basic-chat/
β”œβ”€β”€ app.py                       # Main Gradio application (run this!)
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ backend/
β”‚   β”‚   β”œβ”€β”€ chat_engine.py       # OpenRouter integration & streaming
β”‚   β”‚   β”œβ”€β”€ cache.py             # Response caching
β”‚   β”‚   β”œβ”€β”€ model_registry.py    # Model configurations
β”‚   β”‚   β”œβ”€β”€ session_manager.py   # Session persistence
β”‚   β”‚   └── utils.py             # Helper functions
β”‚   └── frontend/
β”‚       └── gradio_app/
β”‚           β”œβ”€β”€ export_utils.py  # Export functionality
β”‚           β”œβ”€β”€ styles.css       # Custom styling
β”‚           └── mermaid_component.py  # Mermaid diagram support
β”œβ”€β”€ chat-history/                # Saved chat sessions
β”œβ”€β”€ vectordb/                    # (For future stages)
β”œβ”€β”€ user-projects/               # (For future stages)
β”œβ”€β”€ assets/                      # UI assets
β”œβ”€β”€ .env                         # Environment configuration
β”œβ”€β”€ requirements.txt             # Python dependencies
β”œβ”€β”€ run_app.sh                   # Quick launch script
└── README.md                    # This file

Configuration

Model Configuration

Models are configured in src/backend/model_registry.py with:

  • Model ID for OpenRouter
  • Context window size
  • Cost per 1K tokens
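As an illustration, a registry entry might be shaped like the sketch below. The field names and prices are hypothetical, not copied from model_registry.py:

```python
# Hypothetical registry layout; the actual field names in
# src/backend/model_registry.py may differ.
MODEL_REGISTRY = {
    "openai/gpt-4o-mini": {
        "context_window": 128_000,     # max tokens the model accepts
        "cost_per_1k_input": 0.00015,  # USD per 1K prompt tokens (illustrative)
        "cost_per_1k_output": 0.0006,  # USD per 1K completion tokens (illustrative)
    },
}

entry = MODEL_REGISTRY["openai/gpt-4o-mini"]
print(entry["context_window"])  # → 128000
```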

Cache Settings

Cache is configured in src/backend/cache.py:

  • Default TTL: 3600 seconds (1 hour)
  • Hash-based key generation
  • Automatic expired entry cleanup
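A minimal sketch of this TTL behavior, assuming lazy cleanup on read (the actual implementation in cache.py may differ):

```python
import time

class TTLCache:
    """Minimal TTL cache sketch: entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = 3600):
        self.ttl = ttl
        self._store = {}  # key -> (value, insertion_time)

    def set(self, key, value):
        self._store[key] = (value, time.time())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, ts = entry
        if time.time() - ts > self.ttl:
            del self._store[key]  # lazy cleanup of the expired entry
            return None
        return value

cache = TTLCache(ttl=3600)
cache.set("greeting", "hello")
print(cache.get("greeting"))  # → hello
```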

Session Management

Sessions are stored in chat-history/ as JSON files with:

  • Complete message history
  • Token usage tracking
  • Session metadata
  • Timestamps for each message
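A session file covering the fields above could look like the following hypothetical record (the exact schema written to chat-history/ may differ):

```python
import json

# Hypothetical session record illustrating the fields listed above;
# the real field names in chat-history/ JSON files may differ.
session = {
    "metadata": {"session_id": "demo-session", "model": "openchat/openchat-3.5"},
    "token_usage": {"prompt_tokens": 12, "completion_tokens": 48},
    "messages": [
        {"role": "user", "content": "Hello!", "timestamp": "2024-05-01T10:00:05"},
        {"role": "assistant", "content": "Hi there!", "timestamp": "2024-05-01T10:00:07"},
    ],
}

# Sessions round-trip cleanly through JSON serialization:
restored = json.loads(json.dumps(session, indent=2))
print(restored["messages"][0]["role"])  # → user
```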

Development Notes

API Key Handling

During Stage 1 development, the application supports two modes:

  1. User-provided API key via UI (recommended)
  2. Fallback to .env file (for development)

Note: The fallback mechanism will be removed in future stages.

Token Counting

Token counting uses tiktoken for accurate estimation. Costs are calculated based on model pricing from OpenRouter.
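The cost arithmetic can be sketched as follows; the per-1K rates here are illustrative placeholders, not actual OpenRouter pricing:

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  in_per_1k: float, out_per_1k: float) -> float:
    """Estimate cost in USD from per-1K-token input/output prices."""
    return prompt_tokens / 1000 * in_per_1k + completion_tokens / 1000 * out_per_1k

# 1,500 prompt tokens + 500 completion tokens at illustrative rates:
print(round(estimate_cost(1500, 500, 0.00015, 0.0006), 6))  # → 0.000525
```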

Caching Strategy

The cache uses a hash of:

  • User prompt
  • Selected model
  • Model parameters

This ensures identical queries return cached responses, reducing API costs.
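A sketch of such a key function, assuming SHA-256 over a canonical JSON serialization (the real cache.py may use a different hash or serialization):

```python
import hashlib
import json

def cache_key(prompt: str, model: str, params: dict) -> str:
    """Derive a deterministic cache key from prompt, model, and parameters."""
    # sort_keys makes the serialization canonical, so identical
    # inputs always produce identical keys
    payload = json.dumps(
        {"prompt": prompt, "model": model, "params": params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

k1 = cache_key("Hello", "gpt-4o-mini", {"temperature": 0.7})
k2 = cache_key("Hello", "gpt-4o-mini", {"temperature": 0.7})
print(k1 == k2)  # → True
```

Changing any component of the key (prompt, model, or a parameter value) produces a different hash, so a cache hit only occurs for a genuinely identical query.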

Troubleshooting

API Key Issues

If you see connection errors:

  1. Verify that your API key is correct (in the UI or the .env file)
  2. Check your OpenRouter account has credits
  3. Ensure you have network connectivity

Module Import Errors

If you encounter import errors:

pip install -r requirements.txt --upgrade

Export Issues

For PDF exports, ensure reportlab is properly installed:

pip install reportlab --upgrade

For audio exports, you may need system dependencies for pyttsx3:

  • macOS: No additional setup needed
  • Linux: sudo apt-get install espeak
  • Windows: No additional setup needed

Future Stages

  • Stage 2: Streamlit Pro UI with authentication
  • Stage 3: LinkedIn automation via n8n
  • Stage 4: RAG pipeline with web search
  • Stage 5: Image generation and vision models

Version

v0.1-core-chat - Initial release with core chat functionality

License

See LICENSE file in the root directory.