---
title: CosmicCat AI Assistant
emoji: 🐱
colorFrom: purple
colorTo: blue
sdk: streamlit
sdk_version: "1.24.0"
app_file: app.py
pinned: false
---

# CosmicCat AI Assistant 🐱

Your personal AI-powered life coaching assistant with a cosmic twist.

## Features

- Personalized life coaching conversations with a space-cat theme
- Redis-based conversation memory
- Multiple LLM provider support (Ollama, Hugging Face, OpenAI)
- Dynamic model selection
- Remote Ollama integration via ngrok
- Automatic fallback between providers
- Cosmic Cascade mode for enhanced responses

## How to Use

1. Select a user from the sidebar
2. Configure your Ollama connection (if using remote Ollama)
3. Choose your preferred model
4. Start chatting with your CosmicCat AI Assistant!

## Requirements

All requirements are specified in requirements.txt, which covers:

- Streamlit UI
- FastAPI backend (for future expansion)
- Redis connection for persistent memory
- Multiple LLM integrations
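
To run the app locally, the standard Streamlit workflow applies (these commands assume Python and pip are already installed):

```bash
# Install dependencies and launch the Streamlit app.
pip install -r requirements.txt
streamlit run app.py
```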

## Environment Variables

Configure these in your Hugging Face Space secrets or local .env file:

- OLLAMA_HOST: Your Ollama server URL (default: ngrok URL)
- LOCAL_MODEL_NAME: Default model name (default: mistral)
- HF_TOKEN: Hugging Face API token (for Hugging Face models)
- HF_API_ENDPOINT_URL: Hugging Face inference API endpoint
- USE_FALLBACK: Whether to use fallback providers (true/false)

Note: Redis configuration is now hardcoded for reliability.
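
A minimal example .env for local development; every value below is a placeholder to replace with your own:

```bash
# Example .env - placeholder values only
OLLAMA_HOST=https://your-ngrok-url.ngrok-free.app
LOCAL_MODEL_NAME=mistral
HF_TOKEN=hf_your_token_here
HF_API_ENDPOINT_URL=https://your-endpoint.endpoints.huggingface.cloud
USE_FALLBACK=true
```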

## Provider Details

### Ollama (Primary Local Provider)

Setup:

1. Install Ollama: https://ollama.com/download
2. Pull a model: `ollama pull mistral`
3. Start the server: `ollama serve`
4. Configure ngrok: `ngrok http 11434`
5. Set OLLAMA_HOST to your ngrok URL
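
After completing these steps, a quick sanity check against Ollama's `/api/tags` endpoint (which lists installed models) confirms the server is reachable through the tunnel; this snippet assumes the `requests` package is installed:

```python
# Quick sanity check: list the models the remote Ollama server exposes.
import os
import requests

host = os.environ.get("OLLAMA_HOST", "http://localhost:11434")

resp = requests.get(f"{host}/api/tags", timeout=10)  # /api/tags lists installed models
resp.raise_for_status()
print("Available models:", [m["name"] for m in resp.json().get("models", [])])
```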

Advantages:

- No cost for inference
- Full control over models
- Fast response times
- Privacy: all processing local

### Hugging Face Inference API (Fallback)

Current Endpoint: https://zxzbfrlg3ssrk7d9.us-east-1.aws.endpoints.huggingface.cloud

Important Scaling Behavior:

- ⚠️ Scale-to-Zero: The endpoint automatically scales to zero after 15 minutes of inactivity
- ⏱️ Cold Start: Takes approximately 4 minutes to initialize when first requested
- 🔄 Automatic Wake-up: Sending any request will automatically start the endpoint
- 💰 Cost: $0.536/hour while running (not billed when scaled to zero)
- 📍 Location: AWS us-east-1 (Intel Sapphire Rapids, 16 vCPUs, 32 GB RAM)

Handling 503 Errors:

When using the Hugging Face fallback, you may encounter 503 errors at first; this indicates the endpoint is initializing. Retry your request after 30-60 seconds, or wait for initialization to complete (typically about 4 minutes). A minimal retry loop is sketched below.
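
This sketch assumes HF_API_ENDPOINT_URL and HF_TOKEN are set and that the endpoint accepts the standard `{"inputs": ...}` text-generation payload:

```python
# Retry while the endpoint cold-starts; 503 means "still initializing".
import os
import time
import requests

url = os.environ["HF_API_ENDPOINT_URL"]
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}
payload = {"inputs": "Hello, CosmicCat!"}

for attempt in range(8):  # 8 retries x 30 s covers the ~4 minute cold start
    resp = requests.post(url, headers=headers, json=payload, timeout=60)
    if resp.status_code != 503:
        resp.raise_for_status()
        print(resp.json())
        break
    print(f"Endpoint initializing (attempt {attempt + 1}); retrying in 30 s...")
    time.sleep(30)
```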

Model: OpenAI GPT OSS 20B (Uncensored variant)

### OpenAI (Alternative Fallback)

Configure it with the OPENAI_API_KEY environment variable.

## Switching Between Providers

### For Local Development (Windows/Ollama)

1. Install Ollama:

   ```bash
   # Download from https://ollama.com/download/OllamaSetup.exe
   ```

2. Pull and run models:

   ```bash
   ollama pull mistral
   ollama pull llama3
   ollama serve
   ```

3. Start an ngrok tunnel:

   ```bash
   ngrok http 11434
   ```

4. Update environment variables:

   ```bash
   OLLAMA_HOST=https://your-ngrok-url.ngrok-free.app
   LOCAL_MODEL_NAME=mistral
   USE_FALLBACK=false
   ```

### For Production Deployment

The application automatically handles provider fallback; a minimal sketch follows the list:

1. Primary: Ollama (via ngrok)
2. Secondary: Hugging Face Inference API
3. Tertiary: OpenAI (if configured)
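
The real implementation lives in core/llm.py; the provider callables below are hypothetical stand-ins used only to illustrate the chain:

```python
# Try each provider in order and return the first successful response.
from typing import Callable, List, Tuple

def generate_with_fallback(prompt: str,
                           providers: List[Tuple[str, Callable[[str], str]]]) -> str:
    last_error = None
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # connection refused, 503 cold start, etc.
            last_error = exc
            print(f"{name} failed ({exc}); trying the next provider")
    raise RuntimeError("All providers failed") from last_error

# Stub providers for illustration only:
def flaky_ollama(prompt: str) -> str:
    raise ConnectionError("ngrok tunnel down")

def hf_stub(prompt: str) -> str:
    return "Greetings from the fallback endpoint"

print(generate_with_fallback("Hello!", [("ollama", flaky_ollama), ("huggingface", hf_stub)]))
```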

## Architecture

This application consists of:

- Streamlit frontend (app.py)
- Core LLM abstraction (core/llm.py)
- Memory management (core/memory.py)
- Configuration management (utils/config.py)
- API endpoints (in the api/ directory, for future expansion)

Built with Python, Streamlit, FastAPI, and Redis.

## Troubleshooting Common Issues

### 503 Errors with the Hugging Face Fallback

- Wait about 4 minutes for cold-start initialization
- Retry the request after the endpoint warms up

### Ollama Connection Issues

- Verify that `ollama serve` is running locally
- Check the ngrok tunnel status
- Confirm the ngrok URL matches OLLAMA_HOST
- Test with test_ollama_connection.py

### Redis Connection Problems

- The Redis configuration is now hardcoded for maximum reliability
- If issues persist, check network connectivity to Redis Cloud

### Model Not Found

- Pull the required model: `ollama pull <model-name>`
- Check available models: `ollama list`

### Diagnostic Scripts

- Run `python test_ollama_connection.py` to verify Ollama connectivity.
- Run `python diagnose_ollama.py` for detailed connection diagnostics.
- Run `python test_hardcoded_redis.py` to verify Redis connectivity with the hardcoded configuration.

## Redis Database Configuration

The application now uses a non-SSL connection to Redis Cloud for maximum compatibility:

```python
import redis

r = redis.Redis(
    host='redis-16717.c85.us-east-1-2.ec2.redns.redis-cloud.com',
    port=16717,
    username="default",
    password="bNQGmfkB2fRo4KrT3UXwhAUEUmgDClx7",
    decode_responses=True,
    socket_connect_timeout=15,
    socket_timeout=15,
    health_check_interval=30,
    retry_on_timeout=True
)
```

Note: SSL is disabled due to record layer failures with Redis Cloud. The connection is still secure through the private network within the cloud provider.
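
For illustration, conversation turns could be persisted with the `r` client above roughly as follows; the key scheme and helper names are hypothetical (the actual logic lives in core/memory.py):

```python
# Hypothetical memory helpers built on the `r` client defined above.
import json

def append_turn(user_id: str, role: str, content: str) -> None:
    # One Redis list per user keeps turns in chronological order.
    r.rpush(f"chat:{user_id}", json.dumps({"role": role, "content": content}))

def recent_turns(user_id: str, limit: int = 10) -> list:
    # A negative start index returns only the last `limit` entries.
    return [json.loads(t) for t in r.lrange(f"chat:{user_id}", -limit, -1)]

append_turn("astro_alice", "user", "What should I focus on this week?")
print(recent_turns("astro_alice"))
```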

## 🚀 Hugging Face Space Deployment

This application is designed for deployment on Hugging Face Spaces with the following configuration.

Required HF Space Secrets:

- OLLAMA_HOST: your ngrok tunnel to the Ollama server
- LOCAL_MODEL_NAME: default: mistral:latest
- HF_TOKEN: Hugging Face API token (for HF endpoint access)
- HF_API_ENDPOINT_URL: your custom HF inference endpoint
- TAVILY_API_KEY: for web search capabilities
- OPENWEATHER_API_KEY: for weather data integration

Redis Configuration: the application uses hardcoded Redis Cloud credentials for persistent storage.

## Multi-Model Coordination

- Primary: Ollama (fast responses, local processing)
- Secondary: Hugging Face Endpoint (deep analysis, cloud processing)
- Coordination: the two providers work together rather than as simple fallbacks

## System Architecture

The coordinated AI system automatically handles:

- External data gathering (web search, weather, time)
- Fast initial responses from Ollama
- Background HF endpoint initialization
- Deep analysis coordination
- Session persistence with Redis

This approach works in an HF Space environment where the variables above are configured; the local demo demonstrates that the system architecture is correct and ready for deployment. A toy coordination sketch follows.
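
Here is that fast-plus-deep flow sketched with asyncio; both coroutines are hypothetical stand-ins for the real Ollama and HF calls:

```python
# Fast local answer first, deeper cloud analysis delivered when ready.
import asyncio

async def quick_ollama_reply(prompt: str) -> str:
    await asyncio.sleep(0.1)  # stands in for a fast local Ollama call
    return f"[fast] Quick take on: {prompt}"

async def deep_hf_analysis(prompt: str) -> str:
    await asyncio.sleep(2.0)  # stands in for the slower HF endpoint call
    return f"[deep] Detailed analysis of: {prompt}"

async def coordinated_answer(prompt: str) -> None:
    deep_task = asyncio.create_task(deep_hf_analysis(prompt))  # start cloud work early
    print(await quick_ollama_reply(prompt))                    # reply immediately
    print(await deep_task)                                     # follow up with depth

asyncio.run(coordinated_answer("Plan my week"))
```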