| --- |
| title: Rox AI |
| emoji: 🤖 |
| colorFrom: indigo |
| colorTo: purple |
| sdk: docker |
| pinned: false |
| license: mit |
| app_port: 7860 |
| --- |
| |
| # Rox AI |
|
|
| A production-ready AI chat interface with multi-model support, file processing, real-time internet search, and seamless conversations. |
|
|
| ## Features |
|
|
| ### Ultra-Fast Performance |
| - Optimized streaming with instant chunk delivery (0ms flush interval) |
| - High request ceiling: rate limiting is set at 1000 requests per minute |
| - Extended context windows (4x larger) for comprehensive understanding |
| - Parallel file processing for maximum speed |
| - Smart caching and request deduplication |
| - Ultra-fast retry logic (50-200ms) |
|
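The retry window in the list above could be realized in several ways. The following is a minimal sketch, assuming exponential backoff clamped to the 50-200 ms range; `retryDelayMs`, `withRetry`, and `defaultSleep` are hypothetical names and not taken from the Rox AI codebase.

```javascript
// Hypothetical helper: exponential backoff clamped to the 50-200 ms window.
function retryDelayMs(attempt, minMs = 50, maxMs = 200) {
  // attempt 0 -> 50 ms, attempt 1 -> 100 ms, attempt 2 and later -> 200 ms (capped)
  return Math.min(maxMs, minMs * 2 ** attempt);
}

// Default sleep based on setTimeout; a different one can be injected for tests.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Retry an async operation up to maxAttempts times, sleeping between failures.
async function withRetry(operation, maxAttempts = 3, sleep = defaultSleep) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Only wait if another attempt is still to come.
      if (attempt < maxAttempts - 1) {
        await sleep(retryDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

The `sleep` parameter is injectable so the delay schedule can be checked without waiting in real time.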
|
| ### Multi-Model Support |
| Choose from 7 powerful standalone LLMs: |
|
|
| | Model | Parameters | Specialty | |
| |-------|------------|-----------| |
| | Rox Core | 405B | Fast and efficient for everyday tasks | |
| | Rox 2.1 Turbo | 671B | Deep thinking and reasoning | |
| | Rox 3.5 Coder | 480B | Optimized for coding and development | |
| | Rox 4.5 Turbo | 685B | Advanced reasoning and analysis | |
| | Rox 5 Ultra | 14.8T datasets | Most powerful flagship model | |
| | Rox 6 Dyno | Latest Gen | Dynamic thinker with native vision | |
| | Rox 7 Coder | Latest Gen | Ultimate coding powerhouse with reasoning | |
|
|
| All Rox AI models are standalone, proprietary models developed from scratch by Rox AI Technologies. |
|
|
| **Rox 6 Dyno - The Latest Innovation:** |
| - Native multimodal vision built directly into the model architecture |
| - Extended context window (64k tokens) for long conversations |
| - Deep reasoning with transparent thinking process |
| - Superior at complex analysis and multi-step problem solving |
| - Processes images directly without requiring separate vision models |
|
|
| ### DeepResearch Mode (Available for Rox 5 Ultra, Rox 6 Dyno, and Rox 7 Coder) |
|
|
| DeepResearch is a premium research feature that provides comprehensive, in-depth analysis on any topic using real-time web data. It is available for Rox 5 Ultra, Rox 6 Dyno, and Rox 7 Coder. |
|
|
| **How to Enable:** |
| 1. Select the "Rox 5 Ultra", "Rox 6 Dyno", or "Rox 7 Coder" model from the model selector |
| 2. Toggle the "DeepResearch" switch in the input area |
| 3. Ask your question - the AI will conduct thorough research before responding |
|
|
| **What DeepResearch Does:** |
| - Executes 18+ search query variations across multiple search engines |
| - Reads up to 20 full articles for comprehensive understanding |
| - Analyzes 15+ different sources for accuracy |
| - Cross-references information across multiple sources |
| - Prioritizes the latest and most current information |
|
|
| **Search Sources Used:** |
| | Source | Type | |
| |--------|------| |
| | SearXNG | Meta-search (Google, Bing, DuckDuckGo) | |
| | DuckDuckGo | Privacy-focused search | |
| | Wikipedia | Encyclopedia | |
| | Bing | Web search | |
| | arXiv | Research papers | |
| | GitHub | Code repositories | |
| | Reddit | Community discussions | |
| | Google News | Latest news | |
| | Hacker News | Tech discussions | |
| | StackOverflow | Programming Q&A | |
| | NPM/PyPI | Package registries | |
| | And more... | Specialized APIs | |
|
|
| **Response Characteristics:** |
| - At least 4,500 words for comprehensive coverage |
| - Structured with clear sections and headings |
| - Cites sources throughout the response |
| - Uses Arabic numerals (1, 2, 3) only, never Roman numerals |
| - Includes the latest data from the current year |
| - Covers all aspects: history, current state, and future trends |
|
|
| **DeepResearch Configuration:** |
| | Setting | Value | |
| |---------|-------| |
| | Max Tokens | 32,768 | |
| | Temperature | 0.35 (focused) | |
| | Timeout | 15 minutes | |
| | Articles Read | Up to 20 | |
| | Search Variations | 18 queries | |
| | Min Sources | 15 | |
|
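As a rough illustration, the settings in the table above could be held in a single frozen object and gated on the selected model. The values are copied from the table and the model list from this section; the key names and the `deepResearchConfigFor` function are assumptions made for this sketch.

```javascript
// Values copied from the configuration table; the key names are assumptions.
const DEEP_RESEARCH_CONFIG = Object.freeze({
  maxTokens: 32768,
  temperature: 0.35,
  timeoutMs: 15 * 60 * 1000, // 15 minutes
  maxArticles: 20,
  searchVariations: 18,
  minSources: 15,
});

// Models this README lists as supporting DeepResearch.
const DEEP_RESEARCH_MODELS = ['Rox 5 Ultra', 'Rox 6 Dyno', 'Rox 7 Coder'];

// Returns the config when the toggle is on and the model is eligible, else null.
function deepResearchConfigFor(modelName, enabled) {
  if (!enabled || !DEEP_RESEARCH_MODELS.includes(modelName)) return null;
  return DEEP_RESEARCH_CONFIG;
}
```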
|
| **Visual Indicators:** |
| - Real-time status updates during research phase |
| - "DeepResearch" badge on responses |
| - Research statistics (searches performed, articles read) |
| - Badge preserved in PDF exports |
|
|
| ### Rox Vision - Integrated Image Understanding |
|
|
| Rox Vision is our dedicated vision-language model that powers image understanding across most Rox models. It is seamlessly integrated and activates automatically when images are uploaded. |
|
|
| **Vision Models:** |
| | Model | Parameters | Role | |
| |-------|------------|------| |
| | Rox Vision | 90B | Primary vision model for image analysis | |
| | Rox Vision Max | Advanced | Backup model for enhanced reliability | |
|
|
| **How It Works:** |
| 1. User uploads an image with a question |
| 2. For Rox Core, 2.1 Turbo, 3.5 Coder, 4.5 Turbo, and 5 Ultra: Rox Vision automatically analyzes the image, extracts visual information, and passes the analysis to the selected LLM |
| 3. For Rox 6 Dyno: native vision processes the image directly (no separate vision model needed) |
| 4. The LLM generates an intelligent response using the visual data |
|
|
| **Capabilities:** |
| - Scene analysis and composition understanding |
| - Object detection and identification |
| - Text extraction (OCR) from images and screenshots |
| - Visual reasoning and Q&A |
| - Support for JPG, PNG, GIF, WebP, and BMP formats |
|
|
| Note: Rox Vision is not a separately selectable model. It is integrated into Rox Core, 2.1 Turbo, 3.5 Coder, 4.5 Turbo, and 5 Ultra, and activates automatically when images are uploaded. Rox 6 Dyno has its own native vision capabilities built-in. |
|
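The routing described in this section can be sketched as a small lookup. The model names and image formats come from the text above; `visionPathFor`, `isSupportedImage`, and the returned labels are hypothetical, and the `jpeg` extension is assumed as an alias for JPG.

```javascript
// Models this section says rely on the separate Rox Vision model.
const ROX_VISION_MODELS = [
  'Rox Core', 'Rox 2.1 Turbo', 'Rox 3.5 Coder', 'Rox 4.5 Turbo', 'Rox 5 Ultra',
];

// Decide how an uploaded image reaches the selected model.
function visionPathFor(modelName) {
  if (modelName === 'Rox 6 Dyno') return 'native'; // image goes straight to the model
  if (ROX_VISION_MODELS.includes(modelName)) return 'rox-vision'; // analyzed first, result passed on
  return 'unspecified'; // this section does not describe other models
}

// Formats listed under Capabilities ('jpeg' is an assumed alias for JPG).
const SUPPORTED_IMAGE_EXTENSIONS = ['jpg', 'jpeg', 'png', 'gif', 'webp', 'bmp'];

function isSupportedImage(fileName) {
  const ext = fileName.split('.').pop().toLowerCase();
  return SUPPORTED_IMAGE_EXTENSIONS.includes(ext);
}
```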
|
| ### Screen Share - Voice-Controlled Screen Interaction (Desktop Only) |
|
|
| Screen Share is an exclusive desktop feature that combines screen capture, voice recognition, AI vision, and text-to-speech for a hands-free AI experience. |
|
|
| **Key Features:** |
| - 🎤 **Voice Input**: Speak your questions naturally - no typing needed |
| - 👁️ **Visual Understanding**: AI sees and analyzes your screen content |
| - 🔊 **Voice Responses**: AI reads answers aloud automatically |
| - 💬 **Floating Window**: Real-time display of prompts and responses |
| - 🤖 **All Models**: Choose from all 7 Rox AI models |
|
|
| **How It Works:** |
| 1. Click the Screen Share button (monitor icon in header) |
| 2. Select your preferred AI model |
| 3. Grant screen sharing and microphone permissions |
| 4. Start speaking - your words appear in real-time |
| 5. After 2.5 seconds of silence, the AI automatically: |
|    - Captures a screenshot |
|    - Processes your question + screen content |
|    - Responds with text and voice |
|
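The 2.5-second silence rule in the final step is essentially a debounce. Here is a minimal sketch, assuming the speech recognizer reports each interim result through a callback; `createSilenceTrigger` and its parameters are hypothetical names.

```javascript
// Hypothetical sketch: call onSilence once speech has paused for delayMs.
// `timers` defaults to the global timer functions and can be replaced in tests.
function createSilenceTrigger(onSilence, delayMs = 2500, timers = globalThis) {
  let handle = null;
  return {
    // Call on every speech-recognition result; this restarts the countdown.
    onSpeech() {
      if (handle !== null) timers.clearTimeout(handle);
      handle = timers.setTimeout(() => {
        handle = null;
        onSilence(); // e.g. capture a screenshot and send the question
      }, delayMs);
    },
    // Call when the session ends so nothing fires afterwards.
    cancel() {
      if (handle !== null) timers.clearTimeout(handle);
      handle = null;
    },
  };
}
```

Restarting the timer on every result means the capture happens only after a continuous pause, not 2.5 seconds after the first word.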
|
| **Requirements:** |
| - Desktop computer (1024px+ width) |
| - Chrome 94+, Edge 94+, or Safari 13+ |
| - Microphone and screen sharing permissions |
| - Not available on mobile devices |
|
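A pre-flight check along these lines could verify the requirements before showing the Screen Share button. This sketch uses feature detection on the standard `navigator.mediaDevices` API rather than browser version sniffing; `screenShareSupport` is a hypothetical function, and the 1024px threshold is taken from the list above.

```javascript
// Hypothetical pre-flight check; in the browser, pass `window` and `navigator`.
function screenShareSupport(win, nav) {
  const reasons = [];
  if (!(win.innerWidth >= 1024)) {
    reasons.push('viewport narrower than 1024px');
  }
  if (typeof nav.mediaDevices?.getDisplayMedia !== 'function') {
    reasons.push('screen capture (getDisplayMedia) unavailable');
  }
  if (typeof nav.mediaDevices?.getUserMedia !== 'function') {
    reasons.push('microphone access (getUserMedia) unavailable');
  }
  return { supported: reasons.length === 0, reasons };
}
```

In a page this would be called as `screenShareSupport(window, navigator)`. Permissions themselves can only be confirmed when the user is prompted.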
|
| **Use Cases:** |
| - 💻 **Coding Help**: Share code editor, ask for debugging help |
| - 📚 **Learning**: Share tutorials, get explanations |
| - 🔧 **Troubleshooting**: Share error messages, get solutions |
| - 🎨 **Design Review**: Share mockups, get feedback |
| - 📊 **Data Analysis**: Share charts, get insights |
|
|
| **Documentation:** |
| - [User Guide](SCREENSHARE_USER_GUIDE.md) - Complete instructions |
| - [Technical Docs](SCREENSHARE_FEATURE.md) - Developer reference |
| - [Quick Reference](SCREENSHARE_QUICK_REFERENCE.md) - Cheat sheet |
|
|
| **Vision Processing:** |
| - **Rox 6 Dyno**: Native vision - processes screenshots directly (fastest) |
| - **Other Models**: Use Rox Vision for image analysis (automatic) |
|
|
|
|
| ### Live Internet Search |
| - Real-time web search for latest news, events, and information |
| - Multiple search sources with intelligent fallback |
| - Automatic detection of queries requiring live data |
| - Visual indicator shows when responses use internet data |
|
|
| ### Specialized API Integrations |
|
|
| Rox AI includes several free API integrations that provide specialized data without requiring API keys: |
|
|
| | API | Purpose | Trigger Examples | |
| |-----|---------|------------------| |
| | Open-Meteo | Weather forecasts and conditions | "Weather in Tokyo", "Temperature in New York" | |
| | Currency API | Live exchange rates | "Dollar to rupee", "USD to INR", "Exchange rate" | |
| | CoinGecko | Cryptocurrency prices | "Bitcoin price", "ETH to USD", "Dogecoin today" | |
| | TheSportsDB | Live sports scores | "IPL score", "RCB vs CSK", "NBA results", "Premier League" | |
| | Yahoo Finance | Stock market data | "Nifty today", "Reliance stock price", "Sensex live" | |
| | Open Library | Book information and search | "Books by Stephen King", "Who wrote 1984" | |
| | arXiv | Research papers and academic studies | "Research on machine learning", "Latest papers on quantum computing" | |
| | IP-API | Geolocation from IP addresses | "What is my IP", "My location" | |
| | GitHub | Repository and code search | "GitHub repos for React", "Code libraries for Python" | |
|
|
| **How Specialized APIs Work:** |
| 1. The user query is analyzed for specialized patterns |
| 2. If a pattern matches (weather, currency, crypto, sports, stocks, books, research, etc.), the appropriate API is called |
| 3. Results are formatted and returned directly for accurate, structured data |
| 4. If the specialized API fails, the system falls back to general web search |
|
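The four steps above can be sketched as a pattern table plus a fallback. The regular expressions below are illustrative guesses based on the trigger examples in the table; the actual matching rules are not documented here, and `matchSpecializedApi`, `answerQuery`, `callApi`, and `webSearch` are hypothetical names.

```javascript
// Illustrative patterns only, checked in order; the first match wins.
const SPECIALIZED_ROUTES = [
  { api: 'Open-Meteo', pattern: /\b(weather|temperature|forecast)\b/i },
  { api: 'CoinGecko', pattern: /\b(bitcoin|btc|eth|ethereum|dogecoin|crypto)\b/i },
  { api: 'Currency API', pattern: /\bexchange rate\b|\bdollar to \w+|\b(usd|eur|gbp|inr|jpy) to (usd|eur|gbp|inr|jpy)\b/i },
  { api: 'Yahoo Finance', pattern: /\b(stock|nifty|sensex)\b/i },
  { api: 'Open Library', pattern: /\b(books? by|who wrote)\b/i },
];

// Steps 1-2: find the first route whose pattern matches the query.
function matchSpecializedApi(query) {
  const route = SPECIALIZED_ROUTES.find((r) => r.pattern.test(query));
  return route ? route.api : null;
}

// Steps 3-4: call the matched API, falling back to web search on failure.
async function answerQuery(query, callApi, webSearch) {
  const api = matchSpecializedApi(query);
  if (api !== null) {
    try {
      return { source: api, data: await callApi(api, query) };
    } catch {
      // The specialized API failed, so fall through to general web search.
    }
  }
  return { source: 'web-search', data: await webSearch(query) };
}
```

Ordering matters in this sketch: "ETH to USD" is claimed by the crypto route before the currency route can see it.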
|
| **Real-Time Data (No Caching):** |
| Weather, Currency, Cryptocurrency, Sports, Stock, and IP queries always fetch fresh data - they are never cached to ensure accuracy. |
|
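One way to express the no-caching rule is a cache wrapper that bypasses storage for the real-time categories. This is a sketch under assumptions: the category labels, the 60-second TTL for other queries, and the `createCachedFetcher` name are all invented for illustration.

```javascript
// Categories this README says are always fetched fresh; the labels are illustrative.
const REAL_TIME_CATEGORIES = new Set(['weather', 'currency', 'crypto', 'sports', 'stocks', 'ip']);

// Hypothetical wrapper: real-time categories skip the cache entirely.
// ttlMs is an assumed lifetime for everything else.
function createCachedFetcher(fetchFresh, ttlMs = 60000, now = Date.now) {
  const cache = new Map(); // "category:key" -> { value, storedAt }
  return async function get(category, key) {
    if (REAL_TIME_CATEGORIES.has(category)) {
      return fetchFresh(category, key);
    }
    const cacheKey = `${category}:${key}`;
    const hit = cache.get(cacheKey);
    if (hit && now() - hit.storedAt < ttlMs) return hit.value;
    const value = await fetchFresh(category, key);
    cache.set(cacheKey, { value, storedAt: now() });
    return value;
  };
}
```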
|
| **Benefits:** |
| - More accurate and structured data for specific query types |
| - Faster responses for specialized queries |
| - No API keys required (100% free services) |
| - Automatic fallback ensures reliability |
|
|
| ### File Processing |
| Upload and analyze documents: |
| - PDF parsing with text extraction |
| - Word documents (.docx) with full text extraction |
| - Excel spreadsheets (.xlsx) |
| - PowerPoint presentations (.pptx) |
| - RTF documents |
| - Code files (60+ languages) |
| - Text and data files (CSV, JSON, YAML, XML, etc.) |
| - Images with Rox Vision analysis |
|
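Dispatching on file extension is one plausible way to support the formats listed above. In this sketch the handler labels, the abbreviated code-extension list, and the `handlerFor` and `processAll` functions are hypothetical; `Promise.all` stands in for the parallel file processing mentioned under performance.

```javascript
// Extension -> handler label. Labels are hypothetical; the code list is abbreviated
// (the README mentions 60+ languages).
const FILE_HANDLERS = new Map([
  ['pdf', 'pdf-text'], ['docx', 'word-text'], ['xlsx', 'spreadsheet'],
  ['pptx', 'presentation'], ['rtf', 'rtf-text'],
  ['txt', 'text'], ['csv', 'text'], ['json', 'text'], ['yaml', 'text'], ['xml', 'text'],
  ['jpg', 'vision'], ['png', 'vision'], ['gif', 'vision'], ['webp', 'vision'], ['bmp', 'vision'],
  ['js', 'code'], ['ts', 'code'], ['py', 'code'],
]);

// Returns the handler label for a file name, or null if the type is not recognized.
function handlerFor(fileName) {
  const dot = fileName.lastIndexOf('.');
  if (dot < 0) return null;
  return FILE_HANDLERS.get(fileName.slice(dot + 1).toLowerCase()) ?? null;
}

// Process all uploads concurrently; results keep the input order.
function processAll(files, processOne) {
  return Promise.all(files.map((file) => processOne(file, handlerFor(file.name))));
}
```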
|
| ### Modern UI/UX |
| - Dark/Light theme toggle |
| - Smooth animations |
| - Mobile-responsive design |
| - Code syntax highlighting |
| - Math rendering with KaTeX |
|
|
| ### Advanced Features |
| - DeepResearch mode for comprehensive analysis (Rox 5 Ultra, Rox 6 Dyno, and Rox 7 Coder) |
| - **Screen Share (Desktop Only)** - Share your screen and interact with AI using voice commands |
| - Conversation history with persistence |
| - Message editing and regeneration |
| - Text-to-speech for responses |
| - PDF export with DeepResearch badge |
| - Keyboard shortcuts |
| - PWA support (installable) |
|
|
| ## Deployment |
|
|
| Deploy to Hugging Face Spaces using Docker, or run locally: |
|
|
| ```bash |
| # Local development |
| npm install |
| cp .env.example .env # Configure your NVIDIA API key |
| npm start |
| ``` |
|
|
| ### Environment Variables |
|
|
| | Variable | Required | Description | |
| |----------|----------|-------------| |
| | `NVIDIA_API_KEY` | Yes | Your NVIDIA API key from build.nvidia.com | |
| | `PORT` | No | Server port (default: 7860) | |
| | `HOST` | No | Server host (default: 0.0.0.0) | |
| | `NODE_ENV` | No | Environment mode (production/development) | |
|
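A startup routine might read these variables roughly as follows. The variable names and the `PORT` and `HOST` defaults come from the table; the `loadConfig` function, the returned key names, and the `development` fallback for `NODE_ENV` are assumptions.

```javascript
// Hypothetical loader: reads the documented variables and applies the documented defaults.
function loadConfig(env = process.env) {
  if (!env.NVIDIA_API_KEY) {
    throw new Error('NVIDIA_API_KEY is required');
  }
  const port = Number.parseInt(env.PORT ?? '7860', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    nvidiaApiKey: env.NVIDIA_API_KEY,
    port,
    host: env.HOST ?? '0.0.0.0',
    nodeEnv: env.NODE_ENV ?? 'development', // assumed default; the table gives none
  };
}
```

Failing fast on a missing key surfaces configuration mistakes at startup instead of on the first chat request.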
|
| ## API Endpoints |
|
|
| | Endpoint | Method | Description | |
| |----------|--------|-------------| |
| | `/api/chat` | POST | Send messages to AI | |
| | `/api/health` | GET | Health check with system info | |
| | `/api/models` | GET | List available models | |
| | `/api/version` | GET | Get server version | |
|
|
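The GET endpoints can be exercised with a few lines of client code. Only the paths and methods come from the table; the `getJson` helper is hypothetical, and the response bodies are not documented here, so the sketch simply returns whatever JSON the server sends.

```javascript
// Hypothetical client helper for the GET endpoints listed above.
// fetchImpl defaults to the global fetch (Node 18+ or a browser).
async function getJson(baseUrl, path, fetchImpl = fetch) {
  const response = await fetchImpl(new URL(path, baseUrl), { method: 'GET' });
  if (!response.ok) {
    throw new Error(`GET ${path} failed with status ${response.status}`);
  }
  return response.json();
}
```

Against a local instance on the default port, `getJson('http://localhost:7860', '/api/health')` would resolve to the health payload.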
| ## Security |
|
|
| - Input validation and sanitization |
| - XSS protection |
| - CORS configuration |
| - Rate limiting (1000 req/min) |
| - Security headers (CSP, HSTS, etc.) |
| - No sensitive data logging |
| - Non-root Docker user |
|
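For reference, a limit of 1000 requests per minute can be enforced with a fixed-window counter per client. This is a minimal in-memory sketch and not necessarily how Rox AI implements it; `createRateLimiter` and the client key are hypothetical.

```javascript
// Hypothetical in-memory limiter: at most `limit` requests per client per window.
function createRateLimiter(limit = 1000, windowMs = 60000, now = Date.now) {
  const windows = new Map(); // clientId -> { start, count }
  return function allow(clientId) {
    const t = now();
    const current = windows.get(clientId);
    // Start a new window on the first request or once the old window has expired.
    if (!current || t - current.start >= windowMs) {
      windows.set(clientId, { start: t, count: 1 });
      return true;
    }
    if (current.count >= limit) return false; // caller would respond with HTTP 429
    current.count += 1;
    return true;
  };
}
```

A fixed window is the simplest scheme; it permits short bursts at window boundaries, which a sliding window or token bucket would smooth out.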
|
| ## License |
|
|
| MIT License |
|
|
| --- |
|
|
| Built by Mohammad Faiz, CEO & Founder of Rox AI Technologies |
|
|