---
license: apache-2.0
base_model:
- openai/gpt-oss-120b
- deepseek-ai/DeepSeek-V3.1
tags:
- chat-interface
- gpt-oss-120b-chat-interface
- mlx
- chat-ui
- local-ai
- python
- pyqt5
---

# Model Card: GPT-OSS-120B Chat Interface

## Model Description

This is a modern, feature-rich chat interface for the GPT-OSS-120B model running on Apple's MLX framework. The interface provides a user-friendly way to interact with the 120-billion-parameter open-source language model locally on Apple Silicon hardware.

## Model Overview

- **Model Name:** GPT-OSS-120B (4-bit quantized)
- **Framework:** Apple MLX
- **Interface:** PyQt5-based desktop application
- **Hardware Requirements:** Apple Silicon with sufficient RAM (recommended: M3 Ultra with 512GB RAM)

## Features

- 🎨 Modern, responsive UI built with PyQt5
- 💬 Real-time chat interface with message history
- ⚡ Local inference on Apple Silicon
- 📝 Markdown support with syntax highlighting
- 💾 Conversation export functionality
- ⚙️ Adjustable generation parameters
- 🎯 Code block detection and formatting
- 📊 Performance monitoring

## UI Architecture Diagram

```
┌───────────────────────────────────────────────────────────────┐
│                          MAIN WINDOW                          │
├───────────────────────────┬───────────────────────────────────┤
│        LEFT PANEL         │             CHAT AREA             │
│                           │                                   │
│  ┌─────────────────────┐  │  ┌─────────────────────────────┐  │
│  │ MODEL INFO          │  │  │ CHAT MESSAGE (User)         │  │
│  │ - Model details     │  │  │ - Avatar + timestamp        │  │
│  │ - Hardware specs    │  │  │ - Formatted content         │  │
│  │ - Performance       │  │  └─────────────────────────────┘  │
│  │   metrics           │  │                                   │
│  └─────────────────────┘  │  ┌─────────────────────────────┐  │
│                           │  │ CHAT MESSAGE (Assistant)    │  │
│  ┌─────────────────────┐  │  │ - Avatar + timestamp        │  │
│  │ GENERATION SETTINGS │  │  │ - Formatted content         │  │
│  │ - Max tokens control│  │  │ - Generation time           │  │
│  └─────────────────────┘  │  └─────────────────────────────┘  │
│                           │                ...                │
│  ┌─────────────────────┐  │                                   │
│  │ CONVERSATION TOOLS  │  │  ┌─────────────────────────────┐  │
│  │ - Clear conversation│  │  │ INPUT AREA                  │  │
│  │ - Export chat       │  │  │ - Multi-line text input     │  │
│  └─────────────────────┘  │  │ - Character counter         │  │
│                           │  │ - Send button               │  │
│  ┌─────────────────────┐  │  └─────────────────────────────┘  │
│  │ STATUS INDICATOR    │  │                                   │
│  │ - Loading/ready     │  │                                   │
│  └─────────────────────┘  │                                   │
└───────────────────────────┴───────────────────────────────────┘
```

## Development

This interface was developed with PyQt5 and integrates with the MLX-LM library for efficient inference on Apple Silicon. The UI features a responsive design with:

1. **Threaded Operations:** Model loading and text generation run in background threads
2. **Custom Widgets:** Specialized chat message widgets with formatting
3. **Syntax Highlighting:** Code detection and highlighting in responses
4. **Modern Styling:** Clean, professional interface with appropriate spacing and colors

## DeepSeek Involvement

The `gpt_oss_ui.py` Python script was created with assistance from DeepSeek's AI models, which helped design the architecture, implement the PyQt5 interface components, and ensure proper integration with the MLX inference backend.

## Usage

1. Install the requirements: `pip install PyQt5 markdown mlx-lm`
2. Run the application: `python gpt_oss_ui.py`
3. Wait for the model to load (the first run will download the model)
4. Start chatting with the GPT-OSS-120B model

## Performance

On an M3 Ultra with 512GB RAM:

- Model load time: ~2-3 minutes (first run)
- Inference speed: ~95 tokens/second
- Memory usage: optimized with 4-bit quantization

## Limitations

- Requires significant RAM for the 120B-parameter model
- Currently supports only Apple Silicon hardware
- Model loading can be time-consuming on the first run

## Ethical Considerations

This interface is designed for local use, ensuring privacy because all processing happens on-device. Users should still follow responsible AI practices when using the model.
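## Sketch: Background Generation Thread

The "threaded operations" pattern described under Development — keeping model loading and text generation off the UI thread — can be sketched with the standard library alone. The real application uses PyQt5's `QThread` and signals for this; `generate_fn` below is a stand-in for the MLX-LM generation call, not the app's actual API.

```python
# Minimal sketch of running generation in a background thread and
# handing the result back via a queue. Assumption: generate_fn is any
# callable taking a prompt string and returning the generated text.
import threading
import queue

def run_generation_in_background(generate_fn, prompt, results: queue.Queue):
    """Run generate_fn(prompt) off the UI thread and post the result."""
    def worker():
        try:
            results.put(("ok", generate_fn(prompt)))
        except Exception as exc:  # report failures back to the UI thread
            results.put(("error", str(exc)))
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

# Usage with a stand-in generator (uppercases the prompt):
results = queue.Queue()
thread = run_generation_in_background(lambda p: p.upper(), "hello", results)
thread.join()
status, text = results.get()
```

In the PyQt5 version, the queue hand-off is replaced by emitting a `pyqtSignal` so the result arrives on the GUI event loop.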
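## Sketch: Code Block Detection

The "code block detection and formatting" feature listed above amounts to splitting an assistant reply into plain-text and fenced-code segments before rendering. This is an illustrative sketch, not the app's actual implementation; the function name and segment format are assumptions.

```python
# Split a markdown reply into ('text', s) and ('code', lang, s) segments
# by matching triple-backtick fences with an optional language tag.
import re

FENCE = re.compile(r"```(\w*)\n(.*?)```", re.DOTALL)

def split_code_blocks(reply: str):
    """Return the reply as a list of text and code segments, in order."""
    segments, pos = [], 0
    for m in FENCE.finditer(reply):
        if m.start() > pos:
            segments.append(("text", reply[pos:m.start()]))
        segments.append(("code", m.group(1) or "plain", m.group(2)))
        pos = m.end()
    if pos < len(reply):
        segments.append(("text", reply[pos:]))
    return segments

parts = split_code_blocks("Here:\n```python\nprint('hi')\n```\nDone.")
```

Each `code` segment can then be routed to a syntax highlighter while `text` segments go through the markdown renderer.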
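## Sketch: Conversation Export

The "conversation export" feature can be as simple as serializing the message history to a file. The field names (`role`, `content`, `timestamp`) below are assumptions about the app's internal message format, chosen for illustration.

```python
# Write the chat history to a JSON file so a conversation can be saved
# and re-opened later. Assumes messages is a list of plain dicts.
import json

def export_conversation(messages, path):
    """Serialize the chat history to JSON and return the output path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"messages": messages}, f, ensure_ascii=False, indent=2)
    return path

history = [
    {"role": "user", "content": "Hello", "timestamp": "2025-01-01T12:00:00"},
    {"role": "assistant", "content": "Hi!", "timestamp": "2025-01-01T12:00:05"},
]
out = export_conversation(history, "conversation.json")
```

A markdown export would follow the same shape, formatting each message as a heading plus body instead of JSON.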