---
license: apache-2.0
base_model:
- openai/gpt-oss-120b
- deepseek-ai/DeepSeek-V3.1
tags:
- chat-interface
- gpt-oss-120b-chat-interface
- mlx
- chat-ui
- local-ai
- pyqt5
---
# Model Card: GPT-OSS-120B Chat Interface
## Model Description
This is a modern, feature-rich chat interface for the GPT-OSS-120B model running on the Apple MLX framework. The interface provides a user-friendly way to interact with the 120-billion-parameter open-source language model locally on Apple Silicon hardware.
## Model Overview
- **Model Name:** GPT-OSS-120B (4-bit quantized)
- **Framework:** Apple MLX
- **Interface:** PyQt5-based desktop application
- **Hardware Requirements:** Apple Silicon with sufficient RAM (recommended: M3 Ultra with 512GB RAM)
## Features
- 🎨 Modern, responsive UI with PyQt5
- 💬 Real-time chat interface with message history
- ⚡ Local inference on Apple Silicon
- 📝 Markdown support with syntax highlighting
- 💾 Conversation export functionality
- ⚙️ Adjustable generation parameters
- 🎯 Code block detection and formatting
- 📊 Performance monitoring
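The code-block detection feature above can be illustrated with a small sketch. This is a hypothetical helper, not code from `gpt_oss_ui.py`: it scans a model reply for fenced code blocks and splits it into plain-text and code segments that a widget could render differently.

```python
import re

# Three backticks, built dynamically so this document's own fencing stays intact.
FENCE = chr(96) * 3
FENCE_RE = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)

def split_response(text):
    """Split a model reply into ("text", body) and ("code", lang, body) segments.

    Hypothetical helper; the actual widget logic in gpt_oss_ui.py may differ.
    """
    segments, pos = [], 0
    for match in FENCE_RE.finditer(text):
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        segments.append(("code", match.group(1) or "plain", match.group(2)))
        pos = match.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments
```

A reply containing one fenced Python snippet splits into a text segment, a `("code", "python", ...)` segment, and any trailing text, which is enough to drive per-segment formatting and syntax highlighting.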
## UI Architecture Diagram
```
+--------------------------------------------------------------------------+
|                               MAIN WINDOW                                |
+-------------------------------+------------------------------------------+
|          LEFT PANEL           |                CHAT AREA                 |
|                               |                                          |
|  +-------------------------+  |  +------------------------------------+  |
|  | MODEL INFO              |  |  |                                    |  |
|  | - Model details         |  |  |  +------------------------------+  |  |
|  | - Hardware specs        |  |  |  | CHAT MESSAGE (User)          |  |  |
|  | - Performance metrics   |  |  |  | - Avatar + timestamp         |  |  |
|  +-------------------------+  |  |  | - Formatted content          |  |  |
|                               |  |  +------------------------------+  |  |
|  +-------------------------+  |  |                                    |  |
|  | GENERATION SETTINGS     |  |  |  +------------------------------+  |  |
|  | - Max tokens control    |  |  |  | CHAT MESSAGE (Assistant)     |  |  |
|  +-------------------------+  |  |  | - Avatar + timestamp         |  |  |
|                               |  |  | - Formatted content          |  |  |
|  +-------------------------+  |  |  | - Generation time            |  |  |
|  | CONVERSATION TOOLS      |  |  |  +------------------------------+  |  |
|  | - Clear conversation    |  |  |  ...                               |  |
|  | - Export chat           |  |  |  +------------------------------+  |  |
|  +-------------------------+  |  |  | INPUT AREA                   |  |  |
|                               |  |  | - Multi-line text input      |  |  |
|  +-------------------------+  |  |  | - Character counter          |  |  |
|  | STATUS INDICATOR        |  |  |  | - Send button                |  |  |
|  | - Loading/ready state   |  |  |  +------------------------------+  |  |
|  +-------------------------+  |  |                                    |  |
|                               |  +------------------------------------+  |
+-------------------------------+------------------------------------------+
```
## Development
This interface was developed using PyQt5 and integrates with the MLX-LM library for efficient inference on Apple Silicon. The UI features a responsive design with:
1. **Threaded Operations:** Model loading and text generation run in background threads
2. **Custom Widgets:** Specialized chat message widgets with formatting
3. **Syntax Highlighting:** Code detection and highlighting in responses
4. **Modern Styling:** Clean, professional interface with appropriate spacing and colors
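The threaded pattern in point 1 is what keeps the UI responsive: generation runs off the main (UI) thread and the result is handed back through a queue. Below is a minimal standard-library sketch of the same idea; the real app uses Qt's threading primitives, and `fake_generate` is a stand-in for the blocking MLX inference call.

```python
import queue
import threading

def fake_generate(prompt):
    # Stand-in for the blocking MLX inference call (hypothetical).
    return f"echo: {prompt}"

def generate_in_background(prompt, results):
    """Worker target: run generation off the UI thread, then post the result."""
    results.put(fake_generate(prompt))

results = queue.Queue()
worker = threading.Thread(
    target=generate_in_background, args=("hello", results), daemon=True
)
worker.start()

# A real UI would poll the queue with a timer or use a Qt signal instead of
# blocking; joining here just keeps the demo deterministic.
worker.join()
reply = results.get()
print(reply)
```

The same structure applies to model loading: the expensive call runs in a worker, and the status indicator flips from "loading" to "ready" when the result arrives.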
## DeepSeek Involvement
The `gpt_oss_ui.py` Python script was created with assistance from DeepSeek's AI models, which helped design the architecture, implement the PyQt5 interface components, and ensure proper integration with the MLX inference backend.
## Usage
1. Install requirements: `pip install PyQt5 markdown mlx-lm`
2. Run the application: `python gpt_oss_ui.py`
3. Wait for the model to load (the first run downloads the model weights)
4. Start chatting with the GPT-OSS-120B model
## Performance
On an M3 Ultra with 512GB RAM:
- Model load time: ~2-3 minutes (first time)
- Inference speed: ~95 tokens/second
- Memory usage: reduced by 4-bit quantization (roughly a quarter of a 16-bit footprint)
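At the quoted ~95 tokens/second, response latency is easy to estimate. The sketch below is simple arithmetic based on the figure above, not a benchmark:

```python
TOKENS_PER_SECOND = 95  # throughput reported above for an M3 Ultra

def estimated_latency(num_tokens, tps=TOKENS_PER_SECOND):
    """Rough wall-clock seconds to stream num_tokens at a steady rate."""
    return num_tokens / tps

# A 500-token reply takes roughly 5.3 seconds at 95 tok/s.
print(round(estimated_latency(500), 1))
```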
## Limitations
- Requires significant RAM for the 120B parameter model
- Currently only supports Apple Silicon hardware
- Model loading can be time-consuming on first run
## Ethical Considerations
This interface is designed for local use, ensuring privacy as all processing happens on-device. Users should still follow responsible AI practices when using the model.