---
library_name: mlx
tags:
- mlx
- text-generation
- apple-silicon
- quantized
base_model: uaytug/uCoder-8b-base
license: apache-2.0
datasets:
- uaytug/ucoder-reasoning-ds
---

# uCoder-8b-base-mlx

This is an [MLX](https://github.com/ml-explore/mlx) format conversion of [uaytug/uCoder-8b-base](https://huggingface.co/uaytug/uCoder-8b-base) for efficient inference on Apple Silicon devices.

## Available Quantizations

This repository currently provides the following quantization:

| Folder | Bits | Description |
|--------|------|-------------|
| `8bit/` | 8-bit | Higher quality, larger size |

## Quick Start

### Installation

```bash
pip install mlx-lm
```

### Usage

```python
from huggingface_hub import snapshot_download
from mlx_lm import load, generate

# Download the 8-bit quantization (it lives in the `8bit/` subfolder of the
# repo). Note: `adapter_path` in mlx_lm is for LoRA adapters, not for
# selecting a quantization, so the subfolder is fetched explicitly.
repo_dir = snapshot_download("uaytug/uCoder-8b-base-mlx", allow_patterns=["8bit/*"])
model, tokenizer = load(f"{repo_dir}/8bit")

# Generate text
prompt = "def fibonacci(n):"
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```

### Command Line

```bash
# Fetch the 8-bit weights locally (they live in the `8bit/` subfolder)
huggingface-cli download uaytug/uCoder-8b-base-mlx --include "8bit/*" --local-dir uCoder-8b-base-mlx

# Generate with the 8-bit model
mlx_lm.generate --model uCoder-8b-base-mlx/8bit --prompt "def hello_world():"

# Chat mode
mlx_lm.chat --model uCoder-8b-base-mlx/8bit
```

## Performance

MLX provides optimized inference on Apple Silicon (M1/M2/M3/M4) with:

- Unified memory architecture utilization
- Metal GPU acceleration
- Efficient memory management

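To check throughput on your own machine, `mlx_lm` can report generation speed directly. A minimal sketch, assuming the `8bit/` layout above (`verbose=True` prints prompt/generation tokens-per-second and peak memory after the run):

```python
from huggingface_hub import snapshot_download
from mlx_lm import load, generate

# verbose=True makes mlx_lm print generation speed and peak memory.
repo_dir = snapshot_download("uaytug/uCoder-8b-base-mlx", allow_patterns=["8bit/*"])
model, tokenizer = load(f"{repo_dir}/8bit")
generate(model, tokenizer, prompt="def quicksort(arr):", max_tokens=128, verbose=True)
```
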
## Memory Requirements (Approximate)

| Quantization | Memory Usage |
|--------------|--------------|
| 8-bit | ~8 GB |

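The figure follows from the parameter count: at 8 bits (1 byte) per weight, 8B parameters occupy roughly 8 GB before KV cache and runtime overhead. A quick back-of-the-envelope check:

```python
def approx_weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Weight-only footprint; KV cache and runtime overhead come on top."""
    return n_params * bits_per_weight / 8 / 1e9

print(f"{approx_weight_memory_gb(8e9, 8):.1f} GB")  # ~8.0 GB for 8-bit weights
```
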
## Model Details

- **Base Model**: [uaytug/uCoder-8b-base](https://huggingface.co/uaytug/uCoder-8b-base)
- **Architecture**: Qwen3
- **Parameters**: 8B
- **Framework**: MLX

## Original Model Information

# uCoder-8b-base

**uCoder-8b-base** is a coding-specialized 8B-parameter model created by TIES-merging five high-quality distilled models based on **Qwen3-8B**. The merge is designed to combine advanced reasoning capabilities with state-of-the-art coding performance, making it an ideal base for further instruction tuning or direct code-generation tasks.

## 🚀 Model Description

This model leverages the **TIES (TrIm, Elect Sign & Merge)** method to combine the weights of multiple expert models without losing the specific competencies of each. By normalizing the weights and focusing on high-reasoning distillations from top-tier frontier models (GPT-5.x, Claude 4.5, etc.), uCoder-8b-base achieves a robust balance between logic and syntax accuracy. A rough sketch of the procedure follows.

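For intuition, here is a minimal NumPy sketch of the TIES procedure: trim each task vector to its largest-magnitude entries, elect a per-parameter sign, then average the values that agree with it. This is an illustrative toy, not the actual recipe or hyperparameters used for this merge:

```python
import numpy as np

def ties_merge(task_vectors, density=0.2):
    """Toy TIES merge over flat task vectors (fine-tuned weights minus base)."""
    trimmed = []
    for tv in task_vectors:
        # Trim: zero out all but the top-`density` fraction by magnitude.
        k = max(1, int(density * tv.size))
        threshold = np.partition(np.abs(tv), -k)[-k]
        trimmed.append(np.where(np.abs(tv) >= threshold, tv, 0.0))
    stacked = np.stack(trimmed)
    # Elect: choose the dominant sign per parameter, weighted by magnitude.
    elected = np.sign(stacked.sum(axis=0))
    # Merge: average only the values whose sign matches the elected one.
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    return (stacked * agree).sum(axis=0) / counts

# merged = base + ties_merge([w - base for w in expert_weights])
```
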
### Key Features

* **High Reasoning:** Inherits logic handling from the Claude- and GPT-based distills.
* **Polyglot Coding:** Proficient in Python, JavaScript, C++, Rust, and other major languages.
* **Base Model:** Built on the powerful Qwen3-8B architecture.
* **Efficient:** The 8B size allows local inference on consumer hardware (12 GB+ VRAM recommended for FP16, less when quantized).

## 🧩 Merged Models

The following models were merged using equal weights to create uCoder-8b-base:

| Model Name | Primary Contribution |
| :--- | :--- |
| **Qwen3 8B GPT 5.2 High Reasoning Distill** | Advanced logic & multi-step reasoning |
| **Qwen3 8B Claude 4.5 Opus High Reasoning Distill** | Safe code generation & detailed explanations |
| **Qwen3 8B Gemini 3 Pro Preview Distill** | Long-context handling & creative solutions |
| **Qwen3 8B DeepSeek v3.2 Speciale Distill** | Mathematical problem solving & optimization |
| **Qwen3 8B GPT 5 Codex Distill** | Syntax accuracy & API implementation |

## Limitations

* **Base-Model Nature:** This is a base model (merge), not instruction-tuned for chat. While it can handle chat formats, it performs best when fine-tuned or given specific few-shot examples, as in the sketch below.
* **Coding Focus:** While capable of general reasoning, its domain expertise is heavily skewed toward programming and technical tasks.

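A minimal illustration of completion-style few-shot prompting (the example functions here are hypothetical):

```python
from huggingface_hub import snapshot_download
from mlx_lm import load, generate

repo_dir = snapshot_download("uaytug/uCoder-8b-base-mlx", allow_patterns=["8bit/*"])
model, tokenizer = load(f"{repo_dir}/8bit")

# Two worked examples, then the real task, completion-style.
prompt = '''# Task: implement each function with a one-line docstring.
def add(a, b):
    """Return the sum of a and b."""
    return a + b

def is_even(n):
    """Return True if n is divisible by 2."""
    return n % 2 == 0

def reverse_words(sentence):
'''
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```
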
## License

This model is released under the **Apache 2.0** license.