---
library_name: mlx
tags:
- mlx
- text-generation
- apple-silicon
- quantized
base_model: uaytug/uCoder-8b-base
license: apache-2.0
datasets:
- uaytug/ucoder-reasoning-ds
---

# uCoder-8b-base-mlx

This is an [MLX](https://github.com/ml-explore/mlx) format conversion of [uaytug/uCoder-8b-base](https://huggingface.co/uaytug/uCoder-8b-base) for efficient inference on Apple Silicon devices.

## Available Quantizations

This repository contains the following quantization options:

| Folder | Bits | Description |
|--------|------|-------------|
| `4bit/` | 4-bit | Smallest size, fastest inference |

## Quick Start

### Installation

```bash
pip install mlx-lm
```

### Usage

```python
from huggingface_hub import snapshot_download
from mlx_lm import load, generate

# Each quantization lives in its own subfolder of the repo, so download
# only the 4-bit weights (smallest, fastest) and load that folder directly.
# Note: mlx_lm's `adapter_path` argument is for LoRA adapters, not for
# selecting quantization subfolders.
local_dir = snapshot_download("uaytug/uCoder-8b-base-mlx", allow_patterns=["4bit/*"])
model, tokenizer = load(f"{local_dir}/4bit")

# Generate text
prompt = "def fibonacci(n):"
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```

### Command Line

```bash
# Download the 4-bit weights into a local folder
huggingface-cli download uaytug/uCoder-8b-base-mlx --include "4bit/*" --local-dir uCoder-8b-base-mlx

# Generate with the 4-bit model
mlx_lm.generate --model uCoder-8b-base-mlx/4bit --prompt "def hello_world():"

# Chat mode
mlx_lm.chat --model uCoder-8b-base-mlx/4bit
```

## Performance

MLX provides optimized inference on Apple Silicon (M1/M2/M3/M4) with:
- Unified memory architecture utilization
- Metal GPU acceleration
- Efficient memory management

## Memory Requirements (Approximate)

| Quantization | Memory Usage |
|--------------|--------------|
| 4-bit | ~4 GB |
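
The figure above follows directly from parameter count times bits per weight. A quick sanity check (weights only; the KV cache, activations, and quantization metadata such as scales add overhead on top of this):

```python
def quantized_size_gb(n_params: float, bits: int) -> float:
    """Approximate weight memory: parameters * bits per weight, in gigabytes."""
    return n_params * bits / 8 / 1e9

# 8B parameters at 4 bits per weight
print(quantized_size_gb(8e9, 4))  # → 4.0
```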

## Model Details

- **Base Model**: [uaytug/uCoder-8b-base](https://huggingface.co/uaytug/uCoder-8b-base)
- **Architecture**: Qwen3
- **Parameters**: 8B
- **Framework**: MLX

## Original Model Information


# uCoder-8b-base

![Model Architecture](https://img.shields.io/badge/Model-Qwen3--8B-blue) ![Task](https://img.shields.io/badge/Task-Coding-green) ![License](https://img.shields.io/badge/License-Apache_2.0-red) ![Method](https://img.shields.io/badge/Method-TIES_Merge-orange)

**uCoder-8b-base** is a coding-specialized 8B parameter model created by TIES-merging five high-quality distilled models based on **Qwen3-8B**. This merge is designed to combine advanced reasoning capabilities with state-of-the-art coding performance, making it an ideal base for further instruction tuning or direct code generation tasks.

## 🚀 Model Description

This model leverages the **TIES (Trim, Elect Sign & Merge)** method to combine the weights of multiple expert models without losing the specific competencies of each. By normalizing the weights and focusing on high-reasoning distillations from top-tier frontier models (GPT-5.x, Claude 4.5, etc.), uCoder-8b-base achieves a robust balance between logic and syntax accuracy.
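
The three TIES steps operate on *task vectors* (each expert's weight delta from the base model). A minimal NumPy sketch of the idea, purely illustrative and not the actual merge pipeline used for this model:

```python
import numpy as np

def ties_merge(base: np.ndarray, experts: list[np.ndarray], density: float = 0.5) -> np.ndarray:
    """Merge expert weights into the base via Trim, Elect Sign, and Merge."""
    # Task vectors: each expert's delta from the base weights
    deltas = [w - base for w in experts]

    # 1) Trim: keep only the top-`density` fraction of each delta by magnitude
    trimmed = []
    for d in deltas:
        k = int(d.size * (1 - density))
        threshold = np.sort(np.abs(d).ravel())[k] if k > 0 else 0.0
        trimmed.append(np.where(np.abs(d) >= threshold, d, 0.0))

    # 2) Elect sign: per parameter, pick the sign with the larger total magnitude
    stacked = np.stack(trimmed)
    elected = np.sign(stacked.sum(axis=0))

    # 3) Merge: average only the deltas that agree with the elected sign
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = (stacked * agree).sum(axis=0) / counts

    return base + merged_delta
```

Electing a sign per parameter before averaging is what lets TIES avoid the destructive interference that plain weight averaging suffers when experts disagree on a parameter's direction.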

### Key Features
* **High Reasoning:** Inherits logic handling from Claude and GPT-based distills.
* **Polyglot Coding:** Proficient in Python, JavaScript, C++, Rust, and other major languages.
* **Base Model:** Built on the powerful Qwen3-8B architecture.
* **Efficient:** 8B size allows for local inference on consumer hardware (12GB+ VRAM recommended for FP16, less for quantized).

## 🧩 Merged Models

The following models were merged using equal weights to create uCoder-8b-base:

| Model Name | Primary Contribution |
| :--- | :--- |
| **Qwen3 8B GPT 5.2 High Reasoning Distill** | Advanced logic & multi-step reasoning |
| **Qwen3 8B Claude 4.5 Opus High Reasoning Distill** | Safe code generation & detailed explanations |
| **Qwen3 8B Gemini 3 Pro Preview Distill** | Long-context handling & creative solutions |
| **Qwen3 8B DeepSeek v3.2 Speciale Distill** | Mathematical problem solving & optimization |
| **Qwen3 8B GPT 5 Codex Distill** | Syntax accuracy & API implementation |

## Limitations

* **Base Model Nature:** This is a base model (merge), not fully instruction-tuned for chat. While it can handle chat formats, it performs best when fine-tuned or given specific few-shot examples.
* **Coding Focus:** While capable of general reasoning, its domain expertise is heavily skewed towards programming and technical tasks.
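
Since the model is not instruction-tuned, a few-shot completion prompt tends to work better than a bare chat instruction. The helper below is hypothetical; it simply assembles task/solution pairs into one completion prompt that the model continues:

```python
# Hypothetical helper: build a few-shot completion prompt for a base model
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    parts = []
    for task, solution in examples:
        parts.append(f"# Task: {task}\n{solution}\n")
    parts.append(f"# Task: {query}\n")  # the model completes this last task
    return "\n".join(parts)

prompt = few_shot_prompt(
    [("reverse a string", "def reverse(s):\n    return s[::-1]")],
    "check if a number is even",
)
# Pass `prompt` to generate() as shown in Quick Start
```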

## License

This model is released under the **Apache 2.0** license.