---
language:
- en
- code
license: apache-2.0
tags:
- merge
- mergekit
- ties
- text-generation
- programming
library_name: transformers
pipeline_tag: text-generation
datasets:
- uaytug/ucoder-reasoning-ds
---
# uCoder-8b-base
   
**uCoder-8b-base** is a coding-specialized 8B parameter model created by TIES-merging five high-quality distilled models based on **Qwen3-8B**. This merge is designed to combine advanced reasoning capabilities with state-of-the-art coding performance, making it an ideal base for further instruction tuning or direct code generation tasks.
## 🚀 Model Description
This model leverages the **TIES (TrIm, Elect Sign & Merge)** merging method to combine the weights of multiple expert models while preserving the specific competencies of each. By normalizing the weights and focusing on high-reasoning distillations from top-tier frontier models (GPT-5.x, Claude 4.5, etc.), uCoder-8b-base achieves a robust balance between logical reasoning and syntactic accuracy.
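The merge itself was produced with mergekit, but the three TIES steps are easy to illustrate on toy NumPy arrays. The sketch below is a simplified re-implementation for intuition only, not mergekit's actual code; `density` (the fraction of entries kept during trimming) is a parameter name borrowed from mergekit's convention.

```python
import numpy as np

def ties_merge(task_vectors, density=0.5):
    """Merge task vectors (fine-tuned weights minus base weights) with TIES.

    Simplified illustration of the three steps: trim, elect sign, merge.
    """
    # 1) Trim: keep only the top `density` fraction of entries by magnitude,
    #    zeroing the rest of each task vector.
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(np.ceil(density * tv.size)))
        threshold = np.sort(np.abs(tv), axis=None)[-k]  # k-th largest magnitude
        trimmed.append(np.where(np.abs(tv) >= threshold, tv, 0.0))
    trimmed = np.stack(trimmed)

    # 2) Elect sign: per parameter, the sign with the larger total magnitude
    #    across models wins (sign of the sum of trimmed values).
    elected = np.sign(trimmed.sum(axis=0))
    elected[elected == 0] = 1.0  # break exact ties toward positive

    # 3) Disjoint merge: average only the entries that agree with the
    #    elected sign, so conflicting updates do not cancel each other.
    agree = (np.sign(trimmed) == elected) & (trimmed != 0)
    counts = agree.sum(axis=0)
    return np.where(counts > 0,
                    (trimmed * agree).sum(axis=0) / np.maximum(counts, 1),
                    0.0)
```

The merged task vector would then be added back onto the shared Qwen3-8B base weights; the sign-election step is what lets TIES combine five experts without their conflicting weight updates averaging out to noise.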
### Key Features
* **High Reasoning:** Inherits logic handling from Claude and GPT-based distills.
* **Polyglot Coding:** Proficient in Python, JavaScript, C++, Rust, and other major languages.
* **Base Model:** Built on the powerful Qwen3-8B architecture.
* **Efficient:** 8B size allows for local inference on consumer hardware (12GB+ VRAM recommended for FP16, less for quantized).
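For direct code generation, a minimal sketch with 🤗 Transformers follows. The hub repo id is assumed from the model name and may differ; as noted below, few-shot or code-completion prompts work better than chat prompts for this base merge.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "uaytug/uCoder-8b-base"  # assumed repo id -- adjust to the actual hub path

def generate_code(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and complete a code prompt (completion-style, not chat)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # FP16/BF16 on GPU, FP32 on CPU
        device_map="auto",    # requires the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example (downloads ~16 GB of weights on first run):
# print(generate_code("def fibonacci(n: int) -> int:\n"))
```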
## 🧩 Merged Models
The following models were merged using equal weights to create uCoder-8b-base:
| Model Name | Primary Contribution |
| :--- | :--- |
| **Qwen3 8B GPT 5.2 High Reasoning Distill** | Advanced logic & multi-step reasoning |
| **Qwen3 8B Claude 4.5 Opus High Reasoning Distill** | Safe code generation & detailed explanations |
| **Qwen3 8B Gemini 3 Pro Preview Distill** | Long-context handling & creative solutions |
| **Qwen3 8B DeepSeek v3.2 Speciale Distill** | Mathematical problem solving & optimization |
| **Qwen3 8B GPT 5 Codex Distill** | Syntax accuracy & API implementation |
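A mergekit configuration for an equal-weight TIES merge like this one has roughly the following shape. The repo paths, `density`, and `weight` values below are illustrative assumptions, not the exact recipe used for this merge:

```yaml
# Illustrative mergekit TIES config -- paths and parameter values are
# assumptions, not the exact recipe behind uCoder-8b-base.
merge_method: ties
base_model: Qwen/Qwen3-8B
models:
  - model: path/to/qwen3-8b-gpt-5.2-high-reasoning-distill
    parameters: {density: 0.5, weight: 0.2}
  - model: path/to/qwen3-8b-claude-4.5-opus-distill
    parameters: {density: 0.5, weight: 0.2}
  - model: path/to/qwen3-8b-gemini-3-pro-preview-distill
    parameters: {density: 0.5, weight: 0.2}
  - model: path/to/qwen3-8b-deepseek-v3.2-speciale-distill
    parameters: {density: 0.5, weight: 0.2}
  - model: path/to/qwen3-8b-gpt-5-codex-distill
    parameters: {density: 0.5, weight: 0.2}
parameters:
  normalize: true   # matches the weight normalization described above
dtype: bfloat16
```

The equal `weight: 0.2` entries correspond to the equal-weight merge described above; a config in this form would be run with mergekit's `mergekit-yaml` command.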
## Limitations
* **Base Model Nature:** This is a base model (merge), not fully instruction-tuned for chat. While it can handle chat formats, it performs best when fine-tuned or given specific few-shot examples.
* **Coding Focus:** While capable of general reasoning, its domain expertise is heavily skewed towards programming and technical tasks.
## License
This model is released under the **Apache 2.0** license.