Commit 54cd57d (verified) by uaytug · parent 23dabf7 · Update README.md
Converted using the latest llama.cpp (CUDA-accelerated quantization).

### Available Files

**16-bit**
- `uCoder-8b-base-BF16.gguf` → **Highest-precision float (similar to the original, ~16 GB)**

**8-bit**
- `uCoder-8b-base-Q8_0.gguf` → **Near-lossless**

**6-bit**
- `uCoder-8b-base-Q6_K.gguf`

**5-bit**
- `uCoder-8b-base-Q5_K_S.gguf`
- `uCoder-8b-base-Q5_K_M.gguf` → **Great quality**

**4-bit** (most popular range)
- `uCoder-8b-base-Q4_K_M.gguf` → **Recommended balance**

**2-bit**
- `uCoder-8b-base-Q2_K.gguf`
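
As a rough intuition for why `Q8_0` can be near-lossless: block quantization stores each block of weights as one float scale plus small signed integers. The sketch below is an illustration only, not llama.cpp's actual implementation.

```python
# Toy sketch of Q8_0-style block quantization (not llama.cpp's actual code):
# each block of 32 weights becomes one float scale + 32 signed 8-bit ints.

def quantize_q8_0(weights, block_size=32):
    blocks = []
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        scale = max(abs(w) for w in block) / 127 or 1.0  # avoid zero scale
        blocks.append((scale, [round(w / scale) for w in block]))
    return blocks

def dequantize_q8_0(blocks):
    return [q * scale for scale, ints in blocks for q in ints]

weights = [0.05 * i - 0.8 for i in range(64)]          # toy weight tensor
restored = dequantize_q8_0(quantize_q8_0(weights))
max_err = max(abs(a - b) for a, b in zip(weights, restored))
# max_err stays far below the weight magnitudes: "near-lossless"
```

Roughly speaking, the lower-bit formats (Q4, Q2) use fewer bits per integer and share scales across larger groups, trading accuracy for file size.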
 
## Original Model Information

# uCoder-8b-base

![Model Architecture](https://img.shields.io/badge/Model-Qwen3--8B-blue) ![Task](https://img.shields.io/badge/Task-Coding-green) ![License](https://img.shields.io/badge/License-Apache_2.0-red) ![Method](https://img.shields.io/badge/Method-TIES_Merge-orange)

**uCoder-8b-base** is a coding-specialized 8B-parameter model created by TIES-merging five high-quality distilled models based on **Qwen3-8B**. The merge is designed to combine advanced reasoning capabilities with state-of-the-art coding performance, making it an ideal base for further instruction tuning or direct code generation.

## 🚀 Model Description

This model uses the **TIES (TrIm, Elect Sign & Merge)** method to combine the weights of multiple expert models without losing the specific competencies of each. By normalizing the weights and focusing on high-reasoning distillations from top-tier frontier models (GPT-5.x, Claude 4.5, etc.), uCoder-8b-base achieves a robust balance between logic and syntax accuracy.
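
The trim / elect-sign / merge steps can be sketched as follows. This is a simplified toy on flat parameter lists, not the actual recipe used for this model (which operates on full checkpoints with its own density and weighting settings):

```python
# Toy sketch of TIES merging (TrIm, Elect Sign & Merge) on flat lists
# of task-vector values (fine-tuned weights minus base weights).

def ties_merge(task_vectors, keep_frac=0.5):
    # 1) Trim: keep only the largest-magnitude entries of each task vector.
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(len(tv) * keep_frac))
        threshold = sorted((abs(v) for v in tv), reverse=True)[k - 1]
        trimmed.append([v if abs(v) >= threshold else 0.0 for v in tv])

    merged = []
    for i in range(len(task_vectors[0])):
        vals = [tv[i] for tv in trimmed]
        # 2) Elect sign: the sign with the larger total magnitude wins.
        pos = sum(v for v in vals if v > 0)
        neg = -sum(v for v in vals if v < 0)
        sign = 1.0 if pos >= neg else -1.0
        # 3) Merge: average only the values agreeing with the elected sign.
        agreeing = [v for v in vals if v * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged

a = [0.9, -0.2, 0.0,  0.4]
b = [0.7,  0.1, 0.3, -0.5]
merged = ties_merge([a, b], keep_frac=0.5)
```

The sign election is what resolves interference: where two donors disagree on a parameter's direction, only the dominant direction contributes.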

### Key Features
* **High Reasoning:** Inherits logic handling from the Claude- and GPT-based distills.
* **Polyglot Coding:** Proficient in Python, JavaScript, C++, Rust, and other major languages.
* **Base Model:** Built on the powerful Qwen3-8B architecture.
* **Efficient:** The 8B size allows local inference on consumer hardware (12 GB+ VRAM recommended for FP16, less for quantized variants).
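
As a back-of-the-envelope check on these memory figures (weights only; KV cache and runtime overhead come on top, and the ~4.5 bits-per-weight figure for a 4-bit K-quant is an approximation):

```python
# Rough VRAM footprint of the weights alone, ignoring KV cache and overhead.

def weight_gib(n_params, bits_per_weight):
    return n_params * bits_per_weight / 8 / 2**30

n = 8e9                    # ~8B parameters
fp16 = weight_gib(n, 16)   # ~14.9 GiB, matching the ~16 GB BF16 file above
q4 = weight_gib(n, 4.5)    # ~4.2 GiB at an assumed ~4.5 bits per weight
```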

## 🧩 Merged Models

The following models were merged with equal weights to create uCoder-8b-base:

| Model Name | Primary Contribution |
| :--- | :--- |
| **Qwen3 8B GPT 5.2 High Reasoning Distill** | Advanced logic & multi-step reasoning |
| **Qwen3 8B Claude 4.5 Opus High Reasoning Distill** | Safe code generation & detailed explanations |
| **Qwen3 8B Gemini 3 Pro Preview Distill** | Long-context handling & creative solutions |
| **Qwen3 8B DeepSeek v3.2 Speciale Distill** | Mathematical problem solving & optimization |
| **Qwen3 8B GPT 5 Codex Distill** | Syntax accuracy & API implementation |

## Limitations

* **Base Model Nature:** This is a base model (a merge), not fully instruction-tuned for chat. While it can handle chat formats, it performs best when fine-tuned or given specific few-shot examples.
* **Coding Focus:** While capable of general reasoning, its domain expertise is heavily skewed towards programming and technical tasks.
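
Since a base model continues text rather than follows instructions, few-shot prompting is the usual way to steer it. The snippet below just assembles such a prompt string; the `### Task` / `### Solution` format is an arbitrary illustration, not a prescribed template:

```python
# Building a simple few-shot completion prompt for a base model.
# The example pairs and the header format are purely illustrative.

examples = [
    ("Reverse a string in Python.", "def reverse(s):\n    return s[::-1]"),
    ("Sum a list of ints in Python.", "def total(xs):\n    return sum(xs)"),
]

def few_shot_prompt(task, shots):
    parts = []
    for question, answer in shots:
        parts.append(f"### Task\n{question}\n### Solution\n{answer}\n")
    parts.append(f"### Task\n{task}\n### Solution\n")
    return "\n".join(parts)

prompt = few_shot_prompt("Check if a number is prime in Python.", examples)
# The model is then asked to continue `prompt` from the final "### Solution".
```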

## License

This model is released under the **Apache 2.0** license.