---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-1.5B
- Qwen/Qwen2.5-Math-1.5B-Instruct
- DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
tags:
- merge
- mergekit
- dare-ties
- medical-reasoning
- clinical-coding
---
# DeepSeek-R1-Distill-Merge-Qwen-Math-1.5B
This model is a merge designed to bridge **Mathematical Logic** and **Reasoning**. It was constructed with **MergeKit** using the **DARE-TIES** method to preserve specialized weights from both source models.
## Merge Methodology
The merge integrates the logical foundations of **Qwen2.5-Math** with the distilled reasoning capabilities of **DeepSeek-R1**. This combination aims to improve accuracy in structured tasks such as **USMLE-style Q&A** and **ICD-10 clinical coding**.
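Conceptually, `dare_ties` sparsifies each model's task vector (its delta from the base model), rescales the surviving entries, and resolves sign disagreements between models before adding the result back to the base. The toy sketch below illustrates that idea on flat parameter lists; it is an illustration of the method only, not mergekit's actual implementation.

```python
import random

def dare_ties_merge(base, deltas, weights, density=0.53, seed=0):
    """Toy DARE-TIES over flat parameter lists (illustration, not mergekit's code).

    DARE: drop each delta entry with probability (1 - density) and rescale
    survivors by 1/density. TIES: per position, elect a sign and keep only
    deltas that agree, then combine them with normalized weights.
    """
    rng = random.Random(seed)
    # DARE drop-and-rescale, applied independently to each model's deltas.
    pruned = [[x / density if rng.random() < density else 0.0 for x in d]
              for d in deltas]
    merged = list(base)
    for i in range(len(base)):
        # Sign election: dominant sign of the weighted delta sum at position i.
        total = sum(w * d[i] for w, d in zip(weights, pruned))
        sign = 1.0 if total >= 0 else -1.0
        # Keep only deltas agreeing with the elected sign.
        kept = [(w, d[i]) for w, d in zip(weights, pruned) if d[i] * sign > 0]
        if kept:
            # normalize=True behaviour: divide by the total kept weight.
            merged[i] += sum(w * x for w, x in kept) / sum(w for w, _ in kept)
    return merged
```

With `density=1.0` nothing is dropped, so the function reduces to plain sign-elected, weight-normalized averaging of the deltas.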
- **Merge Method:** `dare_ties`
- **Base Model:** `Qwen/Qwen2.5-1.5B`
- **Models Integrated:**
  - `Qwen/Qwen2.5-Math-1.5B-Instruct` (Weight: 0.5)
  - `DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B` (Weight: 0.5)
- **Parameters:**
  - Density: 0.53
  - Normalize: True
  - Int8 Mask: True
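A mergekit configuration matching the parameters above would look roughly like the following. The exact file used for this merge is not included here, so treat this as a reconstruction (the `dtype` line in particular is an assumption):

```yaml
models:
  - model: Qwen/Qwen2.5-Math-1.5B-Instruct
    parameters:
      weight: 0.5
      density: 0.53
  - model: DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
    parameters:
      weight: 0.5
      density: 0.53
merge_method: dare_ties
base_model: Qwen/Qwen2.5-1.5B
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16  # assumed, not stated in this card
```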
## Usage
This model is intended for research purposes in the medical domain. It is designed for tasks that call for a **Chain-of-Thought (CoT)** explanation before a final medical answer.
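One way to realize that CoT-then-answer pattern is to prompt for step-by-step reasoning with a marked final answer, then parse the marker out of the completion. The helpers below are a minimal sketch (the function names and the `Answer:` marker are illustrative choices, not part of any released API); the resulting prompt can be fed through any standard `transformers` `generate` call.

```python
def build_cot_prompt(question: str) -> str:
    """Ask for step-by-step reasoning before a clearly marked final answer."""
    return (
        f"{question}\n"
        "Reason step by step, then give the final answer on its own line, "
        "prefixed with 'Answer:'."
    )

def extract_final_answer(completion: str) -> str:
    """Return the text after the last 'Answer:' marker, or the whole completion."""
    marker = "Answer:"
    if marker in completion:
        return completion.rsplit(marker, 1)[-1].strip()
    return completion.strip()

# Example of parsing a hypothetical model completion:
completion = "The presentation fits uncomplicated type 2 diabetes.\nAnswer: E11.9"
print(extract_final_answer(completion))  # E11.9
```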
## Citation (Journal Submission)
If you use this model in research, please cite the merge methodology and the source models accordingly.