DeepSeek-R1-Distill-Merge-Qwen-Math-1.5Bb

This model is a high-performance merge designed to bridge mathematical logic and distilled reasoning. It was constructed with MergeKit using the DARE-TIES method, which randomly drops a fraction of each source model's delta weights and rescales the remainder (DARE), then resolves sign conflicts among the surviving deltas (TIES), so that specialized weights from both source models are preserved.

Merge Methodology

The merge integrates the logical foundations of Qwen2.5-Math with the distilled reasoning capabilities of DeepSeek-R1. This combination aims to improve accuracy on structured tasks such as USMLE-style Q&A and ICD-10 clinical coding. The key settings are listed below, followed by a sketch of the corresponding MergeKit configuration.

  • Merge Method: dare_ties
  • Base Model: Qwen/Qwen2.5-1.5B
  • Models Integrated:
    • Qwen/Qwen2.5-Math-1.5B-Instruct (Weight: 0.5)
    • deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B (Weight: 0.5)
  • Parameters:
    • Density: 0.53
    • Normalize: True
    • Int8 Mask: True
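
For reproducibility, the MergeKit configuration below is a minimal sketch reconstructed from the parameters above. The original config file was not published with this card, so the exact field placement (for example, per-model versus global density) and the bfloat16 dtype are assumptions.

```yaml
# Hypothetical reconstruction of the merge recipe; values are taken from the
# parameter list above, field layout assumed from standard MergeKit usage.
models:
  - model: Qwen/Qwen2.5-Math-1.5B-Instruct
    parameters:
      weight: 0.5
      density: 0.53
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
    parameters:
      weight: 0.5
      density: 0.53
merge_method: dare_ties
base_model: Qwen/Qwen2.5-1.5B
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```

With MergeKit installed, a configuration like this is executed with `mergekit-yaml config.yml ./merged-model`.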

Usage

This model is intended for research use in the medical domain. It performs best on tasks that require a Chain-of-Thought (CoT) explanation before the final medical answer.
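
Below is a minimal inference sketch using Hugging Face Transformers. The model ID comes from this repository; the prompt, decoding settings, and the assumption that the merge inherits Qwen's chat template are illustrative rather than prescribed by the card.

```python
# Minimal usage sketch; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "olusegunola/DeepSeek-R1-Distill-Merge-Qwen-Math-1.5Bb"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)

# Hypothetical medical prompt asking for step-by-step reasoning first.
messages = [
    {
        "role": "user",
        "content": "Think step by step, then give a final answer: "
                   "which ICD-10 code best matches unspecified acute appendicitis?",
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.6)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```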

Citation (Journal Submission)

If you use this model in research, please cite the DARE-TIES merge methodology (via MergeKit) and both source models, Qwen/Qwen2.5-Math-1.5B-Instruct and deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
