DeepSeek-R1-Distill-Merge-Qwen-Math-1.5Bb

This model is a high-performance merge designed to bridge mathematical logic and distilled reasoning. It was constructed with MergeKit using the DARE-TIES method, which randomly drops a fraction of each source model's delta weights and rescales the remainder (DARE), then resolves sign conflicts among the surviving deltas (TIES), so that specialized weights from both source models are preserved.

Merge Methodology

The merge integrates the logical foundations of Qwen2.5-Math with the distilled reasoning capabilities of DeepSeek-R1. This combination aims to improve accuracy on structured tasks such as USMLE-style Q&A and ICD-10 clinical coding. The key settings are listed below, followed by a sketch of the corresponding MergeKit configuration.

  • Merge Method: dare_ties
  • Base Model: Qwen/Qwen2.5-1.5B
  • Models Integrated:
    • Qwen/Qwen2.5-Math-1.5B-Instruct (Weight: 0.5)
    • deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B (Weight: 0.5)
  • Parameters:
    • Density: 0.53
    • Normalize: True
    • Int8 Mask: True
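
For reproducibility, the MergeKit configuration below is a minimal sketch reconstructed from the parameters above. The original config file was not published with this card, so the exact field placement (for example, per-model versus global density) and the bfloat16 dtype are assumptions.

```yaml
# Hypothetical reconstruction of the merge recipe; values are taken from the
# parameter list above, field layout assumed from standard MergeKit usage.
models:
  - model: Qwen/Qwen2.5-Math-1.5B-Instruct
    parameters:
      weight: 0.5
      density: 0.53
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
    parameters:
      weight: 0.5
      density: 0.53
merge_method: dare_ties
base_model: Qwen/Qwen2.5-1.5B
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```

With MergeKit installed, a configuration like this is executed with `mergekit-yaml config.yml ./merged-model`.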

Usage

This model is intended for research use in the medical domain. It performs best on tasks that require a Chain-of-Thought (CoT) explanation before the final medical answer.
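
Below is a minimal inference sketch using Hugging Face Transformers. The model ID comes from this repository; the prompt, decoding settings, and the assumption that the merge inherits Qwen's chat template are illustrative rather than prescribed by the card.

```python
# Minimal usage sketch; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "olusegunola/DeepSeek-R1-Distill-Merge-Qwen-Math-1.5Bb"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)

# Hypothetical medical prompt asking for step-by-step reasoning first.
messages = [
    {
        "role": "user",
        "content": "Think step by step, then give a final answer: "
                   "which ICD-10 code best matches unspecified acute appendicitis?",
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.6)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```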

Citation (Journal Submission)

If you use this model in research, please cite the DARE-TIES merge methodology (via MergeKit) and both source models, Qwen/Qwen2.5-Math-1.5B-Instruct and deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
