---
license: apache-2.0
base_model:
  - Qwen/Qwen2.5-1.5B
  - Qwen/Qwen2.5-Math-1.5B-Instruct
  - DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
tags:
  - merge
  - mergekit
  - dare-ties
  - medical-reasoning
  - clinical-coding
---

# DeepSeek-R1-Distill-Merge-Qwen-Math-1.5B

This model is a merge designed to bridge mathematical logic and medical reasoning. It was constructed with MergeKit using the DARE-TIES method, which preserves specialized weights from both source models.

## Merge Methodology

The merge integrates the logical foundations of Qwen2.5-Math with the distilled reasoning capabilities of DeepSeek-R1. This combination aims to improve accuracy in structured tasks such as USMLE-style Q&A and ICD-10 clinical coding.

- **Merge Method:** dare_ties
- **Base Model:** Qwen/Qwen2.5-1.5B
- **Models Integrated:**
  - Qwen/Qwen2.5-Math-1.5B-Instruct (weight: 0.5)
  - DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B (weight: 0.5)
- **Parameters:**
  - Density: 0.53
  - Normalize: true
  - Int8 Mask: true
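The settings above correspond to a mergekit YAML configuration along these lines. This is a reconstruction from the listed parameters, not the exact file used to produce the checkpoint; per-model density placement is assumed.

```yaml
# Sketch of a mergekit config matching the parameters above (assumed layout).
merge_method: dare_ties
base_model: Qwen/Qwen2.5-1.5B
models:
  - model: Qwen/Qwen2.5-Math-1.5B-Instruct
    parameters:
      weight: 0.5
      density: 0.53
  - model: DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
    parameters:
      weight: 0.5
      density: 0.53
parameters:
  normalize: true
  int8_mask: true
```

Saved as `merge.yaml`, this would be run with `mergekit-yaml merge.yaml ./output-dir`.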

## Usage

This model is intended for research use in the medical domain. It is suited to tasks that benefit from a Chain-of-Thought (CoT) explanation before the final medical answer.
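A minimal sketch of loading the merged checkpoint with `transformers` and prompting it for step-by-step reasoning. The repository id and the exact prompt wording are assumptions for illustration, not a fixed template shipped with the model.

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question so the model writes its reasoning before the answer."""
    return (
        "Answer the following question. Think step by step, then give the "
        f"final answer on its own line.\n\nQuestion: {question}\nReasoning:"
    )


def generate_answer(question: str,
                    model_id: str = "olusegunola/DeepSeek-R1-Distill-Merge-Qwen-Math-1.5B",
                    max_new_tokens: int = 512) -> str:
    """Generate a CoT-style answer from the merged model (assumed repo id)."""
    # transformers is imported lazily so the prompt helper above can be used
    # without downloading the checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    inputs = tokenizer(build_cot_prompt(question), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                            skip_special_tokens=True)
```

Because the prompt asks for reasoning first, downstream parsing should read the last line of the output as the final answer.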

## Citation (Journal Submission)

If you use this model in research, please cite the merge methodology and both source models accordingly.