---
license: apache-2.0
base_model:
  - Qwen/Qwen2.5-1.5B
  - Qwen/Qwen2.5-Math-1.5B-Instruct
  - DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
tags:
  - merge
  - mergekit
  - dare-ties
  - medical-reasoning
  - clinical-coding
---

# DeepSeek-R1-Distill-Merge-Qwen-Math-1.5B

This model is a merge designed to bridge mathematical logic and medical reasoning. It was constructed with MergeKit using the DARE-TIES method, which preserves specialized weights from both source models.

## Merge Methodology

The merge integrates the logical foundations of Qwen2.5-Math with the distilled reasoning capabilities of DeepSeek-R1. This combination aims to improve accuracy in structured tasks such as USMLE-style Q&A and ICD-10 clinical coding.

- **Merge Method:** dare_ties
- **Base Model:** Qwen/Qwen2.5-1.5B
- **Models Integrated:**
  - Qwen/Qwen2.5-Math-1.5B-Instruct (weight: 0.5)
  - DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B (weight: 0.5)
- **Parameters:**
  - Density: 0.53
  - Normalize: true
  - Int8 Mask: true
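The settings above correspond to a mergekit YAML configuration along these lines. This is a reconstruction from the listed parameters, not the exact file used to produce the checkpoint; per-model density placement is assumed.

```yaml
# Sketch of a mergekit config matching the parameters above (assumed layout).
merge_method: dare_ties
base_model: Qwen/Qwen2.5-1.5B
models:
  - model: Qwen/Qwen2.5-Math-1.5B-Instruct
    parameters:
      weight: 0.5
      density: 0.53
  - model: DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
    parameters:
      weight: 0.5
      density: 0.53
parameters:
  normalize: true
  int8_mask: true
```

Saved as `merge.yaml`, this would be run with `mergekit-yaml merge.yaml ./output-dir`.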

## Usage

This model is intended for research use in the medical domain. It is suited to tasks that benefit from a Chain-of-Thought (CoT) explanation before the final medical answer.
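A minimal sketch of loading the merged checkpoint with `transformers` and prompting it for step-by-step reasoning. The repository id and the exact prompt wording are assumptions for illustration, not a fixed template shipped with the model.

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question so the model writes its reasoning before the answer."""
    return (
        "Answer the following question. Think step by step, then give the "
        f"final answer on its own line.\n\nQuestion: {question}\nReasoning:"
    )


def generate_answer(question: str,
                    model_id: str = "olusegunola/DeepSeek-R1-Distill-Merge-Qwen-Math-1.5B",
                    max_new_tokens: int = 512) -> str:
    """Generate a CoT-style answer from the merged model (assumed repo id)."""
    # transformers is imported lazily so the prompt helper above can be used
    # without downloading the checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    inputs = tokenizer(build_cot_prompt(question), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                            skip_special_tokens=True)
```

Because the prompt asks for reasoning first, downstream parsing should read the last line of the output as the final answer.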

## Citation (Journal Submission)

If you use this model in research, please cite the merge methodology and both source models accordingly.