---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-1.5B
- Qwen/Qwen2.5-Math-1.5B-Instruct
- DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
tags:
- merge
- mergekit
- dare-ties
- medical-reasoning
- clinical-coding
---
# DeepSeek-R1-Distill-Merge-Qwen-Math-1.5B
This model is a merge designed to bridge **Mathematical Logic** and **Reasoning**. It was constructed with **MergeKit** using the **DARE-TIES** method to preserve specialized weights from both source models.
## Merge Methodology
The merge integrates the logical foundations of **Qwen2.5-Math** with the distilled reasoning capabilities of **DeepSeek-R1**. This combination aims to improve accuracy in structured tasks such as **USMLE-style Q&A** and **ICD-10 clinical coding**.
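Conceptually, `dare_ties` sparsifies each model's task vector (its delta from the base model), rescales the surviving entries, and resolves sign disagreements between models before adding the result back to the base. The toy sketch below illustrates that idea on flat parameter lists; it is an illustration of the method only, not mergekit's actual implementation.

```python
import random

def dare_ties_merge(base, deltas, weights, density=0.53, seed=0):
    """Toy DARE-TIES over flat parameter lists (illustration, not mergekit's code).

    DARE: drop each delta entry with probability (1 - density) and rescale
    survivors by 1/density. TIES: per position, elect a sign and keep only
    deltas that agree, then combine them with normalized weights.
    """
    rng = random.Random(seed)
    # DARE drop-and-rescale, applied independently to each model's deltas.
    pruned = [[x / density if rng.random() < density else 0.0 for x in d]
              for d in deltas]
    merged = list(base)
    for i in range(len(base)):
        # Sign election: dominant sign of the weighted delta sum at position i.
        total = sum(w * d[i] for w, d in zip(weights, pruned))
        sign = 1.0 if total >= 0 else -1.0
        # Keep only deltas agreeing with the elected sign.
        kept = [(w, d[i]) for w, d in zip(weights, pruned) if d[i] * sign > 0]
        if kept:
            # normalize=True behaviour: divide by the total kept weight.
            merged[i] += sum(w * x for w, x in kept) / sum(w for w, _ in kept)
    return merged
```

With `density=1.0` nothing is dropped, so the function reduces to plain sign-elected, weight-normalized averaging of the deltas.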
- **Merge Method:** `dare_ties`
- **Base Model:** `Qwen/Qwen2.5-1.5B`
- **Models Integrated:**
  - `Qwen/Qwen2.5-Math-1.5B-Instruct` (Weight: 0.5)
  - `DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B` (Weight: 0.5)
- **Parameters:**
  - Density: 0.53
  - Normalize: True
  - Int8 Mask: True
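A mergekit configuration matching the parameters above would look roughly like the following. The exact file used for this merge is not included here, so treat this as a reconstruction (the `dtype` line in particular is an assumption):

```yaml
models:
  - model: Qwen/Qwen2.5-Math-1.5B-Instruct
    parameters:
      weight: 0.5
      density: 0.53
  - model: DeepSeek-AI/DeepSeek-R1-Distill-Qwen-1.5B
    parameters:
      weight: 0.5
      density: 0.53
merge_method: dare_ties
base_model: Qwen/Qwen2.5-1.5B
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16  # assumed, not stated in this card
```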
## Usage
This model is intended for research purposes in the medical domain. It is designed for tasks that call for a **Chain-of-Thought (CoT)** explanation before a final medical answer.
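One way to realize that CoT-then-answer pattern is to prompt for step-by-step reasoning with a marked final answer, then parse the marker out of the completion. The helpers below are a minimal sketch (the function names and the `Answer:` marker are illustrative choices, not part of any released API); the resulting prompt can be fed through any standard `transformers` `generate` call.

```python
def build_cot_prompt(question: str) -> str:
    """Ask for step-by-step reasoning before a clearly marked final answer."""
    return (
        f"{question}\n"
        "Reason step by step, then give the final answer on its own line, "
        "prefixed with 'Answer:'."
    )

def extract_final_answer(completion: str) -> str:
    """Return the text after the last 'Answer:' marker, or the whole completion."""
    marker = "Answer:"
    if marker in completion:
        return completion.rsplit(marker, 1)[-1].strip()
    return completion.strip()

# Example of parsing a hypothetical model completion:
completion = "The presentation fits uncomplicated type 2 diabetes.\nAnswer: E11.9"
print(extract_final_answer(completion))  # E11.9
```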
## Citation (Journal Submission)
If you use this model in research, please cite the merge methodology and the source models accordingly.