Add files using upload-large-folder tool

Files changed (13)

.ipynb_checkpoints/README-checkpoint.md +79 -0
README.md +79 -0
added_tokens.json +28 -0
chat_template.jinja +61 -0
config.json +94 -0
merges.txt +0 -0
model.safetensors.index.json +0 -0
model17_0.safetensors +3 -0
model18_0.safetensors +3 -0
model19_0.safetensors +3 -0
model1_0.safetensors +3 -0
model7_0.safetensors +3 -0
special_tokens_map.json +31 -0

README.md ADDED

	@@ -0,0 +1,79 @@

+---
+license: apache-2.0
+tags:
+  - 中医大模型
+#model-type:
+## e.g. gpt, phi, llama, chatglm, baichuan
+#- gpt
+#domain:
+## e.g. nlp, cv, audio, multi-modal
+#- nlp
+#language:
+## list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
+#- cn
+#metrics:
+## e.g. CIDEr, BLEU, ROUGE
+#- CIDEr
+#tags:
+## any custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
+#- pretrained
+#tools:
+## e.g. vllm, fastchat, llamacpp, AdaSeq
+#- vllm
+  - 心语心言
+  - 医疗
+  - 医疗大模型
+language:
+  - zh
+frameworks: PyTorch
+tasks:
+  - text-generation
+base_model:
+  - Qwen/Qwen3-Next-80B-A3B-Instruct
+base_model_relation: finetune
+metrics:
+  - accuracy
+---
+# DeepPulse-80B TCM Large Model Series
+**DeepPulse (深度把脉)** is the core release of 心语心言's open-source Traditional Chinese Medicine (TCM) large model series. The models use Qwen3-Next-80B as the base model and are fine-tuned on a self-built, high-quality TCM clinical dataset. This release includes two versions:
+* **DeepPulse-80B-Thinking-V0.1**: Targets complex clinical reasoning and assisted diagnosis. It ranks first by total score (71.3) in the TCM-5CEval comparison below.
+* **DeepPulse-80B-Instruct-V0.1**: Tuned for TCM instruction following and suited to a wide range of TCM Q&A and interactive scenarios. It ranks sixth by total score (66.2) in the same comparison.
+# Public TCM Benchmark Metrics Comparison (MedBench - TCM-5CEval)
+TCM-5CEval is a public MedBench evaluation benchmark for TCM large models. Its five subtasks together assess a model's TCM capabilities:
+* **TCM-Exam (中医考试)**: Evaluates the mastery and application of fundamental TCM theories (Yin-Yang, Zang-Fu organs, etc.) and diagnostics knowledge.
+* **TCM-LitQA (典籍问答)**: Tests understanding of, and reasoning over, classic TCM texts such as the "Huangdi Neijing" and the "Shanghan Lun".
+* **TCM-MRCD (临床诊疗)**: Simulates real clinical scenarios, evaluating the model's ability to analyze medical cases, perform pattern differentiation, and make prescription decisions.
+* **TCM-CMM (中药方剂)**: Measures the model's knowledge of Chinese materia medica properties, effects, compatibility contraindications, and formula applications.
+* **TCM-ClinNPT (非药物疗法)**: Assesses ability in acupoint selection for acupuncture, Tuina massage techniques, and pattern-based treatment for specific clinical scenarios.
+| No. | Model Name | Organization/Team Name | Release Date | Type | Parameters | Total Score | TCM-Exam | TCM-LitQA | TCM-MRCD | TCM-CMM | TCM-ClinNPT |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| 1 | <font color="red">DeepPulse-80B-Thinking-V0.1</font> | <font color="red">心语心言</font> | <font color="red">2025/12/23</font> | <font color="red">Open source</font> | <font color="red">80B</font> | <font color="red">71.3</font> | <font color="red">83.0</font> | <font color="red">45.5</font> | <font color="red">75.4</font> | <font color="red">84.9</font> | <font color="red">67.6</font> |
+| 2 | HKR_TCM_HW_v1 | 港仔机器人主动健管团队 | 2025/12/12 | Closed source | 671B | 70.8 | 85.4 | 44.2 | 73.1 | 83.8 | 67.5 |
+| 3 | Gemini-2.5-Pro-nothinking | Google | 2025/03/25 | Closed source | N/A | 69.2 | 77.9 | 62.0 | 72.4 | 72.6 | 61.2 |
+| 4 | DeepSeek-V3.2 | DeepSeek | 2025/12/01 | Open source | 671B | 66.8 | 74.5 | 44.4 | 66.8 | 80.0 | 68.3 |
+| 5 | Grok-4 | xAI | 2025/07/09 | Closed source | N/A | 66.6 | 73.0 | 59.3 | 68.4 | 68.0 | 64.2 |
+| 6 | <font color="red">DeepPulse-80B-Instruct-V0.1</font> | <font color="red">心语心言</font> | <font color="red">2025/12/23</font> | <font color="red">Open source</font> | <font color="red">80B</font> | <font color="red">66.2</font> | <font color="red">74.4</font> | <font color="red">40.7</font> | <font color="red">70.6</font> | <font color="red">79.7</font> | <font color="red">65.6</font> |
+| 7 | Qwen3-235B-A22B-Thinking-2507 | Alibaba | 2025/08/17 | Open source | 235B | 64.8 | 75.5 | 40.3 | 68.5 | 78.2 | 61.5 |
+| 8 | Claude-Sonnet-4.5 | Anthropic | 2025/09/29 | Closed source | N/A | 64.8 | 69.8 | 59.3 | 67.2 | 71.7 | 56.0 |
+| 9 | GPT-5 | OpenAI | 2025/08/07 | Closed source | N/A | 63.6 | 75.0 | 51.9 | 64.1 | 66.6 | 60.6 |
+| 10 | Qwen3-Next-80B-A3B-Thinking | Alibaba | 2025/09/15 | Open source | 80B | 63.5 | 76.0 | 38.2 | 66.2 | 77.9 | 59.4 |
+| 11 | Llama-4-maverick | Meta | 2025/04/06 | Open source | 400B | 57.2 | 72.1 | 51.3 | 63.8 | 54.4 | 44.3 |
+| 12 | GPT-4o | OpenAI | 2025/05/13 | Closed source | 200B | 55.9 | 66.5 | 46.9 | 60.9 | 57.1 | 47.9 |
+> Note: "N/A" in the Parameters column indicates that the model's parameter count has not been publicly disclosed.
+>
+> Scores for `DeepSeek-V3.2`, `Qwen3-235B-A22B-Thinking-2507`, and `Qwen3-Next-80B-A3B-Thinking` come from our own test deployments; scores for all other models are taken from the public leaderboard.
+>
+> TCM-5CEval: https://medbench.opencompass.org.cn/track-detail/tcmeval
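The Total Score column in the table above appears to be the unweighted mean of the five subtask scores, rounded to one decimal place. This is an observation from the published numbers, not a documented scoring rule. The sketch below checks it for three rows; `rows` is a hand-copied excerpt of the table and `matches_mean` is a hypothetical helper.

```python
from statistics import mean

# Hand-copied excerpt of the table:
# name -> (total, [TCM-Exam, TCM-LitQA, TCM-MRCD, TCM-CMM, TCM-ClinNPT]).
rows = {
    "DeepPulse-80B-Thinking-V0.1": (71.3, [83.0, 45.5, 75.4, 84.9, 67.6]),
    "DeepPulse-80B-Instruct-V0.1": (66.2, [74.4, 40.7, 70.6, 79.7, 65.6]),
    "Qwen3-Next-80B-A3B-Thinking": (63.5, [76.0, 38.2, 66.2, 77.9, 59.4]),
}


def matches_mean(total, subtasks, tolerance=0.05):
    """Return True if `total` equals the subtask mean up to one-decimal rounding."""
    return abs(mean(subtasks) - total) <= tolerance + 1e-9


for name, (total, subtasks) in rows.items():
    ok = matches_mean(total, subtasks)
    print(f"{name}: mean={mean(subtasks):.2f} reported={total} ok={ok}")
```

The same check can be extended to the remaining rows by adding them to `rows`.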

added_tokens.json ADDED

	@@ -0,0 +1,28 @@

+{
+  "</think>": 151668,
+  "</tool_call>": 151658,
+  "</tool_response>": 151666,
+  "<think>": 151667,
+  "<tool_call>": 151657,
+  "<tool_response>": 151665,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}
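The 26 entries above occupy a single contiguous ID range. A small sketch, with the mapping copied from the file and reordered by ID, inverts it into an ID-to-token lookup and confirms there are no gaps. The variable names are illustrative only.

```python
# Mapping copied from added_tokens.json in this commit, ordered by ID.
added_tokens = {
    "<|endoftext|>": 151643, "<|im_start|>": 151644, "<|im_end|>": 151645,
    "<|object_ref_start|>": 151646, "<|object_ref_end|>": 151647,
    "<|box_start|>": 151648, "<|box_end|>": 151649,
    "<|quad_start|>": 151650, "<|quad_end|>": 151651,
    "<|vision_start|>": 151652, "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654, "<|image_pad|>": 151655, "<|video_pad|>": 151656,
    "<tool_call>": 151657, "</tool_call>": 151658,
    "<|fim_prefix|>": 151659, "<|fim_middle|>": 151660,
    "<|fim_suffix|>": 151661, "<|fim_pad|>": 151662,
    "<|repo_name|>": 151663, "<|file_sep|>": 151664,
    "<tool_response>": 151665, "</tool_response>": 151666,
    "<think>": 151667, "</think>": 151668,
}

# Invert to look tokens up by ID, then check that the IDs form one unbroken range.
id_to_token = {token_id: token for token, token_id in added_tokens.items()}
ids = sorted(id_to_token)
is_contiguous = ids == list(range(ids[0], ids[-1] + 1))

print(len(ids), ids[0], ids[-1], is_contiguous)
```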

chat_template.jinja ADDED

	@@ -0,0 +1,61 @@

+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0].role == 'system' %}
+        {{- messages[0].content + '\n\n' }}
+    {%- endif %}
+    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- for message in messages %}
+    {%- if message.content is string %}
+        {%- set content = message.content %}
+    {%- else %}
+        {%- set content = '' %}
+    {%- endif %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- if message.tool_calls %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if (loop.first and content) or (not loop.first) %}
+                    {{- '\n' }}
+                {%- endif %}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<tool_call>\n{"name": "' }}
+                {{- tool_call.name }}
+                {{- '", "arguments": ' }}
+                {%- if tool_call.arguments is string %}
+                    {{- tool_call.arguments }}
+                {%- else %}
+                    {{- tool_call.arguments | tojson }}
+                {%- endif %}
+                {{- '}\n</tool_call>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+{%- endif %}
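For text-only conversations without tools, the template above reduces to plain ChatML framing: each message becomes `<|im_start|>{role}\n{content}<|im_end|>\n`, and `add_generation_prompt` appends an open assistant turn. The function below is a hypothetical pure-Python helper, not part of this repository, that mirrors only that path. It assumes every `content` is a string, ignores tool definitions, `tool_calls`, and `tool` messages, and raises on other roles, whereas the template silently skips them.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate the template's output for text-only chats without tools."""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role not in ("system", "user", "assistant"):
            # The real template handles "tool" messages separately; this sketch does not.
            raise ValueError(f"unsupported role in this sketch: {role!r}")
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    if add_generation_prompt:
        # Leave an assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example = [
    {"role": "system", "content": "You are a TCM assistant."},
    {"role": "user", "content": "What does pattern differentiation mean?"},
]
print(render_chatml(example))
```

In practice the tokenizer would normally render the real template, for example via `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` in Transformers; the sketch is only meant to make the framing visible.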

config.json ADDED

	@@ -0,0 +1,94 @@

+{
+  "architectures": [
+    "Qwen3NextForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 151643,
+  "decoder_sparse_step": 1,
+  "dtype": "bfloat16",
+  "eos_token_id": 151645,
+  "full_attention_interval": 4,
+  "head_dim": 256,
+  "hidden_act": "silu",
+  "hidden_size": 2048,
+  "initializer_range": 0.02,
+  "intermediate_size": 5120,
+  "layer_types": [
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention",
+    "linear_attention",
+    "linear_attention",
+    "linear_attention",
+    "full_attention"
+  ],
+  "linear_conv_kernel_dim": 4,
+  "linear_key_head_dim": 128,
+  "linear_num_key_heads": 16,
+  "linear_num_value_heads": 32,
+  "linear_value_head_dim": 128,
+  "max_position_embeddings": 262144,
+  "mlp_only_layers": [],
+  "model_type": "qwen3_next",
+  "moe_intermediate_size": 512,
+  "norm_topk_prob": true,
+  "num_attention_heads": 16,
+  "num_experts": 512,
+  "num_experts_per_tok": 10,
+  "num_hidden_layers": 48,
+  "num_key_value_heads": 2,
+  "output_router_logits": false,
+  "partial_rotary_factor": 0.25,
+  "rms_norm_eps": 1e-06,
+  "rope_scaling": null,
+  "rope_theta": 10000000,
+  "router_aux_loss_coef": 0.001,
+  "shared_expert_intermediate_size": 512,
+  "tie_word_embeddings": false,
+  "transformers_version": "4.57.1",
+  "use_cache": true,
+  "use_sliding_window": false,
+  "vocab_size": 151936
+}
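The 48-entry `layer_types` list follows a regular pattern: every fourth layer uses full attention, which matches `full_attention_interval: 4`. The sketch below regenerates the list from the scalar settings and derives two related quantities. The rule `(i + 1) % full_attention_interval == 0` is inferred from the listed values rather than quoted from the model code, and the rotary dimension assumes `partial_rotary_factor` applies to `head_dim`.

```python
# Values copied from config.json above.
num_hidden_layers = 48
full_attention_interval = 4
head_dim = 256
partial_rotary_factor = 0.25
num_attention_heads = 16
num_key_value_heads = 2

# Inferred rule: layers 4, 8, 12, ... (1-indexed) use full attention.
layer_types = [
    "full_attention" if (i + 1) % full_attention_interval == 0 else "linear_attention"
    for i in range(num_hidden_layers)
]

# Assumption: the rotary fraction is applied to each head's dimension.
rotary_dim = int(head_dim * partial_rotary_factor)

# Grouped-query attention: how many query heads share one key/value head.
queries_per_kv_head = num_attention_heads // num_key_value_heads

print(layer_types.count("full_attention"), layer_types.count("linear_attention"))
print(rotary_dim, queries_per_kv_head)
```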

merges.txt ADDED

The diff for this file is too large to render. See raw diff

model.safetensors.index.json ADDED

The diff for this file is too large to render. See raw diff

model17_0.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0b224fe5b651e0b0da8b1208776bb86d18690f4ff6a2d1d85a671a89beaca3f6
+size 4832125960

model18_0.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8fe7a9d44b35f45564a134b8888a3170ad830cddc08e9c1e7fced650f5e1ee76
+size 4832125960

model19_0.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2d4facec77c41938e366e7ff290cfe9934362d3dfb5688b5ec167351988b45a7
+size 4832125960

model1_0.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:13dedd057368d95a8a03539b3bbd14ad013aa91309e468577cbc572227a9a864
+size 4832124808

model7_0.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9866a670e13ef5324617f069ed411b666488de7cd6b6e8287b6c490ed9a0cfff
+size 4832124808
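Each `.safetensors` entry above is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. The sketch below parses a pointer and shows how a downloaded shard could be checked against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local path in the final comment is an assumption.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, SHA-256 digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    if algorithm != "sha256":
        raise ValueError(f"unexpected hash algorithm: {algorithm}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Stream the file at `path` and compare its size and SHA-256 with the pointer."""
    sha256 = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and sha256.hexdigest() == pointer["oid"]


# Pointer for model7_0.safetensors, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:9866a670e13ef5324617f069ed411b666488de7cd6b6e8287b6c490ed9a0cfff\n"
    "size 4832124808\n"
)
print(f"{pointer['size'] / 1e9:.2f} GB")

# After downloading, one might call: matches_pointer("model7_0.safetensors", pointer)
```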

special_tokens_map.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
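The special-token settings in this commit can be cross-checked against `added_tokens.json` and `config.json`: the EOS token `<|im_end|>` maps to ID 151645, which equals `eos_token_id`, and the pad token `<|endoftext|>` maps to 151643, the same ID that `config.json` lists as `bos_token_id`. The sketch below hand-copies the relevant values; the dictionaries are simplified excerpts (token entries reduced to their `content` strings), not the full files.

```python
# Simplified excerpts of the three files in this commit.
special_tokens_map = {"eos_token": "<|im_end|>", "pad_token": "<|endoftext|>"}
added_tokens = {"<|im_end|>": 151645, "<|endoftext|>": 151643, "</think>": 151668}
config = {"eos_token_id": 151645, "bos_token_id": 151643, "vocab_size": 151936}

# Resolve token strings to IDs through the added-token table.
eos_id = added_tokens[special_tokens_map["eos_token"]]
pad_id = added_tokens[special_tokens_map["pad_token"]]

checks = {
    "eos token matches config.eos_token_id": eos_id == config["eos_token_id"],
    "pad token shares config.bos_token_id": pad_id == config["bos_token_id"],
    "highest added ID fits in vocab_size": max(added_tokens.values()) < config["vocab_size"],
}
for description, passed in checks.items():
    print(f"{description}: {passed}")
```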