Intelligent-Internet/II-Medical-8B

pvduy committed on May 15, 2025

Commit 0cabb6d · verified · 1 Parent(s): 51dd958

Update README.md

Files changed (1)

README.md +7 -6

@@ -11,12 +11,7 @@ tags: []
 ## I. Model Overview
-II-Medical-8B is a medical reasoning model trained on a [comprehensive dataset](https://huggingface.co/datasets/Intelligent-Internet/II-Medical-Reasoning-SFT-V0) of medical knowledge. The model is designed to enhance AI capabilities in medical.
-![Model Benchmark](https://cdn-uploads.huggingface.co/production/uploads/6389496ff7d3b0df092095ed/uvporIhY4_WN5cGaGF1Cm.png)
-Our II-Medical-8B model achieved a 40% score on HealthBench, an open-source benchmark evaluating the performance and safety of large language models in healthcare. This performance is comparable to OpenAI's o1 reasoning model and GPT-4.5, OpenAI's largest and most advanced model to date.
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/6389496ff7d3b0df092095ed/S90HEqD6UJCme-1_17IJw.png)
 ## II. Training Methodology
@@ -45,6 +40,12 @@ For RL stage we setup training with:
 ## III. Evaluation Results
 We evaluate on ten medical QA benchmarks, including MedMCQA, MedQA, PubMedQA, medical-related questions from MMLU-Pro and GPQA, small QA sets from The Lancet and the New England
 Journal of Medicine, the 4 Options and 5 Options splits from the MedBullets platform, and MedXpertQA.

 ## I. Model Overview
+II-Medical-8B is the newest advanced large language model developed by Intelligent Internet, specifically engineered to enhance AI-driven medical reasoning. Following the positive reception of our previous [II-Medical-7B-Preview](https://huggingface.co/Intelligent-Internet/II-Medical-7B-Preview), this new iteration significantly advances the capabilities of medical question answering.
 ## II. Training Methodology
 ## III. Evaluation Results
+![Model Benchmark](https://cdn-uploads.huggingface.co/production/uploads/6389496ff7d3b0df092095ed/uvporIhY4_WN5cGaGF1Cm.png)
+Our II-Medical-8B model achieved a 40% score on HealthBench, an open-source benchmark evaluating the performance and safety of large language models in healthcare. This performance is comparable to OpenAI's o1 reasoning model and GPT-4.5, OpenAI's largest and most advanced model to date.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6389496ff7d3b0df092095ed/S90HEqD6UJCme-1_17IJw.png)
 We evaluate on ten medical QA benchmarks, including MedMCQA, MedQA, PubMedQA, medical-related questions from MMLU-Pro and GPQA, small QA sets from The Lancet and the New England
 Journal of Medicine, the 4 Options and 5 Options splits from the MedBullets platform, and MedXpertQA.