Update README.md
README.md
CHANGED
@@ -11,12 +11,7 @@ tags: []
 
 ## I. Model Overview
 
-II-Medical-8B is
-
-
-
-Our II-Medical-8B model achieved a 40% score on HealthBench, an open-source benchmark evaluating the performance and safety of large language models in healthcare. This performance is comparable to OpenAI's o1 reasoning model and GPT-4.5, OpenAI's largest and most advanced model to date.
-
+II-Medical-8B is the newest advanced large language model developed by Intelligent Internet, specifically engineered to enhance AI-driven medical reasoning. Following the positive reception of our previous [II-Medical-7B-Preview](https://huggingface.co/Intelligent-Internet/II-Medical-7B-Preview), this new iteration significantly advances the capabilities of medical question answering,
 
 ## II. Training Methodology
 
@@ -45,6 +40,12 @@ For RL stage we setup training with:
 
 ## III. Evaluation Results
 
+
+
+Our II-Medical-8B model achieved a 40% score on HealthBench, an open-source benchmark evaluating the performance and safety of large language models in healthcare. This performance is comparable to OpenAI's o1 reasoning model and GPT-4.5, OpenAI's largest and most advanced model to date.
+
+
+
 We evaluate on ten medical QA benchmarks include MedMCQA, MedQA, PubMedQA, medical related questions from MMLU-Pro and GPQA, small QA sets from Lancet and the New England
 Journal of Medicine, 4 Options and 5 Options splits from the MedBullets platform and MedXpertQA.
 