Update README.md
README.md CHANGED

@@ -39,7 +39,7 @@ Based on previous experimentation with LoRA, I think the method does decently we
| mistralai/Mistral-7B-Instruct-v0.2 | 0.77 | 0.37 | 0.77 | 0.85 |
| | | | | |

-To benchmark the model,
+To benchmark the model, general, medical-reasoning, and summarization-specific benchmarks were used. For the general benchmark, Massive Multitask Language Understanding (MMLU) Philosophy (Caballar & Stryker, 2025) was chosen, and for the summarization-specific benchmark, Extreme Summarization (XSum) was used to evaluate the model's ability to generate effective summaries/abstracts from long inputs that may be unstructured or contain heavy technical language (Narayan et al., 2018). My general benchmark plan was as follows:

medqa_4options (Domain / task specific): Assess the model's ability to perform medical reasoning on complex multiple-choice questions and to understand medical information

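For concreteness, the plan above could be run with EleutherAI's lm-evaluation-harness, which ships tasks matching the benchmark names used here (`mmlu_philosophy`, `medqa_4options`, `xsum`). This is only a sketch under that assumption: the diff does not name a runner, and task availability varies by harness version.

```python
# Sketch: running the benchmark plan with EleutherAI's lm-evaluation-harness
# (pip install lm-eval). The harness choice, task names, and batch size are
# assumptions -- the README names the benchmarks but not a runner.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=mistralai/Mistral-7B-Instruct-v0.2",
    tasks=["mmlu_philosophy", "medqa_4options", "xsum"],  # general / medical / summarization
    batch_size=8,
)
# Per-task metrics (e.g., accuracy for MMLU/MedQA, ROUGE for XSum)
print(results["results"])
```

The equivalent CLI invocation would be `lm_eval --model hf --model_args pretrained=mistralai/Mistral-7B-Instruct-v0.2 --tasks mmlu_philosophy,medqa_4options,xsum --batch_size 8`.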