Update README.md
This model is a fine-tuned version of [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it).
It has been trained using [TRL](https://github.com/huggingface/trl).

[ClinicalIntelligence/saama_gemma](https://huggingface.co/ClinicalIntelligence/saama_gemma) is a fine-tuned MedGemma model that transforms unstructured clinical narratives, such as discharge notes, into structured, SDTM-aligned datasets (e.g., Adverse Events, Medical History, Procedures). Trained on an SME-curated dataset derived from MIMIC-III, the model treats clinical data extraction as a complex reasoning task, explicitly evaluating assertion, temporality, and causality to generate accurate, traceable JSON outputs. By learning regulatory semantics directly, it significantly outperforms base models in domain grounding and schema consistency. Current limitations include context-window constraints for lengthy notes, handling of rare abbreviations, and resolution of multi-domain entities.
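For orientation, one extracted Adverse Event record might look like the sketch below. The exact output schema comes from the SME-curated training data and is not published in this card, so the variable names (standard SDTM AE variables) and every value shown are illustrative assumptions only:

```python
# Hypothetical example of a single extracted Adverse Event (AE) record.
# The real schema is defined by the fine-tuning data, not by this card;
# field names here follow standard SDTM AE variable naming for illustration.
example_ae_record = {
    "domain": "AE",               # SDTM domain code
    "AETERM": "nausea",           # reported adverse event term
    "AESTDTC": "2023-04-12",      # start date/time (ISO 8601)
    "AESER": "N",                 # serious-event flag
    "source_text": "Patient reported nausea starting on April 12.",  # traceability
}
```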
## Installation

```
pip install -U transformers
```
## Quick start

**Note:** adjust the `max_new_tokens` parameter as needed; it is set to 3000 by default.

```python
import re

from transformers import pipeline

generator = pipeline(
    # ... (task, model id, and other arguments elided in the diff shown here;
    # the prompt `prefix` and the input `unstructured_text` are likewise elided)
)

output = generator(
    [{"role": "user", "content": prefix + unstructured_text}],
    return_full_text=False,
    max_new_tokens=3000,
)[0]
llm_output = output["generated_text"]
```
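The `import re` at the top of the Quick start suggests the JSON payload is extracted from the model's free-text generation. A minimal sketch of such post-processing follows; the fenced-block convention, the `extract_json` helper, and the field names in the mock response are assumptions for illustration, not part of the model card:

```python
import json
import re

def extract_json(llm_output: str) -> dict:
    """Pull the first JSON object out of the model's generated text.

    Handles both bare JSON and JSON wrapped in a ```json fenced block.
    """
    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", llm_output, re.DOTALL)
    payload = fenced.group(1) if fenced else llm_output
    match = re.search(r"\{.*\}", payload, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# Mock model response with hypothetical field names:
llm_output = 'Here is the record:\n```json\n{"domain": "AE", "term": "nausea"}\n```'
record = extract_json(llm_output)
print(record["domain"])  # prints "AE"
```

In production you would validate the parsed record against the target SDTM schema rather than trusting the raw parse.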