praveenramesh committed
Commit c347354 · verified · 1 Parent(s): c7cc2d1

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -14,12 +14,17 @@ licence: license
14
  This model is a fine-tuned version of [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it).
15
  It has been trained using [TRL](https://github.com/huggingface/trl).
16
 
17
- The ClinicalIntelligence/saama_gemma is a fine-tuned MedGemma model designed to transform unstructured clinical narratives—such as discharge notes—into structured, SDTM-aligned datasets (e.g., Adverse Events, Medical History, Procedures). Trained on an SME-curated dataset derived from MIMIC-III, the model treats clinical data extraction as a complex reasoning task, explicitly evaluating assertion, temporality, and causality to generate accurate, traceable JSON outputs. By learning regulatory semantics directly, it significantly outperforms base models in domain grounding and schema consistency. Users should note current limitations regarding context window constraints for lengthy notes, rare abbreviation handling, and the resolution of multi-domain entities.
18
 
19
 
20
 
 
 
 
 
21
 
22
  ## Quick start
 
23
 
24
  ```python
25
  import re
@@ -36,6 +41,7 @@ generator = pipeline(
36
  output = generator(
37
  [{"role": "user", "content": prefix + unstructured_text}],
38
  return_full_text=False,
 
39
  )[0]
40
  llm_output = output["generated_text"]
41
 
 
14
  This model is a fine-tuned version of [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it).
15
  It has been trained using [TRL](https://github.com/huggingface/trl).
16
 
17
+ The [ClinicalIntelligence/saama_gemma](https://huggingface.co/ClinicalIntelligence/saama_gemma) is a fine-tuned MedGemma model designed to transform unstructured clinical narratives—such as discharge notes—into structured, SDTM-aligned datasets (e.g., Adverse Events, Medical History, Procedures). Trained on an SME-curated dataset derived from MIMIC-III, the model treats clinical data extraction as a complex reasoning task, explicitly evaluating assertion, temporality, and causality to generate accurate, traceable JSON outputs. By learning regulatory semantics directly, it significantly outperforms base models in domain grounding and schema consistency. Users should note current limitations regarding context window constraints for lengthy notes, rare abbreviation handling, and the resolution of multi-domain entities.
18
 
19
 
20
 
21
+ ## Installation
22
+ ```
23
+ pip install -U transformers
24
+ ```
25
 
26
  ## Quick start
27
+ **NOTE**: Adjust the `max_new_tokens` parameter as needed; the example below sets it to 3000, which may be too low for long notes or unnecessarily high for short ones.
28
 
29
  ```python
30
  import re
 
41
  output = generator(
42
  [{"role": "user", "content": prefix + unstructured_text}],
43
  return_full_text=False,
44
+ max_new_tokens=3000
45
  )[0]
46
  llm_output = output["generated_text"]
47
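
The hunk ends at `llm_output`, and the diff does not show how that string is turned into structured data. The following is a minimal sketch, assuming the model returns a JSON object or array that may be wrapped in a Markdown code fence or surrounded by extra text. The helper name `extract_json` is hypothetical and is not part of the README or of `transformers`.

```python
import json
import re


def extract_json(llm_output: str):
    """Return the first JSON value found in the model output, or None.

    Assumption: the output contains one JSON object or array, optionally
    inside a Markdown code fence and optionally surrounded by prose.
    """
    # Prefer the body of a fenced block if the model emitted one.
    fenced = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", llm_output, re.DOTALL)
    candidate = fenced.group(1) if fenced else llm_output

    decoder = json.JSONDecoder()
    # Try to decode starting at each '{' or '[' until one position parses.
    for match in re.finditer(r"[{\[]", candidate):
        try:
            value, _end = decoder.raw_decode(candidate, match.start())
            return value
        except json.JSONDecodeError:
            continue
    return None
```

Calling `extract_json(llm_output)` returns the parsed Python object, or `None` when no valid JSON is found. A `None` result can indicate that generation was cut off by `max_new_tokens` before the JSON was complete.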