Llama2-7B-MIMIC-iii-Extraction-v1
Model Description
This model is a fine-tuned version of Llama-2-7b-chat-hf designed for Structured Clinical Information Extraction. It has been specifically trained to process unstructured clinical notes (discharge summaries) from the MIMIC-III database and transform them into a structured JSON format.
The model can identify and extract key medical entities such as:
- Drug names
- Dosages
- Frequency of administration
- Indications/Reasons for treatment
Training Procedure
The model was fine-tuned using QLoRA (4-bit quantization) to ensure efficiency and high performance.
Training Hyperparameters:
- Base Model: NousResearch/Llama-2-7b-chat-hf
- Method: LoRA (Low-Rank Adaptation)
- Max Sequence Length: 2048 tokens
- Learning Rate: 2e-4
- Batch Size: 1 (with 4 gradient accumulation steps)
- Optimizer: paged_adamw_32bit
- Precision: 4-bit (bitsandbytes)
LoRA Configuration:
- r (Rank): 16
- lora_alpha: 32
- Target Modules: q_proj, v_proj, k_proj, o_proj (Attention layers)
- lora_dropout: 0.05
Intended Use
This model is intended for researchers and developers working on clinical natural language processing (NLP). It is designed to assist in converting medical narratives into machine-readable data.
How to use:
To use this model, you need to load it as a PEFT (Adapter) on top of the base Llama-2-7b-chat-hf model.
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
base_model_name = "NousResearch/Llama-2-7b-chat-hf"
adapter_model_name = "maherghanem86/PharmaCompass"
model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, adapter_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)