SDTM Mapping LoRA Model (Qwen2.5-1.5B)
This repository contains a LoRA fine-tuned model for mapping raw clinical data variables to CDISC SDTM domains and variables, with concise reasoning.
The model is intended to assist in clinical data standardization workflows, not to replace expert review.
Task
Given a raw variable from an EDC or source dataset, the model predicts:
- SDTM_DOMAIN
- SDTM_VARIABLE
- REASONING (one concise sentence)
Base Model
- Qwen/Qwen2.5-1.5B
This repository contains LoRA adapters only.
The base model must be loaded separately.
Training Details
- Method: LoRA (Parameter-Efficient Fine-Tuning)
- Epochs: 2
- Task Type: Instruction-following, structured JSON output
- Data:
  - De-identified
  - Synthetic + curated SDTM mapping examples
  - No confidential or patient-level data
Output Format
The model is designed to return JSON only:
{
  "SDTM_DOMAIN": "AE",
  "SDTM_VARIABLE": "AESEV",
  "REASONING": "AESEV represents the severity of an adverse event and is a standard variable in the AE domain."
}
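Because the output is meant to be a single JSON object with three fixed keys, it can be checked before it is passed downstream. The following is a minimal sketch, not part of the released code: the function name `validate_mapping` and the rule that each field must be a non-empty string are assumptions made for illustration.

```python
import json

# The three keys shown in the Output Format example above.
REQUIRED_KEYS = ("SDTM_DOMAIN", "SDTM_VARIABLE", "REASONING")


def validate_mapping(raw_text):
    """Hypothetical helper: parse model output and check the expected fields.

    Raises ValueError if the text is not a JSON object, if a required key is
    missing, or if a required value is not a non-empty string (the last rule
    is an assumption, not something the model card specifies).
    """
    obj = json.loads(raw_text)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in obj]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    for key in REQUIRED_KEYS:
        value = obj[key]
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{key} must be a non-empty string")
    return obj
```

A failed check is a signal to retry generation or route the variable to manual review, in line with the note that the model does not replace expert review.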
## Load Model
---------------
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "Qwen/Qwen2.5-1.5B"
LORA_REPO = "karamalanagendra/sdtm-qwen-lora"

# Load the tokenizer and reuse the EOS token for padding
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token_id = tokenizer.eos_token_id

# Load the base model first, then attach the LoRA adapters to it
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    device_map="cpu",  # or "cuda" if a GPU is available
)
model = PeftModel.from_pretrained(base_model, LORA_REPO)
model.eval()  # inference mode
-------------------
## Sample Test
-------------------
prompt = """
You are an SDTM mapping engine.
Return ONLY valid JSON.
Do NOT explain.
Do NOT add text outside JSON.
Instruction:
Map the raw variable to SDTM.
Input:
Table: AE_RAW
Variable: AEACN2
Description: Action taken with Study Medication
Output JSON (return ONE object only):
{
"SDTM_DOMAIN": "",
"SDTM_VARIABLE": "",
"REASONING": ""
}
"""
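# cpu_generate_json and extract_last_json are user-supplied helpers:
# the first runs generation on the prompt, the second parses the last
# JSON object found in the generated text.
raw_output = cpu_generate_json(prompt)
parsed = extract_last_json(raw_output)
print(parsed)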
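The sample test calls `cpu_generate_json` and `extract_last_json`, whose definitions are not shown here. The following is a minimal sketch of what they might look like, not the author's implementation. It assumes `model` and `tokenizer` exist as globals loaded as in the Load Model section; greedy decoding and the `max_new_tokens=128` default are illustrative choices.

```python
import json


def cpu_generate_json(prompt, max_new_tokens=128):
    """Hypothetical helper: greedy-decode a completion for `prompt`.

    Assumes `model` and `tokenizer` are globals loaded as in Load Model.
    Returns only the newly generated text, without the prompt.
    """
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumed budget for one JSON object
        do_sample=False,                # greedy decoding for repeatable output
        pad_token_id=tokenizer.pad_token_id,
    )
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)


def extract_last_json(text):
    """Hypothetical helper: return the last JSON object in `text`, or None."""
    decoder = json.JSONDecoder()
    last = None
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON from this brace; try the next one
            continue
        if isinstance(obj, dict):
            last = obj
        pos = end  # resume scanning after the decoded object
    return last
```

Taking the last object rather than the first is a deliberate choice in this sketch: if the generated text repeats the empty template before the answer, the filled-in object is the one that comes back. `extract_last_json` returns `None` when no object can be parsed, so callers should handle that case.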