SDTM Mapping LoRA Model (Qwen2.5-1.5B)

This repository contains a LoRA fine-tuned model for mapping raw clinical data variables to CDISC SDTM domains and variables, with concise reasoning.

The model is intended to assist in clinical data standardization workflows, not to replace expert review.


Task

Given a raw variable from an electronic data capture (EDC) system or other source dataset, the model predicts:

  • SDTM_DOMAIN
  • SDTM_VARIABLE
  • REASONING (one concise sentence)

Base Model

  • Qwen/Qwen2.5-1.5B

This repository contains LoRA adapters only.
The base model must be loaded separately.


πŸ‹οΈ Training Details

  • Method: LoRA (Parameter-Efficient Fine-Tuning)
  • Epochs: 2
  • Task Type: Instruction-following, structured JSON output
  • Data:
    • De-identified
    • Synthetic + curated SDTM mapping examples
    • No confidential or patient-level data

Output Format

The model is designed to return JSON only:

{
  "SDTM_DOMAIN": "AE",
  "SDTM_VARIABLE": "AESEV",
  "REASONING": "AESEV represents the severity of an adverse event and is a standard variable in the AE domain."
}
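Because the output is meant to be machine-readable, it can help to check each response before it enters a mapping workflow. The following is a minimal sketch, not part of this repository: the `validate_mapping` function and the `REQUIRED_KEYS` constant are hypothetical names, and the only rule assumed is that all three fields are present as non-empty strings.

```python
import json

# Field names taken from the output format above.
REQUIRED_KEYS = ("SDTM_DOMAIN", "SDTM_VARIABLE", "REASONING")


def validate_mapping(raw):
    """Parse a JSON string and return (mapping, errors).

    mapping is None when the text is not a JSON object;
    errors is an empty list when all required fields are present.
    """
    try:
        mapping = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc.msg}"]
    if not isinstance(mapping, dict):
        return None, ["top-level value is not a JSON object"]

    errors = []
    for key in REQUIRED_KEYS:
        value = mapping.get(key)
        # Each field must be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            errors.append(f"missing or empty field: {key}")
    return mapping, errors
```

A response that passes this check is only structurally valid; whether the predicted domain and variable are correct still needs expert review.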

## Load Model
---------------

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "Qwen/Qwen2.5-1.5B"
LORA_REPO = "karamalanagendra/sdtm-qwen-lora"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token_id = tokenizer.eos_token_id

base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    device_map="cpu"  # or "cuda" if GPU available
)

model = PeftModel.from_pretrained(base_model, LORA_REPO)
model.eval()
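
The Sample Test section calls `cpu_generate_json` and `extract_last_json`, which this card does not define. The sketch below shows one way they might look; it is an assumption, not the author's implementation. The factory name `make_cpu_generate_json`, the `max_new_tokens` default of 128, and the choice of greedy decoding are all illustrative.

```python
import json


def make_cpu_generate_json(model, tokenizer, max_new_tokens=128):
    """Return a one-argument cpu_generate_json(prompt) bound to a model and tokenizer."""

    def cpu_generate_json(prompt):
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,  # assumed limit, tune as needed
            do_sample=False,                # greedy decoding for repeatable output
            pad_token_id=tokenizer.pad_token_id,
        )
        # Decode the full sequence (prompt plus completion).
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return cpu_generate_json


def extract_last_json(text):
    """Return the last top-level JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    last = None
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not a complete object here; try the next opening brace.
            pos = text.find("{", pos + 1)
            continue
        last = obj
        # Continue scanning after the object that was just parsed.
        pos = text.find("{", end)
    return last


# With the model and tokenizer loaded as above:
# cpu_generate_json = make_cpu_generate_json(model, tokenizer)
```

Taking the last object matters because the decoded text includes the prompt, and the prompt itself contains an empty JSON template that would otherwise be parsed first.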

## Sample Test
---------------

prompt = """
You are an SDTM mapping engine.
Return ONLY valid JSON.
Do NOT explain.
Do NOT add text outside JSON.

Instruction:
Map the raw variable to SDTM.

Input:
Table: AE_RAW
Variable: AEACN2
Description: Action taken with Study Medication

Output JSON (return ONE object only):
{
  "SDTM_DOMAIN": "",
  "SDTM_VARIABLE": "",
  "REASONING": ""
}


"""

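# NOTE: cpu_generate_json() and extract_last_json() must be defined before this runs.
# Judging by their names, the first generates text for the prompt on CPU and the
# second parses the last JSON object found in that text.
raw_output = cpu_generate_json(prompt)
parsed = extract_last_json(raw_output)

print(parsed)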
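
To map more than one variable, the prompt above can be produced from a template instead of being edited by hand. This is a minimal sketch that reuses the prompt wording from the sample; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, not part of this repository.

```python
# Same wording as the sample prompt, with placeholders for the input fields.
# Literal braces of the JSON skeleton are doubled so str.format leaves them intact.
PROMPT_TEMPLATE = """
You are an SDTM mapping engine.
Return ONLY valid JSON.
Do NOT explain.
Do NOT add text outside JSON.

Instruction:
Map the raw variable to SDTM.

Input:
Table: {table}
Variable: {variable}
Description: {description}

Output JSON (return ONE object only):
{{
  "SDTM_DOMAIN": "",
  "SDTM_VARIABLE": "",
  "REASONING": ""
}}


"""


def build_prompt(table, variable, description):
    """Fill the sample prompt with one raw variable's metadata."""
    return PROMPT_TEMPLATE.format(
        table=table, variable=variable, description=description
    )


prompt = build_prompt("AE_RAW", "AEACN2", "Action taken with Study Medication")
```

Each generated prompt can then be passed through the same `cpu_generate_json` and `extract_last_json` calls used in the sample.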




