license: llama3.1
base_model: DUTIR-BioNLP/RexDrug-base
library_name: peft
pipeline_tag: text-generation
tags:
  - drug-combination
  - relation-extraction
  - biomedical
  - llama
  - chain-of-thought
  - lora
  - grpo

RexDrug-adapter

This is the LoRA adapter for RexDrug, trained via GRPO (Group Relative Policy Optimization) on top of RexDrug-base for biomedical drug combination relation extraction with chain-of-thought reasoning.
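
For readers unfamiliar with GRPO, its core idea is that several completions are sampled for the same prompt and each one is scored relative to the others in its group, rather than against a learned value function. The sketch below illustrates only that normalization step. The reward values are hypothetical, the use of the population standard deviation is an assumption, and the actual reward function used to train RexDrug is not described in this card.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of one sampled group to zero mean and unit scale."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled for the same prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group mean receive a positive advantage and are reinforced, while those below the mean are discouraged.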

Model Details

Base model: DUTIR-BioNLP/RexDrug-base (Llama-3.1-8B-Instruct + SFT)
Fine-tuning method: GRPO with LoRA (r=64, alpha=128)
Task: Drug combination relation extraction from biomedical literature
Relation types: POS (beneficial), NEG (harmful), COMB (neutral/mixed), NO_COMB (no combination)
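
As a rough illustration of what r=64 and alpha=128 imply, the sketch below computes the standard LoRA scaling factor and the number of trainable parameters added to a single weight matrix. It assumes the usual alpha / r scaling (not a rank-stabilized variant) and uses a 4096 x 4096 projection as the example shape, matching the hidden size of Llama-3.1-8B. This card does not state which modules the adapter targets, so the example covers one matrix only.

```python
# Values stated in the model card.
LORA_RANK = 64
LORA_ALPHA = 128

# Standard LoRA scales the low-rank update B @ A by alpha / r.
scaling = LORA_ALPHA / LORA_RANK


def lora_param_count(d_in, d_out, rank=LORA_RANK):
    """Parameters added to one d_out x d_in weight: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# Assumption: a square 4096 x 4096 projection (hidden size of Llama-3.1-8B).
params_per_matrix = lora_param_count(4096, 4096)
print(scaling)            # 2.0
print(params_per_matrix)  # 524288
```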

Quick Start

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# 1. Load the tokenizer and base model, then attach the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained("DUTIR-BioNLP/RexDrug-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "DUTIR-BioNLP/RexDrug-base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, "DUTIR-BioNLP/RexDrug-adapter")
model.eval()

# 2. Prepare input
messages = [
    {"role": "system", "content": "You are an expert in biomedical drug-drug relation extraction. ..."},
    {"role": "user",   "content": "Target sentence: ... \nContext paragraph: ..."},
]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# 3. Generate with greedy decoding, then decode only the newly generated tokens
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)

See the full example in the GitHub repository.
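
Because the model reasons step by step before answering, the decoded response usually needs post-processing to recover the predicted relation. The exact output format is not documented in this card, so the helper below is a hypothetical sketch: it assumes the label appears verbatim as one of the four relation types and that the last occurrence in the response is the final answer. `RELATION_LABELS` and `extract_last_label` are illustrative names, not part of the RexDrug codebase.

```python
import re
from typing import Optional

# Label names come from the model card; the descriptions paraphrase it.
RELATION_LABELS = {
    "POS": "beneficial combination",
    "NEG": "harmful combination",
    "COMB": "neutral or mixed combination",
    "NO_COMB": "no combination",
}

# "_" counts as a word character, so \bCOMB\b does not match inside "NO_COMB".
_LABEL_RE = re.compile(r"\b(NO_COMB|POS|NEG|COMB)\b")


def extract_last_label(response: str) -> Optional[str]:
    """Return the last relation label mentioned in the response, or None."""
    matches = _LABEL_RE.findall(response)
    return matches[-1] if matches else None


# Hypothetical response text, for illustration only.
example = "The drugs were given together without a stated effect. Relation: COMB"
label = extract_last_label(example)
print(label, "->", RELATION_LABELS.get(label))
```

If the repository's full example defines a stricter output format, parsing against that format is preferable to this pattern match.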

License

This model is built upon Llama 3.1 and is subject to the Llama 3.1 Community License Agreement.