Qwen3-8B-LoRA-ContextBioEL-Reranker-SFT

This repository provides a LoRA adapter on top of Qwen/Qwen3-8B for the reranker stage of a clinical biomedical entity linking pipeline.

This model reranks a top-10 candidate list using the rewritten term, marked note context, and candidate semantic tags, and outputs the best concept_id. It was trained with supervised fine-tuning (SFT).

Model type

  • Base model: Qwen/Qwen3-8B
  • Adapter type: LoRA
  • Stage: Reranker
  • Training: SFT
  • Task: Context-aware biomedical entity linking reranking

Intended use

Inputs:

  • rewritten_term
  • context_marked, where the target mention is explicitly enclosed by <mention>...</mention>
  • candidates, a top-10 list in which each candidate has:
    • concept_id
    • concept_name
    • semantic_tag

Output:

  • exactly one selected concept_id in the <answer>...</answer> block
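For concreteness, a minimal input payload might look like the following sketch. The field names mirror the list above; the specific concepts and IDs are reused from the usage example below and are illustrative only:

```python
import json

# Illustrative reranker input; concept names/IDs are examples, not an exhaustive top-10 list.
payload = {
    "rewritten_term": "acute myocardial infarction",
    "context_marked": "The patient was admitted for <mention>heart attack</mention> yesterday.",
    "candidates": [
        {"concept_id": "22298006", "concept_name": "myocardial infarction", "semantic_tag": "disorder"},
        {"concept_id": "57054005", "concept_name": "acute myocardial infarction", "semantic_tag": "disorder"},
    ],
}

# The model's final output embeds exactly one candidate concept_id, e.g.:
expected_output = "<answer>57054005</answer>"

print(json.dumps(payload, indent=2))
```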

This model is intended for research use in biomedical entity linking pipelines.

Important decoding note

This adapter was trained with reasoning-style outputs.

Please:

  • enable thinking
  • do not use greedy decoding

Recommended decoding:

  • set do_sample=True (greedy decoding degrades the reasoning-style outputs)
  • use temperature/top-p sampling, e.g. temperature=0.6, top_p=0.95 as in the example below
  • parse the final prediction from the <answer>...</answer> span

Usage example

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

base_model_path = "Qwen/Qwen3-8B"
adapter_path = "Tao-AI-Informatics/Qwen3-8B-LoRA-ContextBioEL-Reranker-SFT"

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_path)

cands_json = json.dumps([
    {"concept_id": "22298006", "concept_name": "myocardial infarction", "semantic_tag": "disorder"},
    {"concept_id": "57054005", "concept_name": "acute myocardial infarction", "semantic_tag": "disorder"}
], indent=2)

messages = [
    {
        "role": "system",
        "content": (
            "You are a clinical concept normalization model that reranks a top-10 candidate list using context and semantic tags.\n\n"
            "Inputs you will receive:\n"
            "- rewritten_term\n"
            "- context_marked with <mention>...</mention>\n"
            "- candidates: top-10 items (concept_id, concept_name, semantic_tag)\n\n"
            "Think before answer\n\n"
            "Output ONLY:\n"
            "<think>...</think>\n"
            "<answer>...</answer>\n\n"
            "In <think>, write a detailed reasoning with these parts:\n"
            "1) Context interpretation: what the mention means in this note (section cues, negation, experiencer, temporality).\n"
            "2) Type inference: what semantic type/tag is expected (and why other tags are wrong).\n"
            "3) Candidate comparison: evaluate multiple candidates. Note over-specific vs too-general, added qualifiers, and tag alignment.\n"
            "4) Decision: justify the final choice.\n\n"
            "In <answer>, use exactly one of:\n"
            "- <answer><concept_id></answer>\n"
        ),
    },
    {
        "role": "user",
        "content": (
            "Task: Choose the best concept_id from candidates.\n\n"
            "rewritten_term:\nacute myocardial infarction\n\n"
            "context_marked:\n"
            "The patient was admitted for <mention>heart attack</mention> yesterday.\n\n"
            f"candidates (top10; no scores):\n{cands_json}"
        ),
    },
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # Qwen3 chat-template flag; this adapter expects reasoning-style outputs
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=False))

Notes

  • This is a LoRA adapter, not a standalone full model.

  • The adapter is designed for the reranking stage; it does not perform retrieval or generate the candidate list itself.

  • In a typical pipeline, a retriever produces the top-10 candidate list, which this model then reranks to select the final concept_id.

Limitations

  • This model is intended for research use only.

  • Performance may vary across ontologies, institutions, and note styles.

  • The model should be evaluated carefully before any real-world deployment.

  • The final predicted concept_id should be extracted from the <answer>...</answer> block.

Citation

If you use this model, please cite the associated paper when available.
