---
base_model: Qwen/Qwen3-8B
library_name: peft
pipeline_tag: text-generation
tags:
- qwen
- qwen3
- lora
- peft
- biomedical-entity-linking
- clinical-nlp
- concept-normalization
- reranking
- candidate-ranking
- reasoning
- reinforcement-learning
license: other
---

| # Qwen3-8B-LoRA-ContextBioEL-Reranker-RL |
|
|
This repository provides a LoRA adapter for Qwen3-8B that implements the reranker stage of a clinical biomedical entity linking pipeline.

This model reranks a top-10 candidate list using the rewritten term, the marked note context, and each candidate's semantic tag, and outputs the best `concept_id`. It was further optimized with reinforcement learning (RL) for entity-linking-oriented reranking behavior.
## Model type

- Base model: Qwen/Qwen3-8B
- Adapter type: LoRA
- Stage: Reranker
- Training: Reinforcement learning (RL)
- Task: Context-aware biomedical entity linking reranking

## Intended use

| Inputs: |
| - `rewritten_term` |
| - `context_marked`, where the target mention is explicitly enclosed by `<mention>...</mention>` |
- `candidates`, a top-10 candidate list in which each item contains:
  - `concept_id`
  - `concept_name`
  - `semantic_tag`
|
|
| Output: |
| - exactly one selected `concept_id` in the `<answer>...</answer>` block |
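For illustration, the input fields above can be assembled as follows. This is a sketch: the candidate values are made-up examples, and `expected_output` simply shows the shape of a well-formed answer block.

```python
import json

# Hypothetical example of the three reranker inputs described above.
example_input = {
    "rewritten_term": "acute myocardial infarction",
    "context_marked": "The patient was admitted for <mention>heart attack</mention> yesterday.",
    "candidates": [  # top-10 in practice; truncated here
        {"concept_id": "22298006", "concept_name": "myocardial infarction", "semantic_tag": "disorder"},
        {"concept_id": "57054005", "concept_name": "acute myocardial infarction", "semantic_tag": "disorder"},
    ],
}

# The model should return exactly one candidate's concept_id, e.g.:
expected_output = "<answer>57054005</answer>"

print(json.dumps(example_input, indent=2))
```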
|
|
| This model is intended for research use in biomedical entity linking pipelines. |
|
|
| ## Important decoding note |
|
|
| This adapter was trained with reasoning-style outputs. |
|
|
Please:
- enable thinking (for Qwen3, `enable_thinking=True` is the default in `apply_chat_template`)
- do not use greedy decoding
|
|
| Recommended decoding: |
| - `do_sample=True` |
| - non-greedy decoding such as temperature/top-p sampling |
| - parse the final prediction from the `<answer>...</answer>` span |
|
|
| ## Usage example |
|
|
| ```python |
| from transformers import AutoTokenizer, AutoModelForCausalLM |
| from peft import PeftModel |
| import torch |
| import json |
# Hub paths for the base model and this LoRA adapter.
| base_model_path = "Qwen/Qwen3-8B" |
| adapter_path = "Tao-AI-Informatics/Qwen3-8B-LoRA-ContextBioEL-Reranker-RL" |
| |
| tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True) |
| base_model = AutoModelForCausalLM.from_pretrained( |
| base_model_path, |
| torch_dtype=torch.bfloat16, |
| device_map="auto", |
| trust_remote_code=True, |
| ) |
| model = PeftModel.from_pretrained(base_model, adapter_path) |
# Example candidate list (top-10 in practice; truncated to two items here).
| cands_json = json.dumps([ |
| {"concept_id": "22298006", "concept_name": "myocardial infarction", "semantic_tag": "disorder"}, |
| {"concept_id": "57054005", "concept_name": "acute myocardial infarction", "semantic_tag": "disorder"} |
| ], indent=2) |
# Chat messages: system prompt describing the reranking task, plus the user request.
| messages = [ |
| { |
| "role": "system", |
| "content": ( |
| "You are a clinical concept normalization model that reranks a top-10 candidate list using context and semantic tags.\n\n" |
| "Inputs you will receive:\n" |
| "- rewritten_term\n" |
| "- context_marked with <mention>...</mention>\n" |
| "- candidates: top-10 items (concept_id, concept_name, semantic_tag)\n\n" |
| "Think before answer\n\n" |
| "Output ONLY:\n" |
| "<think>...</think>\n" |
| "<answer>...</answer>\n\n" |
| "In <think>, write a detailed reasoning with these parts:\n" |
| "1) Context interpretation: what the mention means in this note (section cues, negation, experiencer, temporality).\n" |
| "2) Type inference: what semantic type/tag is expected (and why other tags are wrong).\n" |
| "3) Candidate comparison: evaluate multiple candidates. Note over-specific vs too-general, added qualifiers, and tag alignment.\n" |
| "4) Decision: justify the final choice.\n\n" |
| "In <answer>, use exactly one of:\n" |
| "- <answer><concept_id></answer>\n" |
| ), |
| }, |
| { |
| "role": "user", |
| "content": ( |
| "Task: Choose the best concept_id from candidates.\n\n" |
| "rewritten_term:\nacute myocardial infarction\n\n" |
| "context_marked:\n" |
| "The patient was admitted for <mention>heart attack</mention> yesterday.\n\n" |
| f"candidates (top10; no scores):\n{cands_json}" |
| ), |
| }, |
| ] |
# Render the chat template and append the generation prompt.
| text = tokenizer.apply_chat_template( |
| messages, |
| tokenize=False, |
| add_generation_prompt=True, |
| ) |
# Tokenize and move inputs to the model's device.
| inputs = tokenizer(text, return_tensors="pt").to(model.device) |
# Sample non-greedily, per the decoding note above.
| with torch.no_grad(): |
| outputs = model.generate( |
| **inputs, |
| max_new_tokens=512, |
| do_sample=True, |
| temperature=0.6, |
| top_p=0.95, |
| ) |
# The final prediction appears inside the <answer>...</answer> block.
| print(tokenizer.decode(outputs[0], skip_special_tokens=False)) |
| ``` |
|
|
| ## Notes |
|
|
| - This is a LoRA adapter, not a standalone full model. |
|
|
- The adapter is designed for the reranking stage; it does not perform candidate retrieval by itself.
|
|
- In downstream pipelines, an upstream retriever produces the top-10 candidate list, which this reranker then resolves to a final `concept_id`.
|
|
| ## Limitations |
|
|
| - This model is intended for research use only. |
|
|
| - Performance may vary across ontologies, institutions, and note styles. |
|
|
| - The model should be evaluated carefully before any real-world deployment. |
|
|
- The final predicted `concept_id` should be extracted from the `<answer>...</answer>` block.
|
|
| ## Citation |
|
|
| If you use this model, please cite the associated paper when available. |