---
license: apache-2.0
library_name: peft
pipeline_tag: text-classification
tags:
- biomedical
- rag
- fact-checking
- nli
- entailment
- peft
- lora
- medragchecker
# IMPORTANT:
# - Do NOT set `base_model` to a local filesystem path. If you want to specify it, use a valid Hugging Face model id (org/name).
# - Because this repo contains adapters for multiple different base models, `base_model` is intentionally omitted.
---
# MedRAGChecker Student Checker — LoRA Adapters
This repository hosts **LoRA adapters** for the *checker* component used in the MedRAGChecker project.
The checker is trained as an NLI-style verifier that classifies an **(evidence, claim)** pair into one of three labels:
- **Entail**
- **Neutral**
- **Contradict**
These adapters are intended for **research and evaluation** (e.g., ensembling multiple checkers trained with different base models and/or training recipes such as SFT vs. GRPO).
> This repo contains **adapters only**. You must load each adapter on top of its corresponding **base model**.
---
## Contents
Each adapter subfolder typically includes:
- `adapter_config.json`
- `adapter_model.safetensors` (or `.bin`)
---
## Available adapters
All adapters live under the `Checker/` directory:
| Adapter subfolder | Base model (HF id) | Training recipe | Notes |
|---|---|---|---|
| `Checker/med42-llama3-8b-sft` | `<PUT_BASE_MODEL_ID_HERE>` | SFT | |
| `Checker/med42-llama3-8b-grpo` | `<PUT_BASE_MODEL_ID_HERE>` | GRPO | |
| `Checker/meditron-sft` | `<PUT_BASE_MODEL_ID_HERE>` | SFT | |
| `Checker/meditron-grpo` | `<PUT_BASE_MODEL_ID_HERE>` | GRPO | |
| `Checker/PMC_LLaMA_13B-sft` | `<PUT_BASE_MODEL_ID_HERE>` | SFT | |
| `Checker/qwen2-med-7b-sft` | `<PUT_BASE_MODEL_ID_HERE>` | SFT | |
| `Checker/qwen2-med-7b-grpo` | `<PUT_BASE_MODEL_ID_HERE>` | GRPO | |
### How to fill the “Base model (HF id)” column
Use a valid Hugging Face Hub model id (format: `org/name`). Examples:
- `meta-llama/Meta-Llama-3-8B-Instruct`
- `Qwen/Qwen2-7B-Instruct`
If your base model is **not** available on the Hub (only stored locally), you can either:
1) upload the base model to a private Hub repo and reference that id here, or
2) keep this field as `N/A (local)` and document your local loading instructions.
---
## Quickstart: load an adapter with PEFT
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel
# 1) Choose the base model that matches the adapter you want to use
base_model_id = "<HF_BASE_MODEL_ID>"

# 2) Choose the adapter subfolder inside this repo
repo_id = "JoyDaJun/Medragchecker-Student-Checker"
subfolder = "Checker/qwen2-med-7b-sft"  # example

tokenizer = AutoTokenizer.from_pretrained(base_model_id, use_fast=True)
# num_labels=3 matches the Entail / Neutral / Contradict label set
model = AutoModelForSequenceClassification.from_pretrained(base_model_id, num_labels=3)
model = PeftModel.from_pretrained(model, repo_id, subfolder=subfolder)
model.eval()
```
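Once the model and adapter are loaded, inference amounts to tokenizing the (evidence, claim) pair, taking the logits, and applying a softmax. The sketch below keeps the post-processing framework-free; note that the label order (`0 = Entail, 1 = Neutral, 2 = Contradict`) is an **assumption** here — verify the actual mapping against your training configuration before relying on it.

```python
import math

# ASSUMED label order — check your training config / adapter_config.json.
ID2LABEL = {0: "Entail", 1: "Neutral", 2: "Contradict"}

def logits_to_prediction(logits):
    """Turn raw 3-class logits into (label, probabilities) via a stable softmax."""
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return ID2LABEL[probs.index(max(probs))], probs
```

With the quickstart model above, the logits come from `model(**tokenizer(evidence, claim, return_tensors="pt")).logits`.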
> If your checker was trained with a **Causal LM** head instead of a sequence-classification head,
> replace `AutoModelForSequenceClassification` with `AutoModelForCausalLM` and use the same prompt/template as in training.
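For the Causal LM variant, the verifier reads a formatted prompt and generates the label token(s). The template below is purely **hypothetical** — this repo does not document the actual training prompt, so you must substitute the exact format your training run used.

```python
# HYPOTHETICAL prompt template — replace with the template actually used
# during training; a mismatched template will degrade accuracy.
PROMPT_TEMPLATE = (
    "Evidence: {evidence}\n"
    "Claim: {claim}\n"
    "Question: Does the evidence entail, contradict, or remain neutral "
    "toward the claim?\n"
    "Answer:"
)

def build_prompt(evidence: str, claim: str) -> str:
    """Format one (evidence, claim) pair for a causal-LM checker."""
    return PROMPT_TEMPLATE.format(evidence=evidence, claim=claim)
```

Feed `build_prompt(...)` through the tokenizer and `model.generate(...)`, then map the generated answer back to Entail / Neutral / Contradict.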
---
## Ensemble usage (optional)
If you trained multiple student checkers, you can ensemble them (e.g., by weighting each checker’s class probabilities using dev-set reliability such as per-class F1).
This often helps stabilize performance across **Entail / Neutral / Contradict**, especially under class imbalance.
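The weighted-probability scheme described above can be sketched in a few lines. This is a minimal illustration, not the project's reference implementation; the weights are assumed to be non-negative reliability scores (e.g. each checker's dev-set macro-F1).

```python
def ensemble_probs(prob_lists, weights):
    """Weighted average of per-checker class-probability vectors.

    prob_lists: one [p_entail, p_neutral, p_contradict] list per checker.
    weights:    non-negative reliability scores (normalized internally).
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(norm, prob_lists))
        for c in range(n_classes)
    ]
```

The final prediction is then the argmax of the returned vector; per-class weights (rather than one scalar per checker) are a natural extension when class imbalance is severe.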
---
## Limitations & responsible use
- **Not medical advice.** Do not use for clinical decision-making.
- Outputs may reflect biases or errors from training data and teacher supervision.
- Please evaluate on your target dataset and report limitations clearly.
---
## Citation
If you use these adapters, please cite the MedRAGChecker paper/project:
```bibtex
@article{medragchecker,
  title={MedRAGChecker: A Claim-Level Verification Framework for Biomedical RAG},
  author={...},
  year={2025}
}
```