---
base_model: Qwen/Qwen2.5-1.5B-Instruct
library_name: peft
tags:
- medical
- asr-correction
- robustness
- qlora
---
# 🔥 Edge-Native Medical Phonetic Denoiser (1.5B Adapter)
**Author:** Yash Sharma

**Context:** Developed as a reproduction of the paper "Evaluating Robustness in LLM-based Medical Chatbots" (Wadhwani AI).
## 🎯 Model Description
This is a QLoRA adapter fine-tuned from `Qwen/Qwen2.5-1.5B-Instruct`. It is designed to run on resource-constrained edge devices (a T4 GPU or CPU) and corrects severe phonetic errors introduced by ASR (Automatic Speech Recognition) in medical queries.
- Recovery: Improves RAG retrieval recall from 34% (Noisy) to 52% (Denoised).
- Training: Trained on 600 samples of "Brutally Noised" HealthCareMagic data.
- Compute: 4-bit Quantization (NF4).
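The exact "brutal noising" rules used to corrupt the HealthCareMagic training data are not published in this card, but the idea is to simulate ASR phonetic confusions and dropped characters. A minimal illustrative sketch (the substitution table and `brutal_noise` helper are my own assumptions, not the actual training code):

```python
import random

# Illustrative phonetic confusions an ASR system might make.
# These pairs are assumptions for demonstration, not the paper's rules.
PHONETIC_SWAPS = {
    "v": "b",       # "virus" -> "birus"
    "th": "d",      # "the"   -> "de"
    "ph": "f",
    "what": "wat",
    "are": "r",
}

def brutal_noise(text: str, drop_prob: float = 0.1, seed: int = 0) -> str:
    """Apply phonetic substitutions, then randomly drop interior characters."""
    rng = random.Random(seed)
    words = []
    for word in text.lower().split():
        for src, dst in PHONETIC_SWAPS.items():
            word = word.replace(src, dst)
        # Keep first/last characters; drop interior ones with prob. drop_prob
        word = "".join(c for i, c in enumerate(word)
                       if i in (0, len(word) - 1) or rng.random() >= drop_prob)
        words.append(word)
    return " ".join(words)

print(brutal_noise("What are the symptoms of virus", drop_prob=0.0))
# → "wat r de symptoms of birus"
```

Pairing each clean query with its noised version yields the (noisy input, clean target) supervision pairs the adapter is trained on.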
## 🚀 How to Use
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Load the base model in fp16; device_map="auto" places it on GPU or CPU
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# 2. Load this adapter on top of the base model
model = PeftModel.from_pretrained(model, "YOUR_HF_USERNAME/Qwen2.5-1.5B-Medical-Denoise-Adapter")

# 3. Inference
noisy_query = "wat r da simptoms of birus"
prompt = f"User: Fix this medical query: {noisy_query}\nAssistant:"
# Use model.device instead of hard-coding "cuda" so this also runs on CPU
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
**Output:** "What are the symptoms of virus"
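Note that `tokenizer.decode` returns the prompt echo followed by the completion, so the correction itself has to be split out. A small helper for that (the name `extract_correction` is my own, not part of this repo):

```python
def extract_correction(decoded: str) -> str:
    """Return only the model's answer: the text after the final 'Assistant:'."""
    return decoded.rsplit("Assistant:", 1)[-1].strip()

decoded = ("User: Fix this medical query: wat r da simptoms of birus\n"
           "Assistant: What are the symptoms of virus")
print(extract_correction(decoded))
# → "What are the symptoms of virus"
```

Splitting on the final `Assistant:` (via `rsplit`) keeps this robust even if the user's query happens to contain that word.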