PM234/DeepSeek-R1-MedExpert-LoRA-8B-bnb4bit


PM234 committed on Mar 10, 2025

Commit d678d93 · verified · 1 Parent(s): 619dfdc

Update code example in Readme

Files changed (1)

README.md +14 -19


@@ -12,13 +12,14 @@ tags:
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
-This is a LoRA adapter-based fine-tuned version of DeepSeek-R1-Distill-Llama-8B, optimized for Medical Question Answering (MedQA) using PEFT, LoRA adapters, and bnb-4bit quantization. The fine-tuning was performed on a curated dataset containing medical questions and answers from trusted sources.
 ## Model Details
 ### Model Description
 <!-- Provide a longer summary of what this model is. -->
@@ -77,27 +78,21 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 Use the code below to get started with the model.
-from transformers import AutoModelForCausalLM, AutoTokenizer
-from peft import PeftModel
-base_model = "unsloth/DeepSeek-R1-Distill-Llama-8B-bnb-4bit"
-adapter_model = "PM234/DeepSeek-R1-MedExpert-LoRA-8B-bnb4bit"
-#### Load tokenizer from the base model
-tokenizer = AutoTokenizer.from_pretrained(base_model)
-#### Load base model
-model = AutoModelForCausalLM.from_pretrained(
-    base_model,
-    torch_dtype="auto",
-    device_map="auto"
-)
-#### Load LoRA adapter on top of the base model
-model = PeftModel.from_pretrained(model, adapter_model)
 [More Information Needed]

 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
+This is a LoRA adapter-based fine-tuned version of DeepSeek-R1-Distill-Llama-8B, optimized for Medical Question Answering (MedQA) using PEFT, LoRA adapters, and bnb-4bit quantization. The fine-tuning was performed on a curated dataset of 10k examples containing medical questions and answers from trusted sources.
 ## Model Details
 ### Model Description
 <!-- Provide a longer summary of what this model is. -->
 Use the code below to get started with the model.
+```python
+from unsloth import FastLanguageModel
+# Load model + adapters directly
+model, tokenizer = FastLanguageModel.from_pretrained("PM234/DeepSeek-R1-MedExpert-LoRA-8B-bnb4bit")
+# Prep for inference
+FastLanguageModel.for_inference(model)
+# Example:
+test_input = "Below is an instruction...\n### Instruction: Answer the following medical question.\n### Input: What is the primary source of energy for the human body?\n### Response:"
+inputs = tokenizer(test_input, return_tensors="pt").to("cuda")
+outputs = model.generate(**inputs, max_new_tokens=20)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # prints the prompt followed by the generated answer, e.g. "Glucose"
+```
 [More Information Needed]
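
Review note: the new example hard-codes the prompt as one long string, and `tokenizer.decode(outputs[0], ...)` returns the prompt together with the generated text. Below is a minimal sketch of two helpers that could sit alongside the README example. `build_prompt` and `extract_response` are hypothetical names, not part of the repository, Unsloth, or Transformers. The preamble is kept exactly as it is elided in the README (`"Below is an instruction..."`), since the full wording is not shown in this commit.

```python
# Hypothetical helpers (not part of the model repo or any library).
# They reproduce the prompt layout used in the README example and
# separate the generated answer from the echoed prompt.

RESPONSE_MARKER = "### Response:"


def build_prompt(
    question: str,
    preamble: str = "Below is an instruction...",  # elided in the README; kept as-is
    instruction: str = "Answer the following medical question.",
) -> str:
    """Assemble a prompt in the same layout as the README's test_input."""
    return (
        f"{preamble}\n"
        f"### Instruction: {instruction}\n"
        f"### Input: {question}\n"
        f"{RESPONSE_MARKER}"
    )


def extract_response(decoded: str) -> str:
    """Return the text after the last response marker.

    If the marker is missing, the whole decoded string is returned (stripped).
    """
    _, _, tail = decoded.rpartition(RESPONSE_MARKER)
    return tail.strip()
```

With these, the README example could pass `build_prompt("What is the primary source of energy for the human body?")` to the tokenizer and print `extract_response(tokenizer.decode(outputs[0], skip_special_tokens=True))` so that only the answer is shown.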