che111 committed f7785a6 (verified) · Parent(s): 2de3b84

Update README.md

Files changed (1): README.md (+39 −3)
---
license: mit
---

# 🧠 AlphaMed

This is the official model checkpoint for the paper:
**[AlphaMed: Incentivizing Medical Reasoning with Reinforcement Learning Only](https://www.arxiv.org/abs/2505.17952)**

AlphaMed is a medical large language model trained **without supervised fine-tuning or chain-of-thought (CoT) data**, relying solely on reinforcement learning to elicit step-by-step reasoning in complex medical tasks.

## 🚀 Usage

To use the model, format your input prompt as:

> **Question:** [your medical question here]
> **Please reason step by step, and put the final answer in \boxed{}**

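The prompt template above can be built programmatically. A minimal helper, assuming the two-line format shown (the function name `build_prompt` is illustrative, not part of the model card):

```python
def build_prompt(question: str) -> str:
    # Follows the README's template: the question line, then the
    # fixed step-by-step instruction with a \boxed{} answer slot.
    return (
        f"Question: {question}\n"
        "Please reason step by step, and put the final answer in \\boxed{}"
    )

print(build_prompt("What is the first-line treatment for anaphylaxis?"))
```

Keeping the instruction line byte-identical to the training-time template tends to matter for models trained with a fixed prompt format.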
### 🔬 Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load model and tokenizer
model_id = "your-hf-username/med-r1-zero"  # Replace with actual repo path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Format the question using the prompt template above
prompt = (
    "Question: A 45-year-old patient presents with chest pain radiating to the left arm "
    "and elevated troponin levels. What is the most likely diagnosis?\n"
    "Please reason step by step, and put the final answer in \\boxed{}"
)

# Generate deterministically (greedy decoding)
output = pipe(prompt, max_new_tokens=256, do_sample=False)[0]["generated_text"]
print(output)
```
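Since the model is prompted to place its final answer in `\boxed{}`, the answer can be pulled out of the generated text with a small regex helper. A sketch (the helper name and the sample string are illustrative, not real model output):

```python
import re


def extract_boxed_answer(text: str):
    """Return the content of the last \\boxed{...} in the text, or None.

    Handles one level of nested braces (e.g. \\boxed{\\frac{1}{2}}).
    """
    matches = re.findall(r"\\boxed\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}", text)
    return matches[-1] if matches else None


# Illustrative generated text (not actual model output)
sample = (
    "The presentation suggests myocardial injury. "
    "Final answer: \\boxed{Acute myocardial infarction}"
)
print(extract_boxed_answer(sample))  # Acute myocardial infarction
```

Taking the *last* match is deliberate: step-by-step generations sometimes mention `\boxed{}` mid-reasoning before committing to a final answer.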