Biomni-R0-32B LoRA Adapter (Method A, Rank 256)

This is a LoRA adapter extracted from the Biomni-R0-32B model using the original Qwen3-32B as the base model.

Adapter Details

| Parameter | Value |
|---|---|
| Method | Method A - Direct LoRA extraction |
| Base Model | Qwen/Qwen3-32B |
| Fine-tuned Model | biomni/Biomni-R0-32B-Preview |
| Rank (r) | 256 |
| Alpha | 256 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Extraction Tool | MergeKit (mergekit-extract-lora) |
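With rank r = 256 and alpha = 256, the LoRA scaling factor alpha/r is 1.0, so the extracted update is applied at full strength. A minimal NumPy sketch of the standard LoRA formulation (toy dimensions for illustration, not the model's real projection sizes):

```python
import numpy as np

# Standard LoRA update: W_eff = W + (alpha / r) * (B @ A)
# Toy dimensions for illustration only; the real projections are far larger.
d_out, d_in, r, alpha = 64, 32, 8, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))  # frozen base weight
A = rng.standard_normal((r, d_in))      # low-rank factor A
B = rng.standard_normal((d_out, r))     # low-rank factor B

scaling = alpha / r                      # 256 / 256 = 1.0 for this adapter
W_eff = W + scaling * (B @ A)

# The update B @ A has rank at most r
assert np.linalg.matrix_rank(B @ A) <= r
```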

Usage with PEFT

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-32B",
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "hassanshka/Biomni-R0-32B-LoRA-Rank256")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")

# Inference
messages = [{"role": "user", "content": "Your biomedical question here"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Usage with vLLM

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="Qwen/Qwen3-32B",
    enable_lora=True,
    max_lora_rank=256
)

prompts = ["Your biomedical question here"]
sampling_params = SamplingParams(max_tokens=512)

# Load LoRA at runtime; depending on your vLLM version, the adapter path may
# need to be a local directory (e.g. downloaded via huggingface_hub first)
output = llm.generate(
    prompts,
    sampling_params,
    lora_request=LoRARequest("biomni", 1, "hassanshka/Biomni-R0-32B-LoRA-Rank256")
)

Extraction Process

The LoRA was extracted using MergeKit:

mergekit-extract-lora \
    --model "biomni/Biomni-R0-32B-Preview" \
    --base-model "Qwen/Qwen3-32B" \
    --out-path "./lora_output" \
    --max-rank 256 \
    --device cuda
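Conceptually, LoRA extraction approximates the weight difference between the fine-tuned and base models with a truncated SVD per layer, keeping only the top singular directions up to the chosen rank. A toy NumPy illustration of that idea (a simplification for intuition, not MergeKit's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, max_rank = 64, 32, 8

# Pretend these are one layer's base and fine-tuned weights,
# where fine-tuning added a genuinely low-rank change (rank 2 here)
W_base = rng.standard_normal((d_out, d_in))
W_ft = W_base + rng.standard_normal((d_out, 2)) @ rng.standard_normal((2, d_in))

# Truncated SVD of the weight delta yields the low-rank factors
delta = W_ft - W_base
U, S, Vt = np.linalg.svd(delta, full_matrices=False)
B = U[:, :max_rank] * S[:max_rank]   # shape (d_out, max_rank)
A = Vt[:max_rank]                    # shape (max_rank, d_in)

# When max_rank >= the delta's true rank, reconstruction is near-exact
assert np.allclose(B @ A, delta, atol=1e-6)
```

In practice the true fine-tuning delta is usually not exactly low-rank, so `--max-rank` trades adapter size against approximation fidelity.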

License

Apache 2.0

Citation

If you use this adapter, please cite both the original Qwen3 and Biomni models.
