# Biomni-R0-32B LoRA Adapter (Method A, Rank 256)

This is a LoRA adapter extracted from the Biomni-R0-32B model using the original Qwen3-32B as the base model.

## Adapter Details

| Parameter | Value |
|---|---|
| Method | Method A (direct LoRA extraction) |
| Base Model | Qwen/Qwen3-32B |
| Fine-tuned Model | biomni/Biomni-R0-32B-Preview |
| Rank (r) | 256 |
| Alpha | 256 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Extraction Tool | MergeKit (mergekit-extract-lora) |
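With alpha equal to rank, the LoRA scaling factor (alpha / r) is 1.0. The adapter's rough size can be estimated from the rank and the adapted layer shapes. The dimensions below are assumptions for Qwen3-32B (hidden size 5120, intermediate size 25600, 64 layers, 64 query / 8 KV heads, head_dim 128); check the base model's `config.json` for the authoritative values:

```python
# Assumed Qwen3-32B dimensions -- verify against config.json.
HIDDEN = 5120
INTERMEDIATE = 25600
LAYERS = 64
HEAD_DIM = 128
Q_HEADS, KV_HEADS = 64, 8

def lora_params(rank: int) -> int:
    """Total adapter parameters: each adapted linear of shape (out, in)
    gains rank * (in + out) parameters (A is rank x in, B is out x rank)."""
    shapes = [
        (HIDDEN, Q_HEADS * HEAD_DIM),    # q_proj
        (HIDDEN, KV_HEADS * HEAD_DIM),   # k_proj
        (HIDDEN, KV_HEADS * HEAD_DIM),   # v_proj
        (Q_HEADS * HEAD_DIM, HIDDEN),    # o_proj
        (HIDDEN, INTERMEDIATE),          # gate_proj
        (HIDDEN, INTERMEDIATE),          # up_proj
        (INTERMEDIATE, HIDDEN),          # down_proj
    ]
    return LAYERS * sum(rank * (i + o) for i, o in shapes)

total = lora_params(256)
print(f"~{total / 1e9:.2f}B adapter parameters")  # ~2.15B at these dims
```

At rank 256 this is roughly 2.15B adapter parameters (about 4.3 GB in bf16), which is why the vLLM example below raises `max_lora_rank` to 256.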

## Usage with PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-32B",
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True
)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, "hassanshka/Biomni-R0-32B-LoRA-Rank256")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")

# Inference
messages = [{"role": "user", "content": "Your biomedical question here"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

## Usage with vLLM

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="Qwen/Qwen3-32B",
    enable_lora=True,
    max_lora_rank=256
)

prompts = ["Your biomedical question here"]
sampling_params = SamplingParams(max_tokens=512)

# Apply the LoRA adapter at request time
outputs = llm.generate(
    prompts,
    sampling_params,
    lora_request=LoRARequest("biomni", 1, "hassanshka/Biomni-R0-32B-LoRA-Rank256")
)
```
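The adapter can also be served behind vLLM's OpenAI-compatible server. A sketch of the launch command (flag names as in recent vLLM releases; verify against your installed version):

```shell
vllm serve Qwen/Qwen3-32B \
    --enable-lora \
    --max-lora-rank 256 \
    --lora-modules biomni=hassanshka/Biomni-R0-32B-LoRA-Rank256
```

Clients then select the adapter by sending `"model": "biomni"` in their completion requests; sending the base model name serves the unadapted weights.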

## Extraction Process

The LoRA was extracted using MergeKit:

```shell
mergekit-extract-lora \
    --model "biomni/Biomni-R0-32B-Preview" \
    --base-model "Qwen/Qwen3-32B" \
    --out-path "./lora_output" \
    --max-rank 256 \
    --device cuda
```
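After extraction, the adapter's hyperparameters can be sanity-checked against the table above by reading the `adapter_config.json` that PEFT-compatible tools write to the output directory (`r`, `lora_alpha`, and `target_modules` are standard PEFT config keys). A minimal sketch, shown here against a locally written stand-in config so it runs without the real adapter:

```python
import json
from pathlib import Path

EXPECTED_MODULES = {
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
}

def check_adapter_config(path: str) -> dict:
    """Verify the extracted adapter matches the intended rank and targets."""
    cfg = json.loads(Path(path).read_text())
    assert cfg["r"] == 256, f"unexpected rank: {cfg['r']}"
    assert cfg["lora_alpha"] == 256, f"unexpected alpha: {cfg['lora_alpha']}"
    assert set(cfg["target_modules"]) == EXPECTED_MODULES
    return cfg

# Stand-in config for demonstration; point at ./lora_output/adapter_config.json
# to check a real extraction.
sample = {"r": 256, "lora_alpha": 256, "target_modules": sorted(EXPECTED_MODULES)}
Path("adapter_config.json").write_text(json.dumps(sample))
print(check_adapter_config("adapter_config.json")["r"])  # 256
```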

## License

Apache 2.0

## Citation

If you use this adapter, please cite both the original Qwen3 and Biomni models.
