---
license: apache-2.0
base_model: arcee-ai/Trinity-Mini
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- peft
- grpo
- reinforcement-learning
- biomedical
- relation-extraction
- drug-protein
- moe
language:
- en
---

Trinity-Mini-DrugProt-Think
RLVR (GRPO) + LoRA post-training on Arcee Trinity Mini for DrugProt relation classification.


# Trinity-Mini-DrugProt-Think

A LoRA adapter fine-tuned on [Arcee Trinity Mini](https://huggingface.co/arcee-ai/Trinity-Mini) using GRPO (Group Relative Policy Optimization) for **drug-protein relation extraction** on the [DrugProt (BioCreative VII)](https://huggingface.co/datasets/OpenMed/drugprot-parquet) benchmark. The model classifies 13 types of drug-protein interactions from PubMed abstracts and produces a structured pharmacological reasoning trace before giving its answer.

## Model Details

| Property | Value |
|---|---|
| Base Model | [arcee-ai/Trinity-Mini](https://huggingface.co/arcee-ai/Trinity-Mini) |
| Architecture | Sparse MoE (26B total / 3B active) |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| Training Method | GRPO (Reinforcement Learning) |
| Training Data | [maziyar/OpenMed_DrugProt](https://huggingface.co/datasets/OpenMed/drugprot-parquet) |
| Task | Drug-protein relation extraction (13-way classification) |
| Trainable Parameters | LoRA rank=16, all projection layers |
| License | Apache 2.0 |

## Training Configuration

| Parameter | Value |
|---|---|
| LoRA Alpha (α) | 64 |
| LoRA Rank | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj + experts |
| Learning Rate | 3e-6 |
| Batch Size | 128 |
| Rollouts per Example | 8 |
| Max Generation Tokens | 2048 |
| Temperature | 0.7 |

## Quick Start

**Installation**

```bash
pip install transformers peft torch accelerate
```

**Usage**

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base_model_id = "arcee-ai/Trinity-Mini"
adapter_id = "lokahq/Trinity-Mini-DrugProt-Think"

# Load the base model first, then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)

messages = [
    {
        "role": "system",
        "content": (
            "You are an expert biomedical relation extraction assistant. "
            "Your task is to identify the type of interaction between a drug/chemical and a gene/protein in biomedical text.\n\n"
            "For each question:\n"
            "1. First, wrap your detailed biomedical reasoning inside <think></think> tags\n"
            "2. Analyze the context around both entities to understand their relationship\n"
            "3. Consider the pharmacological and molecular mechanisms involved\n"
            "4. Then provide your final answer inside \\boxed{} using exactly one letter (A-M)\n\n"
            "The 13 DrugProt relation types are:\n"
            "A. INDIRECT-DOWNREGULATOR - Chemical indirectly decreases protein activity/expression\n"
            "B. INDIRECT-UPREGULATOR - Chemical indirectly increases protein activity/expression\n"
            "C. DIRECT-REGULATOR - Chemical directly regulates protein (mechanism unspecified)\n"
            "D. ACTIVATOR - Chemical activates the protein\n"
            "E. INHIBITOR - Chemical inhibits the protein\n"
            "F. AGONIST - Chemical acts as an agonist of the receptor/protein\n"
            "G. AGONIST-ACTIVATOR - Chemical is both agonist and activator\n"
            "H. AGONIST-INHIBITOR - Chemical is agonist but inhibits downstream effects\n"
            "I. ANTAGONIST - Chemical acts as an antagonist of the receptor/protein\n"
            "J. PRODUCT-OF - Chemical is a product of the enzyme\n"
            "K. SUBSTRATE - Chemical is a substrate of the enzyme\n"
            "L. SUBSTRATE_PRODUCT-OF - Chemical is both substrate and product\n"
            "M. PART-OF - Chemical is part of the protein complex\n\n"
            "Example format:\n"
            "<think>\n"
            "The text describes [chemical] and [protein]. Based on the context...\n"
            "- The phrase \"[relevant text]\" indicates that...\n"
            "- This suggests a [type] relationship because...\n"
            "</think>\n"
            "\\boxed{A}"
        ),
    },
    {
        "role": "user",
        "content": (
            "Abstract: [PASTE PUBMED ABSTRACT HERE]\n\n"
            "Chemical entity: [DRUG NAME]\n"
            "Protein entity: [PROTEIN NAME]\n\n"
            "What is the relationship between the chemical and protein entities? "
            "Choose from: A) INHIBITOR B) SUBSTRATE C) INDIRECT-DOWNREGULATOR "
            "D) INDIRECT-UPREGULATOR E) AGONIST F) ANTAGONIST G) ACTIVATOR "
            "H) PRODUCT-OF I) AGONIST-ACTIVATOR J) INDIRECT-UPREGULATOR "
            "K) PART-OF L) SUBSTRATE_PRODUCT-OF M) NOT\n\n"
            "Think step by step, then provide your answer in \\boxed{} format."
        ),
    },
]

# Render the chat template to a prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is required for temperature and top_p to take effect.
outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.7,
    top_p=0.75,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Progress

Training ran for ~100 steps on Prime Intellect infrastructure. The best accuracy reward observed during training was ~0.83.

## Limitations

- This is a LoRA adapter and requires the base model ([arcee-ai/Trinity-Mini](https://huggingface.co/arcee-ai/Trinity-Mini)) to run
- Evaluated on training-split held-out data; not yet benchmarked on the official DrugProt test set
- Optimized specifically for 13-way DrugProt classification; may not generalize to other biomedical RE tasks

## Citation

```bibtex
@misc{jakimovski2026drugprotrl,
  title        = {Post-Training an Open MoE Model to Extract Drug-Protein Relations: Trinity-Mini-DrugProt-Think},
  author       = {Jakimovski, Bojan and Kalinovski, Petar},
  year         = {2026},
  month        = feb,
  howpublished = {Blog post},
  url          = {https://github.com/LokaHQ/Trinity-Mini-DrugProt-Think}
}
```

## Acknowledgements

- [Arcee AI](https://www.arcee.ai/) for the Trinity Mini base model
- [Prime Intellect](https://www.primeintellect.ai/) for training infrastructure
- [maziyar](https://huggingface.co/maziyar) for the OpenMed DrugProt RL environment
- [Hugging Face](https://huggingface.co/) for the PEFT library

## Authors

[Bojan Jakimovski](mailto:bojan.jakimovski@loka.com) · [Petar Kalinovski](mailto:petar.kalinovski@loka.com) · [Loka](https://loka.com)
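
## Parsing the Answer

The Quick Start prints the full decoded sequence, prompt included, so the final letter still has to be pulled out of the text. The snippet below is a minimal sketch of one way to do that, not part of the released code: `RELATION_LABELS` and `extract_answer` are hypothetical names, and the letter-to-label mapping is copied from the system prompt in the Quick Start. It takes the last `\boxed{}` occurrence because the prompt itself contains an example `\boxed{A}`.

```python
import re
from typing import Optional

# Assumption: lettering follows the system prompt shown in the Quick Start.
RELATION_LABELS = {
    "A": "INDIRECT-DOWNREGULATOR",
    "B": "INDIRECT-UPREGULATOR",
    "C": "DIRECT-REGULATOR",
    "D": "ACTIVATOR",
    "E": "INHIBITOR",
    "F": "AGONIST",
    "G": "AGONIST-ACTIVATOR",
    "H": "AGONIST-INHIBITOR",
    "I": "ANTAGONIST",
    "J": "PRODUCT-OF",
    "K": "SUBSTRATE",
    "L": "SUBSTRATE_PRODUCT-OF",
    "M": "PART-OF",
}

# Matches a literal \boxed{X} where X is a single letter from A to M.
_BOXED = re.compile(r"\\boxed\{\s*([A-M])\s*\}")


def extract_answer(generated_text: str) -> Optional[str]:
    """Return the letter in the last boxed answer, or None if there is none."""
    matches = _BOXED.findall(generated_text)
    # The last match is the model's answer; earlier ones may come from the prompt.
    return matches[-1] if matches else None
```

A typical call would be `letter = extract_answer(decoded_text)` followed by `RELATION_LABELS.get(letter)`. If the user prompt presents the options under a different lettering than the system prompt, the mapping should be adjusted to match whichever lettering the model is asked to answer with.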