Model Card for ProtGPT3-112M-dpo

Model Details

Model Description

ProtGPT3-112M-dpo is a DPO-aligned single-sequence autoregressive protein language model for protein sequence generation. It is part of the ProtGPT3 family, an open-source suite of promptable and aligned protein language models for protein design.

The base ProtGPT3-112M model is a causal decoder-only protein language model using a Mixtral-style sparse Mixture-of-Experts architecture. It was trained for causal language modeling on protein sequences and supports generation in both N-to-C and C-to-N directions using special directional tokens.

This checkpoint was further aligned with Direct Preference Optimization (DPO) to improve generation quality. The alignment procedure shifts the model toward protein sequences with higher predicted structural confidence and reduced low-complexity content, while preserving sequence diversity.

  • Developed by: Anonymous authors
  • Model type: DPO-aligned autoregressive protein language model; causal decoder-only Mixture-of-Experts model
  • Language(s): Protein sequences / amino-acid sequences
  • License: More Information Needed
  • Finetuned from model: protgpt3/ProtGPT3-112M

Model Sources

  • Repository: More Information Needed
  • Paper: More Information Needed

Uses

Direct Use

ProtGPT3-112M-dpo can be used for single-sequence autoregressive protein generation. Users can generate protein sequences unconditionally or condition generation on an amino-acid prefix.

Compared with the base ProtGPT3-112M checkpoint, this DPO-aligned model is intended for users who want generations biased toward higher-complexity sequences with improved predicted structural confidence.

Downstream Use

The model may be used in protein design workflows, computational screening pipelines, protein variant generation, and candidate sequence proposal. Generated sequences can be further evaluated with structure prediction, sequence-complexity filters, solubility filters, fitness predictors, or experimental validation.
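
As a concrete example of a sequence-complexity filter, the sketch below estimates the fraction of residues that fall inside low-entropy windows. This is a minimal stand-in for dedicated LCR tools such as SEG, not the ProtGPT3 pipeline's own measure; the window size and entropy cutoff are illustrative assumptions.

import math
from collections import Counter

def low_complexity_fraction(seq: str, window: int = 20, entropy_cutoff: float = 2.2) -> float:
    # Flag residues inside sliding windows whose Shannon entropy falls below
    # the cutoff, then return the flagged fraction of the sequence.
    if len(seq) < window:
        return 0.0
    flagged = [False] * len(seq)
    for i in range(len(seq) - window + 1):
        counts = Counter(seq[i : i + window])
        entropy = -sum((c / window) * math.log2(c / window) for c in counts.values())
        if entropy < entropy_cutoff:
            flagged[i : i + window] = [True] * window
    return sum(flagged) / len(seq)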

Out-of-Scope Use

The model should not be used as the sole basis for experimental, clinical, environmental, or safety-critical decisions. Generated sequences require downstream computational and experimental validation. The model is not guaranteed to generate functional, soluble, safe, synthesizable, or experimentally successful proteins.

The model should not be used for irresponsible or harmful biological design applications.

Bias, Risks, and Limitations

ProtGPT3-112M-dpo learns from public protein sequence datasets and may reproduce biases present in those datasets. Although DPO alignment reduces low-complexity generations and improves generation quality according to the alignment objectives (predicted pLDDT and low-complexity-region content, combined into a binary pass/fail criterion; see the main manuscript), generated sequences may still be nonfunctional, unstable, insoluble, repetitive, biologically implausible, or unsuitable for a user’s intended application.

The DPO alignment objective uses predicted structural confidence and low-complexity filtering as proxy objectives. These proxies do not guarantee biological function, experimental success, safety, solubility, or manufacturability.

As with other generative protein models, ProtGPT3-112M-dpo may present dual-use risks if applied irresponsibly.

Recommendations

Users should validate generated sequences with appropriate downstream computational and experimental methods. Recommended checks include sequence-complexity filtering, structure prediction, predicted confidence scoring, similarity searches against known proteins, solubility assessment, and task-specific functional evaluation.
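
A cheap pre-screen can run before the heavier checks above. The sketch below validates the amino-acid alphabet, bounds the length, and rejects long homopolymer runs; all thresholds are illustrative assumptions, not values from the paper.

import re

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")

def passes_basic_screen(seq: str, min_len: int = 50, max_len: int = 1024, max_run: int = 8) -> bool:
    # Length bounds and alphabet check, then a crude repetitiveness test
    # that rejects any residue repeated max_run or more times in a row.
    if not (min_len <= len(seq) <= max_len):
        return False
    if set(seq) - STANDARD_AA:
        return False
    return re.search(r"(.)\1{%d,}" % (max_run - 1), seq) is None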

How to Get Started with the Model

Install dependencies:

pip install transformers accelerate torch

Load the model and tokenizer:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "protgpt3/ProtGPT3-112M-dpo"

# Load tokenizer for generation
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
    add_bos_token=True,    # prepend BOS so generation starts from a clean context
    add_eos_token=False,
    padding_side="left",   # left padding for batched decoder-only generation
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

model.eval()

Generate a protein sequence

import torch

prompt = ""  # Optionally provide an amino-acid prefix or model-specific direction

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        inputs["input_ids"],
        max_new_tokens=512,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,  # fall back to EOS for padding
    )

sequence = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(sequence)  # output begins with directional token "1" (N-to-C) or "2" (C-to-N)
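
Because the decoded output keeps the leading directional token, a small post-processing step can normalize generations to conventional N-to-C orientation. This is a minimal sketch; reversing C-to-N outputs assumes the model emits those residues in reverse order, which the card implies but does not state explicitly.

def to_n_to_c(decoded: str) -> str:
    # Strip the directional token; for C-to-N generations ("2"), reverse the
    # residues so the returned string reads N-terminus to C-terminus.
    if decoded.startswith("1"):
        return decoded[1:]
    if decoded.startswith("2"):
        return decoded[1:][::-1]
    return decoded

print(to_n_to_c(sequence))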

Generate from an amino-acid prefix

import torch

# forward N-to-C generation with directional token "1";
# use "2" instead of "1" for reverse C-to-N generation
prefix = "1MKT"

inputs = tokenizer(prefix, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        inputs["input_ids"],
        max_new_tokens=256,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
    )

sequence = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(sequence)

Batch generation

import torch

prompts = [
    "",
    "1MKT", # N-to-C generation
    "2MAV", # C-to-N generation
]

# ensure a pad token exists for batching; padding with BOS is an assumption,
# since the card does not document a dedicated pad token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.bos_token

inputs = tokenizer(
    prompts,
    return_tensors="pt",
    padding=True,
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],  # mask out left padding
        max_new_tokens=256,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )

sequences = tokenizer.batch_decode(output_ids, skip_special_tokens=True)

for sequence in sequences:
    print(sequence)

Notes on generation

  • Use this checkpoint for single-sequence protein generation.
  • Sampling parameters such as temperature and top_p can strongly affect sequence quality and diversity; a small sweep sketch follows this list.
  • Lower temperatures may produce more conservative sequences.
  • Higher temperatures may increase diversity but can also increase failure modes.
  • Generated sequences should be validated before experimental use.
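
The sketch below illustrates a small sampling sweep, reusing the model and tokenizer loaded earlier; the grid values are arbitrary starting points, not recommended settings.

import torch

# hypothetical sweep over sampling settings to probe the quality/diversity trade-off
for temperature in (0.6, 0.8, 1.0):
    for top_p in (0.9, 0.95):
        inputs = tokenizer("1", return_tensors="pt").to(model.device)  # N-to-C generation
        with torch.no_grad():
            out = model.generate(
                inputs["input_ids"],
                max_new_tokens=256,
                do_sample=True,
                temperature=temperature,
                top_p=top_p,
                eos_token_id=tokenizer.eos_token_id,
                pad_token_id=tokenizer.eos_token_id,
            )
        seq = tokenizer.decode(out[0], skip_special_tokens=True)
        print(f"T={temperature}, top_p={top_p}, length={len(seq)}")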

Training Details

Training Data

The base ProtGPT3-112M model was trained on publicly available protein sequence data from UniRef90 and the GigaRef subset of the Dayhoff Atlas. The 112M-parameter model used approximately 64M UniRef90 sequences and 120M GigaRef sequences, corresponding to approximately 43B training tokens.

The DPO alignment dataset was constructed from model-generated sequences. Sequences were scored using predicted structural confidence and low-complexity-region content. Sequences with pLDDT greater than 0.7 and fewer than 25% low-complexity residues were treated as positive examples, while the remaining generations were treated as negative examples.
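
In code, the stated pass/fail rule reduces to a simple predicate; the scores themselves come from external tools (a structure predictor for pLDDT, an LCR measure), which are assumed here.

def label_generation(mean_plddt: float, lcr_fraction: float) -> str:
    # Positive: pLDDT > 0.7 and fewer than 25% low-complexity residues;
    # everything else is a negative example (binary criterion from the card).
    return "positive" if mean_plddt > 0.7 and lcr_fraction < 0.25 else "negative"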

Training Procedure

Preprocessing

For base-model pretraining, protein sequences were sampled from UniRef90 and GigaRef. During training, each sequence was assigned a generation direction, either N-to-C or C-to-N, with a special token prepended to indicate the direction.
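
A minimal sketch of how such directional training examples might be constructed is shown below; reversing the residues for the C-to-N direction is an assumption, since the card specifies only the prepended token.

import random

def make_directional_example(seq: str) -> str:
    # Randomly assign a direction: "1" keeps natural N-to-C order, "2" marks
    # C-to-N and (by assumption) reverses the residue order.
    if random.random() < 0.5:
        return "1" + seq
    return "2" + seq[::-1]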

For DPO alignment, generated sequences were classified as pass or fail according to predicted pLDDT and low-complexity-region thresholds. Pass and fail sequences were clustered separately at 50% sequence identity and 0.8 coverage. Preference pairs were constructed by pairing positive and negative examples with matched sequence lengths, helping prevent the model from learning sequence length as a shortcut.
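
The sketch below shows one way to build such length-matched preference pairs; the clustering step is omitted, and the matching tolerance is a hypothetical parameter.

import random
from collections import defaultdict

def make_length_matched_pairs(positives, negatives, tol: int = 0):
    # Index negatives by length, then pair each positive with a negative of
    # (near-)equal length so length cannot act as a preference shortcut.
    by_len = defaultdict(list)
    for neg in negatives:
        by_len[len(neg)].append(neg)
    pairs = []
    for pos in positives:
        candidates = [n for d in range(-tol, tol + 1) for n in by_len[len(pos) + d]]
        if candidates:
            pairs.append({"prompt": "", "chosen": pos, "rejected": random.choice(candidates)})
    return pairs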

Training Hyperparameters

Base ProtGPT3-112M pretraining:

  • Training regime: bfloat16
  • Architecture: Mixtral-style sparse Mixture-of-Experts causal decoder
  • Maximum sequence length: 1024
  • Optimizer: AdamW
  • Learning rate: 2.5e-4
  • Optimizer betas: β1 = 0.9, β2 = 0.999
  • Weight decay: 0.1
  • Gradient clipping: 1.0
  • Gradient accumulation steps: 4
  • Batch size: 100
  • Router auxiliary loss coefficient: 0.05
  • Number of training GPUs: 16

DPO alignment (a minimal training sketch follows this list):

  • Alignment method: Direct Preference Optimization
  • Positive-example criterion: pLDDT > 0.7 and low-complexity regions < 25%
  • Negative-example criterion: all other generated sequences
  • Pairing strategy: length-matched positive and negative sequence pairs
  • Preference-data clustering: 50% sequence identity, 0.8 coverage
  • Alignment objective: shift the model toward higher-complexity, higher-pLDDT generations
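
The card does not name the alignment training stack. The sketch below shows one plausible setup using the trl library's DPOTrainer with preference pairs in the prompt/chosen/rejected format; the beta and batch-size values are illustrative, not the paper's.

from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "protgpt3/ProtGPT3-112M"  # base checkpoint to align
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

# toy preference pairs; in practice, use the length-matched pairs built above
pairs = [{"prompt": "", "chosen": "1MKTAYIAKQR", "rejected": "1MKTAAAAAAA"}]
train_dataset = Dataset.from_list(pairs)

config = DPOConfig(
    output_dir="protgpt3-112m-dpo",
    beta=0.1,  # illustrative; the paper's beta is not stated in this card
    per_device_train_batch_size=8,
    bf16=True,
)
# trl >= 0.12 uses processing_class; older versions take tokenizer= instead
trainer = DPOTrainer(model=model, args=config, train_dataset=train_dataset, processing_class=tokenizer)
trainer.train()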

Speeds, Sizes, Times

  • Model size: 112M parameters
  • Base-model training tokens: Approximately 43B
  • Hardware: NVIDIA H100 GPUs

Evaluation

Testing Data, Factors & Metrics

Testing Data

ProtGPT3 models were evaluated on held-out protein sequences with at most 50% sequence identity to the training set. The model family was also benchmarked on ProteinGym and assessed for generation quality across sampling settings.

The DPO-aligned models were evaluated on generated sequences and on naturally occurring protein sequences from PDB-derived data to assess whether the alignment objective generalized beyond the model-generated preference data.

Factors

Evaluation considered model scale, sampling temperature, nucleus sampling parameter top_p, sequence direction, predicted structure confidence, low-complexity-region content, and sequence diversity.

Metrics

Evaluation included:

  • Validation perplexity
  • ProteinGym Spearman correlation
  • Predicted pLDDT
  • Fraction of low-complexity generations
  • Sequence diversity
  • Fraction of sequences passing the pLDDT and low-complexity filters
  • Intrinsic reward discrimination between high-quality and low-quality natural sequences

Results

DPO alignment improved generation quality across the ProtGPT3 single-sequence model family. Alignment reduced the fraction of low-complexity generations while preserving high predicted structural confidence and sequence diversity.

For the 112M-scale model, DPO alignment increased the pass rate of generated sequences under the pLDDT and low-complexity criteria. The paper reports that alignment reduced low-complexity generations by more than 20% for the 112M and 1B-scale models, while preserving diversity and causing little change in held-out pretraining perplexity.

Summary

ProtGPT3-112M-dpo is the DPO-aligned version of ProtGPT3-112M. It is intended for users who want a single-sequence protein generator biased toward higher-complexity and higher-predicted-confidence generations compared with the base checkpoint.

Model Examination

ProtGPT3-112M-dpo was examined as part of the ProtGPT3 alignment study. The DPO alignment pipeline was designed to reduce repetitive or low-complexity protein generations while maintaining diversity and preserving base-model knowledge.

The aligned models were also examined using an intrinsic reward discrimination analysis on real protein sequences, where aligned models assigned systematically higher intrinsic rewards to high-quality sequences than to low-quality sequences.
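
A sketch of this intrinsic-reward computation follows. Under DPO, the implicit reward of a sequence x is beta * (log pi_aligned(x) - log pi_ref(x)); the beta value and example sequence below are illustrative, and `model`/`tokenizer` are assumed loaded as in the quickstart above.

import torch
from transformers import AutoModelForCausalLM

def sequence_logprob(lm, seq: str) -> float:
    # Total log-likelihood of a tokenized sequence under a causal LM.
    ids = tokenizer(seq, return_tensors="pt")["input_ids"].to(lm.device)
    with torch.no_grad():
        logits = lm(ids).logits
    logps = torch.log_softmax(logits[:, :-1], dim=-1)
    return logps.gather(-1, ids[:, 1:, None]).sum().item()

# reference = the unaligned base checkpoint
ref_model = AutoModelForCausalLM.from_pretrained(
    "protgpt3/ProtGPT3-112M",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
).eval()

beta = 0.1  # illustrative; the paper's value is not stated in this card
seq = "1MKTAYIAKQR"  # hypothetical sequence with an N-to-C directional token
reward = beta * (sequence_logprob(model, seq) - sequence_logprob(ref_model, seq))
print(reward)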

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator.

  • Hardware Type: NVIDIA H100 GPUs
  • Hours used: More Information Needed
  • Cloud Provider: More Information Needed
  • Compute Region: More Information Needed
  • Carbon Emitted: More Information Needed

Technical Specifications

Model Architecture and Objective

ProtGPT3-112M-dpo is a decoder-only autoregressive protein language model using a Mixtral-style sparse Mixture-of-Experts architecture. The base model was trained with a causal language modeling objective on protein sequences.

The DPO-aligned checkpoint was optimized to prefer generated sequences with higher predicted structural confidence and lower low-complexity-region content.
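
For intuition, the sketch below implements generic Mixtral-style top-2 routing. It illustrates the architecture family only; it is not the ProtGPT3 implementation, and all dimensions are placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    # Generic Mixtral-style sparse MoE layer: a linear router picks the top-2
    # experts per token and mixes their outputs with softmaxed router weights.
    def __init__(self, hidden: int = 512, n_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(hidden, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, 4 * hidden), nn.SiLU(), nn.Linear(4 * hidden, hidden))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, hidden)
        weights, idx = torch.topk(self.router(x), k=2, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(2):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out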

Compute Infrastructure

Hardware

The base ProtGPT3-112M model was trained on NVIDIA H100 GPUs.

Software

Training used FlashAttention-2, online mini-batch packing, Liger Kernel, and DeepSpeed.

Citation

BibTeX:

@article{protgpt3,
  title={ProtGPT3: an Open-source family of Promptable and Aligned Protein Language Models},
  author={Anonymous Authors},
  year={2026}
}

APA:

Anonymous Authors. (2026). ProtGPT3: an Open-source family of Promptable and Aligned Protein Language Models.

Glossary

  • DPO: Direct Preference Optimization, an alignment method that optimizes a model directly on preference pairs (the standard objective is shown after this glossary).
  • pLDDT: Predicted local distance difference test, a per-residue structural-confidence score (reported here on a 0-1 scale).
  • Low-complexity region: A repetitive or compositionally simple sequence region.
  • Causal language modeling: Autoregressive prediction of the next token given previous tokens.
  • Mixture-of-Experts: A sparse neural architecture using multiple expert subnetworks.
  • N-to-C / C-to-N: Protein sequence generation directions from N-terminus to C-terminus or C-terminus to N-terminus.
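
For reference, the standard DPO objective (Rafailov et al., 2023) over preference pairs of chosen sequences $y_w$ and rejected sequences $y_l$ given prompt $x$, with frozen reference model $\pi_{\mathrm{ref}}$ and strength parameter $\beta$:

$$\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]$$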

More Information

All models and code are released through the Hugging Face ecosystem and accompanying code repository.

Model Card Authors

Anonymous authors

Model Card Contact

More Information Needed
