Uniform-350k is an antibody language model built on the ESM-2 architecture. It was pre-trained on paired antibody sequences from Jaffe et al. and Hurtado et al. The datasets used for pre-training are available on Zenodo, and the code is available on GitHub. More details can be found in our paper published in Patterns.
Load the model and tokenizer as follows:
```python
from transformers import EsmTokenizer, EsmForMaskedLM

model = EsmForMaskedLM.from_pretrained("brineylab/uniform-350k")
tokenizer = EsmTokenizer.from_pretrained("brineylab/uniform-350k")
```
The tokenizer expects paired sequences formatted as `HEAVY_CHAIN<cls><cls>LIGHT_CHAIN`.
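For example, a paired sequence can be tokenized and passed through the masked-LM model as follows (the heavy- and light-chain sequences below are illustrative placeholders, not sequences from the training data):

```python
import torch

# illustrative placeholder sequences (not from the training data)
heavy_chain = "EVQLVESGGGLVQPGGSLRLSCAAS"
light_chain = "DIQMTQSPSSLSASVGDRVTITC"

# paired format expected by the tokenizer: HEAVY_CHAIN<cls><cls>LIGHT_CHAIN
paired = f"{heavy_chain}<cls><cls>{light_chain}"

inputs = tokenizer(paired, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# per-position token logits: (batch_size, sequence_length, vocab_size)
print(outputs.logits.shape)
```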
The model can be fine-tuned for classification tasks (such as the specificity and pair classification tasks described in the paper) by loading it with a sequence classification head:
```python
from transformers import EsmForSequenceClassification

model = EsmForSequenceClassification.from_pretrained("brineylab/uniform-350k")

# freeze the base model weights prior to finetuning
for param in model.base_model.parameters():
    param.requires_grad = False
```
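The number of output labels can be set when the classification head is loaded. The sketch below assumes a binary task (e.g. pair classification); `num_labels=2` and the input sequence are assumptions for illustration, not values taken from the paper:

```python
import torch
from transformers import EsmForSequenceClassification, EsmTokenizer

tokenizer = EsmTokenizer.from_pretrained("brineylab/uniform-350k")

# num_labels=2 is an assumed binary task setup (e.g. paired vs. not paired)
model = EsmForSequenceClassification.from_pretrained(
    "brineylab/uniform-350k", num_labels=2
)

# forward pass with an illustrative paired sequence
inputs = tokenizer("EVQLVESGGG<cls><cls>DIQMTQSPSS", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch_size, num_labels)
```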