BALM Paper collection: models from the publication "Improving antibody language models with native pairing", Patterns (2024).
How to use brineylab/BALM-paired with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="brineylab/BALM-paired")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("brineylab/BALM-paired")
model = AutoModelForMaskedLM.from_pretrained("brineylab/BALM-paired")

BALM-paired is an antibody language model that uses a RoBERTa architecture and was pre-trained on the ~1.6M paired antibody sequences from Jaffe et al. The datasets used for pre-training are available on Zenodo, and the code is available on GitHub. More details can be found in our paper published in Patterns.
Load the model and tokenizer as follows:
from transformers import RobertaTokenizer, RobertaForMaskedLM
model = RobertaForMaskedLM.from_pretrained("brineylab/BALM-paired")
tokenizer = RobertaTokenizer.from_pretrained("brineylab/BALM-paired")
The tokenizer expects sequences formatted as: HEAVY_CHAIN</s>LIGHT_CHAIN.
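As a minimal sketch of that input format, the helper below joins a heavy and a light chain with the `</s>` separator. The names `format_paired` and `masked_lm_logits` are hypothetical and not part of the model's code. The upper-casing step and the short sequence fragments are assumptions made for illustration only. The second function downloads the model when called, so it is defined here but not run.

```python
def format_paired(heavy: str, light: str, sep: str = "</s>") -> str:
    """Return the HEAVY_CHAIN</s>LIGHT_CHAIN string the tokenizer expects."""
    # Assumption: sequences are plain amino-acid strings; normalise case and whitespace.
    heavy = heavy.strip().upper()
    light = light.strip().upper()
    if not heavy or not light:
        raise ValueError("both heavy and light chain sequences are required")
    return f"{heavy}{sep}{light}"


def masked_lm_logits(paired: str, model_id: str = "brineylab/BALM-paired"):
    """Run one forward pass and return per-token logits (downloads the model)."""
    # Imported lazily so format_paired works without torch/transformers installed.
    import torch
    from transformers import RobertaForMaskedLM, RobertaTokenizer

    tokenizer = RobertaTokenizer.from_pretrained(model_id)
    model = RobertaForMaskedLM.from_pretrained(model_id)
    model.eval()
    inputs = tokenizer(paired, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).logits


# Hypothetical, truncated fragments used only to show the format.
example = format_paired("EVQLVESGGG", "DIQMTQSPSS")
print(example)  # EVQLVESGGG</s>DIQMTQSPSS
```

Passing `example` to `masked_lm_logits` should return a tensor with one row of vocabulary scores per input token, assuming `torch` and `transformers` are installed and the model can be downloaded.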