BALM Paper Collection
Models from the publication: "Improving antibody language models with native pairing", Patterns (2024)
BALM-paired is an antibody language model based on the RoBERTa architecture, pre-trained on ~1.6M natively paired antibody sequences from Jaffe et al. The datasets used for pre-training are available on Zenodo, and the code is available on GitHub. More details can be found in our paper published in Patterns.
Load the model and tokenizer as follows:
```python
from transformers import RobertaTokenizer, RobertaForMaskedLM

model = RobertaForMaskedLM.from_pretrained("brineylab/BALM-paired")
tokenizer = RobertaTokenizer.from_pretrained("brineylab/BALM-paired")
```

The tokenizer expects sequences formatted as: `HEAVY_CHAIN</s>LIGHT_CHAIN`.
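As a minimal sketch of putting this format to use, the helpers below build the expected `HEAVY</s>LIGHT` input string and mean-pool the model's final hidden states into a fixed-length embedding. The function names (`format_paired`, `mean_embedding`) and the mean-pooling choice are illustrative assumptions, not part of the published API; only the separator format and the model/tokenizer classes come from the card above.

```python
from transformers import RobertaTokenizer, RobertaForMaskedLM


def format_paired(heavy: str, light: str) -> str:
    """Join heavy- and light-chain sequences with the separator token
    the BALM-paired tokenizer expects: HEAVY_CHAIN</s>LIGHT_CHAIN."""
    return f"{heavy}</s>{light}"


def mean_embedding(sequence: str, model, tokenizer):
    """Hypothetical helper: mean-pool the final hidden layer over all
    token positions to get one vector per paired sequence."""
    import torch  # imported lazily so format_paired works without torch

    inputs = tokenizer(sequence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # hidden_states[-1] has shape (batch, seq_len, hidden_dim)
    return outputs.hidden_states[-1].mean(dim=1).squeeze(0)


# Usage (downloads the model weights on first call):
# model = RobertaForMaskedLM.from_pretrained("brineylab/BALM-paired")
# tokenizer = RobertaTokenizer.from_pretrained("brineylab/BALM-paired")
# emb = mean_embedding(format_paired(heavy_seq, light_seq), model, tokenizer)
```

Mean-pooling is just one common way to summarize per-residue representations; per-token hidden states can be used directly for residue-level tasks.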