BALM Paper Collection
Models from the publication "Improving antibody language models with native pairing", Patterns (2024).
BALM-unpaired is an antibody language model based on the RoBERTa architecture, pre-trained on unpaired antibody sequences from Jaffe et al. The datasets used for pre-training are available on Zenodo, and the code is available on GitHub. More details can be found in our paper published in Patterns.
Load the model and tokenizer as follows:
```python
from transformers import RobertaTokenizer, RobertaForMaskedLM

model = RobertaForMaskedLM.from_pretrained("brineylab/BALM-unpaired")
tokenizer = RobertaTokenizer.from_pretrained("brineylab/BALM-unpaired")
```
The tokenizer expects unpaired sequences: each input should be a single heavy chain or a single light chain, not a paired heavy/light sequence.
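As a masked language model, BALM-unpaired can score or predict individual residues in a sequence. The sketch below shows one way to mask a position and recover the model's top prediction for it. It is a minimal sketch, not the authors' code: it assumes the tokenizer accepts a raw amino-acid string and uses a standard RoBERTa-style `<mask>` token (check the tokenizer config to confirm); the helper names are ours.

```python
# Sketch: predicting a masked residue with BALM-unpaired.
# Assumptions (not confirmed by the model card): the tokenizer takes a raw
# amino-acid string and exposes standard mask_token / mask_token_id attributes.
import torch
from transformers import RobertaTokenizer, RobertaForMaskedLM


def mask_residue(sequence: str, position: int, mask_token: str = "<mask>") -> str:
    """Replace the amino acid at `position` (0-based) with the mask token."""
    return sequence[:position] + mask_token + sequence[position + 1:]


def predict_masked_residue(sequence: str, position: int) -> str:
    """Return the model's most likely residue at `position` of an unpaired chain."""
    model = RobertaForMaskedLM.from_pretrained("brineylab/BALM-unpaired")
    tokenizer = RobertaTokenizer.from_pretrained("brineylab/BALM-unpaired")

    masked = mask_residue(sequence, position, tokenizer.mask_token)
    inputs = tokenizer(masked, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Locate the mask token in the tokenized input and take the argmax over
    # the vocabulary at that position.
    mask_idx = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    predicted_id = logits[0, mask_idx].argmax(dim=-1)
    return tokenizer.decode(predicted_id).strip()
```

For example, `predict_masked_residue(heavy_chain_sequence, 10)` would return the model's best guess for the residue at position 10 of a (hypothetical) heavy-chain string.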