---
library_name: transformers
license: mit
---

## Preferential-250k

Preferential-250k is an antibody language model that uses an [ESM-2](https://www.science.org/doi/10.1126/science.ade2574) architecture.

It was pre-trained on paired sequences from [Jaffe et al.](https://www.nature.com/articles/s41586-022-05371-z) and [Hurtado et al.](https://doi.org/10.1016/j.celrep.2024.114307).

The datasets used for pre-training are available on [Zenodo](https://doi.org/10.5281/zenodo.14019655), and code is available on [GitHub](https://github.com/brineylab/preferential-masking-paper).

More details can be found in [our paper](https://doi.org/10.1016/j.patter.2025.101239), published in Patterns.

### Use

Load the model and tokenizer as follows:

```python
from transformers import EsmTokenizer, EsmForMaskedLM

model = EsmForMaskedLM.from_pretrained("brineylab/preferential-250k")
tokenizer = EsmTokenizer.from_pretrained("brineylab/preferential-250k")
```

The tokenizer expects paired sequences formatted as: `HEAVY_CHAIN<cls><cls>LIGHT_CLAIN`.
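
As a minimal sketch of encoding a paired sequence with the model and tokenizer loaded above (the heavy- and light-chain sequences below are illustrative placeholders, not sequences from the training data):

```python
import torch

# Placeholder heavy- and light-chain sequences (illustrative only)
heavy_chain = "EVQLVESGGGLVQPGGSLRLSCAAS"
light_chain = "DIQMTQSPSSLSASVGDRVTITC"

# Format the pair as HEAVY_CHAIN<cls><cls>LIGHT_CHAIN and tokenize
inputs = tokenizer(f"{heavy_chain}<cls><cls>{light_chain}", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Per-token representations from the final hidden layer
embeddings = outputs.hidden_states[-1]
```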

The model can be finetuned for classification tasks (such as specificity and pair classification in the paper) by loading the model with a sequence classification head:

```python
from transformers import EsmForSequenceClassification

model = EsmForSequenceClassification.from_pretrained("brineylab/preferential-250k")

# freeze the base model weights prior to finetuning
for param in model.base_model.parameters():
    param.requires_grad = False
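
As a hypothetical sketch of how such a finetuning run might be wired up with the Hugging Face `Trainer` (the sequences, labels, output directory, and hyperparameters below are illustrative placeholders, not the settings used in the paper):

```python
import torch
from torch.utils.data import Dataset
from transformers import (
    EsmForSequenceClassification,
    EsmTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = EsmTokenizer.from_pretrained("brineylab/preferential-250k")
model = EsmForSequenceClassification.from_pretrained("brineylab/preferential-250k", num_labels=2)

# freeze the base model weights, as above
for param in model.base_model.parameters():
    param.requires_grad = False

# Placeholder paired sequences and binary labels (illustrative only)
pairs = [
    ("EVQLVESGGGLVQPGGSLRLSCAAS", "DIQMTQSPSSLSASVGDRVTITC"),
    ("QVQLQESGPGLVKPSETLSLTCTVS", "EIVLTQSPGTLSLSPGERATLSC"),
]
labels = [0, 1]


class PairedSequenceDataset(Dataset):
    def __init__(self, pairs, labels):
        # format each pair as HEAVY_CHAIN<cls><cls>LIGHT_CHAIN before tokenizing
        self.encodings = tokenizer(
            [f"{heavy}<cls><cls>{light}" for heavy, light in pairs],
            padding=True,
            return_tensors="pt",
        )
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: val[idx] for key, val in self.encodings.items()}
        item["labels"] = self.labels[idx]
        return item


trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="./preferential-250k-classifier",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=PairedSequenceDataset(pairs, labels),
)
trainer.train()
```

In practice, the labels would come from annotations such as the specificity or pairing labels described in the paper, and the training hyperparameters would need to be tuned for the task.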