---
license: mit
---

<!-- ##### 🔴 <font color=red>Note: SaProt requires structural (SA token) input for optimal performance. AA-sequence-only mode works, but it must be fine-tuned; frozen embeddings are useful only for SA sequences, not for AA sequences. With structural input, SaProt surpasses ESM2 in most tasks.</font> -->

We provide two ways to use SaProt: through the Hugging Face model classes, or in the same way as in the [esm github](https://github.com/facebookresearch/esm). Users can choose either one.
### Huggingface model
The following code shows how to load the model.
```
from transformers import EsmTokenizer, EsmForMaskedLM

model_path = "/your/path/to/SaProt_650M_PDB"
tokenizer = EsmTokenizer.from_pretrained(model_path)
model = EsmForMaskedLM.from_pretrained(model_path)

#################### Example ####################
device = "cuda"
model.to(device)

# SA sequence: each residue is an amino acid (uppercase) paired with its Foldseek structure token (lowercase)
seq = "MdEvVpQpLrVyQdYaKv"
tokens = tokenizer.tokenize(seq)
print(tokens)

inputs = tokenizer(seq, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}

outputs = model(**inputs)
print(outputs.logits.shape)

"""
['Md', 'Ev', 'Vp', 'Qp', 'Lr', 'Vy', 'Qd', 'Ya', 'Kv']
torch.Size([1, 11, 446])
"""
```
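If you only need frozen embeddings (which, as noted above, are meaningful for SA sequences rather than plain AA sequences), you can reuse the tokenizer and model loaded above. The sketch below shows one illustrative way to do this; the use of `output_hidden_states=True` and the final mean pooling are illustrative choices, not an official SaProt recipe.
```
import torch

# Minimal sketch for frozen embeddings, reusing `tokenizer`, `model`, `device`, and `seq` from above.
with torch.no_grad():
    inputs = tokenizer(seq, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}
    outputs = model(**inputs, output_hidden_states=True)

# Last hidden layer has shape (batch, seq_len, hidden_dim); seq_len includes <cls> and <eos>.
hidden = outputs.hidden_states[-1]                   # e.g. torch.Size([1, 11, 1280]) for the 650M model
residue_embeddings = hidden[0, 1:-1]                 # drop special tokens -> one vector per SA token
protein_embedding = residue_embeddings.mean(dim=0)   # simple mean pooling over residues
print(residue_embeddings.shape, protein_embedding.shape)
```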
### esm model
The esm version is also stored in the same folder, named `SaProt_650M_PDB.pt`. We provide a function to load the model.
```
from utils.esm_loader import load_esm_saprot

model_path = "/your/path/to/SaProt_650M_PDB.pt"
model, alphabet = load_esm_saprot(model_path)
```
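Assuming the returned `model` and `alphabet` follow the standard esm interface (as in the [esm github](https://github.com/facebookresearch/esm) examples), representations can be extracted as sketched below; the batch converter usage and layer index 33 (the final layer of the 650M architecture) follow the generic esm examples rather than SaProt-specific documentation.
```
import torch

# Minimal sketch, assuming the standard esm API; inputs are SA sequences, as in the Hugging Face example.
batch_converter = alphabet.get_batch_converter()
data = [("protein1", "MdEvVpQpLrVyQdYaKv")]
labels, strs, tokens = batch_converter(data)

with torch.no_grad():
    results = model(tokens, repr_layers=[33], return_contacts=False)

# Per-token representations from the last layer: (batch, seq_len incl. special tokens, hidden_dim)
token_representations = results["representations"][33]
print(token_representations.shape)
```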