VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning
Paper: arXiv:2503.15438
How to use AI4Protein/deep_base with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("feature-extraction", model="AI4Protein/deep_base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("AI4Protein/deep_base")
model = AutoModelForMaskedLM.from_pretrained("AI4Protein/deep_base")

This model is part of VenusFactory, a unified platform for protein engineering data retrieval and language model fine-tuning that integrates many protein-related datasets and popular protein language models (PLMs). The platform is described in the paper "VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning".
This model is trained with a masked language modeling objective and can be used to extract features from protein sequences.
Code and further details are available at https://github.com/tyang816/VenusFactory.
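The feature-extraction pipeline returns one vector per token, so a common next step is to pool those vectors into a single fixed-size embedding per sequence. The sketch below shows plain mean pooling over a nested list, which is the default output shape of the Transformers feature-extraction pipeline. The helper name `mean_pool`, the toy values, and the example sequence are illustrative assumptions; mean pooling is a widely used convention, not necessarily what the VenusFactory authors use, and this simple version also averages over any special tokens the tokenizer adds.

```python
from typing import List, Sequence


def mean_pool(token_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token vectors into one fixed-size vector (hypothetical helper)."""
    if not token_vectors:
        raise ValueError("expected at least one token vector")
    dim = len(token_vectors[0])
    if any(len(vec) != dim for vec in token_vectors):
        raise ValueError("all token vectors must have the same length")
    n = len(token_vectors)
    # Average each hidden dimension across all tokens.
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


# Toy stand-in for pipeline output: 3 tokens, hidden size 2 (made-up values).
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = mean_pool(toy)

# With the real model (downloads weights, so it is left commented out):
# from transformers import pipeline
# pipe = pipeline("feature-extraction", model="AI4Protein/deep_base")
# features = pipe("MKTAYIAKQR")  # assumed example sequence; nested list [1][tokens][hidden]
# embedding = mean_pool(features[0])
```

Whether the tokenizer expects raw residue strings or space-separated residues is not stated here, so check the tokenizer's output on a short sequence before pooling.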