---
license: apache-2.0
pipeline_tag: feature-extraction
library_name: transformers
---

DPLM

DPLM (diffusion protein language model) is a versatile protein language model with strong generative and predictive capabilities for protein sequences. Specifically, DPLM performs well on protein sequence generation, motif scaffolding, inverse folding, and representation learning. For more details, please refer to our paper, Diffusion Language Models Are Versatile Protein Learners.

Project Page: https://bytedance.github.io/dplm/dplm-2.1

This repository contains the DPLM model checkpoint with 150M parameters. Please refer to our GitHub repository for code and usage. For example, you can load a DPLM model as follows:

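```python
# Requires the byprot package from the DPLM GitHub repository.
from byprot.models.lm.dplm import DiffusionProteinLanguageModel

# Hub ID of the 150M-parameter checkpoint hosted in this repository.
model_name = "airkingbd/dplm_150m"
dplm = DiffusionProteinLanguageModel.from_pretrained(model_name)
```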
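Because the loading snippet depends on the `byprot` package from the GitHub repository, it can help to fail early with a clear message when that package is missing. The following is a minimal sketch, not part of the official codebase: the helper names `byprot_available` and `load_dplm` are hypothetical, and only the import path and `from_pretrained` call are taken from the example above.

```python
import importlib.util


def byprot_available() -> bool:
    """Return True if a top-level `byprot` package can be found."""
    return importlib.util.find_spec("byprot") is not None


def load_dplm(model_name: str = "airkingbd/dplm_150m"):
    """Hypothetical wrapper around the loading example above.

    Raises ImportError with a hint if `byprot` is not installed,
    instead of failing on the import line itself.
    """
    if not byprot_available():
        raise ImportError(
            "The 'byprot' package was not found. Install it from the "
            "DPLM GitHub repository before loading a checkpoint."
        )
    # Import lazily so this module can be imported without byprot.
    from byprot.models.lm.dplm import DiffusionProteinLanguageModel

    return DiffusionProteinLanguageModel.from_pretrained(model_name)
```

Calling `load_dplm()` with no arguments would then load the checkpoint in this repository, provided `byprot` is installed.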

All DPLM checkpoints are available in the table below:

| Model size | Num layers | Num parameters |
|------------|------------|----------------|
| dplm_3b    | 36         | 3B             |
| dplm_650m  | 33         | 650M           |
| dplm_150m  | 30         | 150M           |
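If you need to choose a checkpoint programmatically, the table can be encoded as a small lookup. This is an illustrative sketch only: the parameter counts are the nominal values from the table, the helper names are hypothetical, and the assumption that every checkpoint lives under the same `airkingbd` namespace as `dplm_150m` should be verified on the Hub.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    name: str
    num_layers: int
    nominal_params: int  # nominal size from the table, not an exact count


# Values copied from the checkpoint table above.
CHECKPOINTS = [
    Checkpoint("dplm_150m", 30, 150_000_000),
    Checkpoint("dplm_650m", 33, 650_000_000),
    Checkpoint("dplm_3b", 36, 3_000_000_000),
]

# Namespace taken from the loading example; assumed (not confirmed here)
# to apply to the other checkpoints as well.
HUB_NAMESPACE = "airkingbd"


def pick_checkpoint(max_params: int) -> Checkpoint:
    """Return the largest checkpoint whose nominal size fits the budget."""
    candidates = [c for c in CHECKPOINTS if c.nominal_params <= max_params]
    if not candidates:
        raise ValueError(f"No DPLM checkpoint fits within {max_params} parameters")
    return max(candidates, key=lambda c: c.nominal_params)


def repo_id(checkpoint: Checkpoint) -> str:
    """Build the assumed Hub repository ID for a checkpoint."""
    return f"{HUB_NAMESPACE}/{checkpoint.name}"
```

For example, `repo_id(pick_checkpoint(1_000_000_000))` selects the 650M model under a one-billion-parameter budget, and the resulting ID can be passed to `from_pretrained`.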

For details regarding the design space of multimodal protein language models (MPLMs), please refer to our spotlight paper at ICML'25: Elucidating the Design Space of Multimodal Protein Language Models

News: check out our new work, DPLM-2: A Multimodal Diffusion Protein Language Model, a multimodal protein foundation model that extends DPLM to simultaneously model, understand, and generate both sequences and structures!