---
license: apache-2.0
pipeline_tag: feature-extraction
library_name: transformers
---

# DPLM

DPLM (Diffusion Protein Language Model) is a versatile protein language model with strong generative and predictive capabilities for protein sequences. In particular, DPLM performs well in protein sequence generation, motif scaffolding, inverse folding, and representation learning. For more details, please refer to our paper [Diffusion Language Models Are Versatile Protein Learners](https://arxiv.org/abs/2402.18567).

Project page: https://bytedance.github.io/dplm/dplm-2.1

This repository contains the DPLM checkpoint with 150M parameters. Please refer to our [GitHub repository](https://github.com/bytedance/dplm/tree/main) for code and usage. For example, you can load the DPLM model as follows:

```python
from byprot.models.lm.dplm import DiffusionProteinLanguageModel

model_name = "airkingbd/dplm_150m"
dplm = DiffusionProteinLanguageModel.from_pretrained(model_name)
```

All DPLM checkpoints are listed in the table below:

| Model | Num layers | Num parameters |
|-------|------------|----------------|
| [dplm_3b](https://huggingface.co/airkingbd/dplm_3b) | 36 | 3B |
| [dplm_650m](https://huggingface.co/airkingbd/dplm_650m) | 33 | 650M |
| [dplm_150m](https://huggingface.co/airkingbd/dplm_150m) | 30 | 150M |

For details regarding the design space of multimodal protein language models (MPLMs), please refer to our spotlight paper at ICML'25: [Elucidating the Design Space of Multimodal Protein Language Models](https://huggingface.co/papers/2504.11454).

**News**: Welcome to check out our new work [DPLM-2: A Multimodal Diffusion Protein Language Model](https://huggingface.co/papers/2410.13782), a multimodal protein foundation model that extends DPLM to simultaneously model, understand, and generate both sequences and structures!
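As background on how a diffusion protein language model generates sequences, here is a minimal toy sketch of mask-based iterative denoising. This is **not** the DPLM API: the `toy_denoiser` function, the `#` mask token, and the reveal schedule are all made up for illustration; a real sampler uses the trained network's per-position predictions and confidences.

```python
# Toy sketch of the iterative mask-denoising loop behind discrete
# diffusion language models. Hypothetical, for illustration only.
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "#"

def toy_denoiser(seq):
    """Hypothetical stand-in for the network: propose a residue for
    every masked position (here, uniformly at random)."""
    return [random.choice(AMINO_ACIDS) if c == MASK else c for c in seq]

def iterative_unmask(length=16, steps=4, seed=0):
    """Start from an all-mask sequence and commit a batch of
    positions per step, mimicking a discrete diffusion sampler."""
    random.seed(seed)
    seq = [MASK] * length
    masked = list(range(length))
    random.shuffle(masked)
    per_step = max(1, length // steps)
    while masked:
        proposal = toy_denoiser(seq)
        # Commit a batch of positions; a real sampler would keep the
        # most confident predictions rather than a random subset.
        for pos in masked[:per_step]:
            seq[pos] = proposal[pos]
        masked = masked[per_step:]
    return "".join(seq)

print(iterative_unmask())
```

With the actual model, the analogous generation entry points are provided in the GitHub repository linked above.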