nielsr HF Staff committed on
Commit
15652a2
·
verified ·
1 Parent(s): 2a8a8f2

Improve model card metadata and add paper link

Hi! I'm Niels from the Hugging Face community team.

I've opened this PR to improve the discoverability of your model by adding `library_name` and `pipeline_tag` metadata. This enables features like the "Use in Transformers" button and ensures the model appears in relevant searches on the Hub.

I've also added a link to your paper [D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation](https://huggingface.co/papers/2603.01780) at the top of the model card.
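
As a quick local check of the metadata change, here is a minimal sketch that reads the YAML front matter of a model card and reports which of the two keys are missing. The helper names `front_matter_keys` and `missing_metadata` and the two sample strings are hypothetical and only for illustration; they are not part of the Hub or Transformers APIs. The parser handles only flat `key: value` lines, so a full YAML parser would be more robust for real cards.

```python
import re

# Keys this PR adds to the model card front matter.
REQUIRED_KEYS = ("library_name", "pipeline_tag")


def front_matter_keys(readme_text: str) -> dict:
    """Return top-level `key: value` pairs from the YAML front matter.

    Deliberately minimal: only flat scalar keys are read; list items
    and nested mappings are skipped.
    """
    match = re.match(r"^---[ \t]*\n(.*?)\n---[ \t]*(?:\n|$)", readme_text, flags=re.DOTALL)
    if match is None:
        return {}
    pairs = {}
    for line in match.group(1).splitlines():
        # Skip blank lines, list items, comments, and indented lines.
        if not line or line[0] in " \t-#":
            continue
        key, sep, value = line.partition(":")
        if sep:
            pairs[key.strip()] = value.strip()
    return pairs


def missing_metadata(readme_text: str, required=REQUIRED_KEYS) -> list:
    """List the required keys that are absent or empty in the front matter."""
    found = front_matter_keys(readme_text)
    return [key for key in required if not found.get(key)]


# Hypothetical, shortened stand-ins for the README before and after this PR.
OLD_CARD = "---\nlicense: apache-2.0\ntags:\n- biology\n---\n# D3LM\n"
NEW_CARD = (
    "---\nlicense: apache-2.0\nlibrary_name: transformers\n"
    "pipeline_tag: text-generation\ntags:\n- biology\n---\n# D3LM\n"
)

if __name__ == "__main__":
    print("before:", missing_metadata(OLD_CARD))
    print("after:", missing_metadata(NEW_CARD))
```

Running the script on the two sample cards should show both keys missing before the change and none missing after it.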

Files changed (1)
  1. README.md +11 -6
README.md CHANGED
@@ -1,14 +1,18 @@
1
  ---
2
  license: apache-2.0
 
 
3
  tags:
4
- - biology
5
- - genomics
6
- - dna
7
- - diffusion
8
  ---
9
 
10
  # D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation
11
 
 
 
12
  A masked diffusion language model for **unconditional mammalian DNA sequence generation**, built on the ESM encoder with Rotary Positional Embeddings.
13
 
14
  > **Initialization**: trained **from scratch** (random initialization) with masked diffusion objective on mammalian DNA.
@@ -54,7 +58,8 @@ with torch.no_grad():
54
  outputs = model.diffusion_generate(inputs=input_ids, generation_config=config)
55
 
56
  for i, seq in enumerate(outputs.sequences):
57
- print(f">{i}\n{tokenizer.decode(seq, skip_special_tokens=True).replace(' ', '')}")
 
58
  ```
59
 
60
  ## Generation Parameters
@@ -83,4 +88,4 @@ for i, seq in enumerate(outputs.sequences):
83
 
84
  ## License
85
 
86
- Apache 2.0
 
1
  ---
2
  license: apache-2.0
3
+ library_name: transformers
4
+ pipeline_tag: text-generation
5
  tags:
6
+ - biology
7
+ - genomics
8
+ - dna
9
+ - diffusion
10
  ---
11
 
12
  # D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation
13
 
14
+ This repository contains the weights for **D3LM-scratch**, presented in the paper [D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation](https://huggingface.co/papers/2603.01780).
15
+
16
  A masked diffusion language model for **unconditional mammalian DNA sequence generation**, built on the ESM encoder with Rotary Positional Embeddings.
17
 
18
  > **Initialization**: trained **from scratch** (random initialization) with masked diffusion objective on mammalian DNA.
 
58
  outputs = model.diffusion_generate(inputs=input_ids, generation_config=config)
59
 
60
  for i, seq in enumerate(outputs.sequences):
61
+ print(f">{i}")
62
+ print(tokenizer.decode(seq, skip_special_tokens=True).replace(' ', ''))
63
  ```
64
 
65
  ## Generation Parameters
 
88
 
89
  ## License
90
 
91
+ Apache 2.0