divelab/OPDLM-8B

shubhamprshr committed 6 days ago

Commit fab2ee0 · verified · 1 Parent(s): bf0939c

Update README.md

Files changed (1)

README.md +5 -7

@@ -22,26 +22,24 @@ autoregressive language model (ARLM) into a diffusion language model via
 ## Highlights
 - **Converted, not pretrained from scratch:** built from a strong ARLM, reusing its prior.
-- **Training-efficient:** ~75M tokens of conversion vs. ~50B tokens for from-scratch DLM training (same base ARLM).
+- **Training-efficient:** ~0.066B tokens of conversion vs. ~50B tokens for from-scratch DLM training (same base ARLM).
 - **Inference-efficient:** parallel token decoding via block diffusion.
 ## Model Details
-- **Developed by:** [FILL: DIVE Lab, Texas A&M University]
+- **Developed by:** DIVE Lab, Texas A&M University
 - **Base model:** [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)
 - **Model type:** Block diffusion language model (decoder-based)
-- **Block size:** [FILL: e.g. 4]
+- **Block size:** 4
 - **Parameters:** ~8B
 - **Language:** English
 - **License:** MIT
 ## Training
 - **Method:** On-policy distillation from a frozen ARLM teacher into a block DLM student.
-- **Conversion budget:** [CONFIRM: ~75M tokens]
+- **Conversion budget:** ~0.066B tokens
 - **Data:** [opdlm_train_data](https://huggingface.co/datasets/divelab/opdlm_train_data)
 ## Evaluation
-[CONFIRM all numbers — these are from our table for OPDLM-8B (non-thinking);
-fill the thinking variant separately if releasing it]
 | Benchmark   | Score |
 |-------------|-------|
@@ -56,7 +54,7 @@ fill the thinking variant separately if releasing it]
 | HumanEval   | 59.8  |
 | MBPP        | 48.7  |
-Decoding: [FILL: static one-token-per-step / dynamic sampling — state which these are]
+Decoding: static (one token per step)
 ## Citation
 ```bibtex