---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
datasets:
- HuggingFaceFW/fineweb-edu
base_model:
- GSAI-ML/LLaDA-8B-Base
tags:
- XDLM
- LLaDA
---

# [LLaDA-XDLM-8B-Base](https://arxiv.org/pdf/2602.01362)

This repository contains the checkpoint at 600 training steps of ***continual pretraining of LLaDA with XDLM***.

***LLaDA-XDLM with a sampling budget of 32.***
Evaluation of adapting LLaDA-8B to our XDLM formulation (LLaDA-XDLM): (a) LLaDA-XDLM consistently outperforms baselines across diverse benchmarks with 32 sampling steps; (b) the improvements are particularly pronounced in code generation (MBPP), where the model substantially reduces generation failures.

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65aa76b1cb5b4fb08ecb087c/oPbIv32EgvA1BbCqd2r6E.png" width="80%">
</div>


For details and usage, see the [code repository](https://github.com/MzeroMiko/LLaDA-XDLM).

## TODO:
  - [ ] Update the `model_card` to support standard Hugging Face transformers usage.

<!-- ## Updates -->