Mzero17 committed ae481b9 (verified) · 1 parent: 2940523

Update README.md

Files changed (1): README.md (+32 −3)
---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
datasets:
- HuggingFaceFW/fineweb-edu
base_model:
- GSAI-ML/LLaDA-8B-Base
tags:
- XDLM
- LLaDA
---

# LLaDA-XDLM-8B-Base

This repository contains the checkpoint after 600 training steps of ***continual pretraining of LLaDA with XDLM***.

***LLaDA-XDLM with a sampling budget of 32.***
Evaluation of adapting LLaDA-8B to our XDLM formulation (LLaDA-XDLM): (a) LLaDA-XDLM consistently outperforms baselines across diverse benchmarks with 32 sampling steps; (b) improvements are particularly pronounced in code generation (MBPP), where the model substantially reduces generation failures.

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65aa76b1cb5b4fb08ecb087c/oPbIv32EgvA1BbCqd2r6E.png" width="80%">
</div>
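The sampling budget above is the number of iterative denoising passes the diffusion LM makes: generation starts from a fully masked sequence and, over a fixed number of steps, keeps the model's most confident predictions while re-masking the rest. The following toy sketch illustrates that general recipe only; the function name, greedy decoding, and even-reveal schedule are illustrative assumptions, not this repository's actual sampler.

```python
import torch

def diffusion_sample(model, length, num_steps=32, mask_id=0, device="cpu"):
    """Toy confidence-based iterative unmasking with a fixed step budget."""
    # Start from a fully masked sequence of the requested length.
    x = torch.full((1, length), mask_id, dtype=torch.long, device=device)
    for step in range(num_steps):
        logits = model(x)                       # (1, length, vocab_size)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)          # per-position confidence and argmax token
        still_masked = x == mask_id
        if not still_masked.any():
            break
        # Spread the remaining masked positions evenly over the remaining steps.
        n_reveal = max(1, int(still_masked.sum()) // (num_steps - step))
        # Only masked positions compete for being revealed this step.
        conf = conf.masked_fill(~still_masked, -1.0)
        idx = conf.topk(n_reveal, dim=-1).indices
        x.scatter_(1, idx, pred.gather(1, idx))
    return x
```

With a budget of 32, at most 32 forward passes produce the whole sequence, in contrast to one pass per token for autoregressive decoding; the real sampler additionally handles temperature, block-wise generation, and remasking strategies.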

For details and usage, see the [code repository](https://github.com/MzeroMiko/LLaDA-XDLM).

## TODO

- [ ] Update the `model_card` to support standard Hugging Face Transformers usage.

<!-- ## Updates -->