Zigeng committed
Commit c82e547 · verified · 1 Parent(s): e1e2401

Update README.md

Files changed (1): README.md (+5 −5)
README.md CHANGED

@@ -33,7 +33,7 @@ base_model:
 - **Soft Parallel Decoding**: Uses interpolation between mask and token embeddings to propagate confidence priors from previous steps.
 
 <div align="center">
-<img src="assets/tradeoff.png" width="90%" />
+<img src="assets/tradeoff.png" width="100%" />
 <br>
 <em>Superior Parallelism-Accuracy Trade-off, Increased TPF with Maintained Accuracy.</em>
 </div>
@@ -43,13 +43,13 @@ base_model:
 
 | Model | Description | Source Model | Link |
 | --- | --- | --- | --- |
-| 🤖 DMax-Math-16B | Highly parallel dLLM for math and reasoning. | LLaDA-2.0-mini | [Hugging Face](https://huggingface.co/Zigeng/DMax-Math-16B) |
-| 🤖 DMax-Coder-16B | Highly parallel dLLM for code generation. | LLaDA-2.0-mini | [Hugging Face](https://huggingface.co/Zigeng/DMax-Coder-16B) |
+| 🤖 DMax-Math-16B | Highly parallel dLLM for math and reasoning. | LLaDA-2.0-mini | [HF](https://huggingface.co/Zigeng/DMax-Math-16B) |
+| 🤖 DMax-Coder-16B | Highly parallel dLLM for code generation. | LLaDA-2.0-mini | [HF](https://huggingface.co/Zigeng/DMax-Coder-16B) |
 
 | Dataset | Description | Link |
 | --- | --- | --- |
-| 📊 DMax-Math-Training-Data | Trajectories on math problems generated by LLaDA-2.0-mini | [Hugging Face](https://huggingface.co/datasets/Zigeng/DMax-LLaDA-2.0-Mini-Math-Trajectories) |
-| 📊 DMax-Code-Training-Data | Trajectories on code problems generated by LLaDA-2.0-mini | [Hugging Face](https://huggingface.co/datasets/Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories) |
+| 📊 DMax-Math-Training-Data | Trajectories on math problems generated by LLaDA-2.0-mini | [HF](https://huggingface.co/datasets/Zigeng/DMax-LLaDA-2.0-Mini-Math-Trajectories) |
+| 📊 DMax-Code-Training-Data | Trajectories on code problems generated by LLaDA-2.0-mini | [HF](https://huggingface.co/datasets/Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories) |
 
 
 ## 🚀 Quick Start
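The "Soft Parallel Decoding" bullet in the diffed README describes interpolating between the mask embedding and predicted token embeddings so that confidence priors from one decoding step carry into the next. A minimal sketch of what such a confidence-weighted interpolation could look like; all names and shapes here are illustrative assumptions, not the DMax implementation:

```python
import numpy as np

def soft_parallel_embed(mask_emb, token_embs, confidences):
    """Blend the [MASK] embedding with predicted-token embeddings.

    Hypothetical sketch: each position's next-step input embedding is a
    convex combination, weighted by that position's prediction confidence.

    mask_emb:    (d,)          embedding of the [MASK] token
    token_embs:  (seq_len, d)  embeddings of the previous step's predictions
    confidences: (seq_len,)    per-position confidence in [0, 1]
    """
    w = confidences[:, None]                      # (seq_len, 1) broadcast weights
    return w * token_embs + (1.0 - w) * mask_emb  # (seq_len, d)

# Toy example: 3 positions, embedding dim 4.
mask_emb = np.zeros(4)          # [MASK] embedding at the origin
token_embs = np.ones((3, 4))    # predicted-token embeddings
conf = np.array([0.0, 0.5, 1.0])
out = soft_parallel_embed(mask_emb, token_embs, conf)
# Position 0 stays fully masked, position 2 is fully the predicted token,
# and position 1 lands halfway between the two.
```

A fully unconfident position is re-masked, while a confident one effectively commits to its token, which matches the stated goal of propagating confidence priors across steps.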