
Add pipeline tag and link to paper/code

#1, opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +30 -9
README.md CHANGED
@@ -1,4 +1,9 @@
  ---
+ base_model: BlinkDL/rwkv7-g1
+ datasets:
+ - allenai/tulu-3-sft-mixture
+ - Jackrong/GLM-5.1-Reasoning-1M-Cleaned
+ - angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k
  language:
  - en
  - zh
@@ -13,23 +18,39 @@ language:
  - vi
  - ar
  license: apache-2.0
+ pipeline_tag: text-generation
  tags:
  - dllm
  - diffusion
  - rwkv
  - llm
- - text_generation
- datasets:
- - allenai/tulu-3-sft-mixture
- - Jackrong/GLM-5.1-Reasoning-1M-Cleaned
- - angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k
- base_model: BlinkDL/rwkv7-g1
  ---

  # Triplet-Block Diffusion RWKV

- This repository contains the checkpoint of B3D-RWKV, a 7B-parameter RWKV language model trained with the Triplet-Block Diffusion method.
+ This repository contains the checkpoint of **B3D-RWKV**, a 7.2B-parameter RWKV language model presented in the paper [Triplet-Block Diffusion RWKV](https://arxiv.org/abs/2605.25969).
+
+ B3D-RWKV is a diffusion RWKV variant that integrates the model's $O(L)$ inference efficiency with parallel, bidirectional discrete-diffusion through a *triplet-block layout* method. It reaches comparable accuracy on an 8-task suite versus existing models while significantly outperforming baselines in decoding throughput with an average of **1.6×** speedup.
+
+ - **Paper:** [Triplet-Block Diffusion RWKV](https://arxiv.org/abs/2605.25969)
+ - **Code:** [GitHub Repository](https://github.com/leonardodalinky/B3D-RWKV)
+
+ ## Usage
+
+ For usage, please see the B3D-RWKV [infer](https://github.com/leonardodalinky/B3D-RWKV/tree/main/infer) and [serve](https://github.com/leonardodalinky/B3D-RWKV/tree/main/infer/serve) directories in the official repository for instructions on how to run inference and serve the model.
+
+ Note: This checkpoint is a supervised fine-tuned (SFT) version of `rwkv7-g1f-7.2B`.

- For usage, see the B3D-RWKV [infer](https://github.com/leonardodalinky/B3D-RWKV/tree/main/infer) and [serve](https://github.com/leonardodalinky/B3D-RWKV/tree/main/infer/serve) for instructions on how to run inference and serve the model.
+ ## Citation

- Note: This checkpoint is a SFT version of `rwkv7-g1f-7.2B`.
+ ```bibtex
+ @misc{lin2026tripletblockdiffusionrwkv,
+ title={Triplet-Block Diffusion RWKV},
+ author={Ke Lin and Yiyang Luo and Zhaolong Su and Yunya Song and Anyi Rao},
+ year={2026},
+ eprint={2605.25969},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2605.25969},
+ }
+ ```
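
The metadata part of this diff drops the `text_generation` entry from `tags` and adds a dedicated `pipeline_tag: text-generation` field. As a rough illustration of that distinction, the sketch below pulls a top-level `pipeline_tag` key out of a README's front matter. The helper name `read_pipeline_tag` and the abbreviated sample cards are hypothetical, and the line-based scan is a simplification, not how the Hub actually parses model cards.

```python
from typing import Optional


def read_pipeline_tag(readme_text: str) -> Optional[str]:
    """Return the top-level `pipeline_tag` value from YAML front matter, if any.

    Hypothetical helper for illustration: it scans lines instead of using a
    real YAML parser, so it only handles simple `key: value` entries.
    """
    lines = readme_text.splitlines()
    # Front matter must start on the first line with a `---` delimiter.
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter: stop scanning
            break
        # Only a top-level key counts; list items such as `- text_generation` do not.
        if line.startswith("pipeline_tag:"):
            value = line.split(":", 1)[1].strip()
            return value or None
    return None


# Abbreviated (assumed) versions of the card metadata before and after the change.
OLD_CARD = """---
license: apache-2.0
tags:
- dllm
- text_generation
---
# Triplet-Block Diffusion RWKV
"""

NEW_CARD = """---
license: apache-2.0
pipeline_tag: text-generation
tags:
- dllm
---
# Triplet-Block Diffusion RWKV
"""

if __name__ == "__main__":
    print(read_pipeline_tag(OLD_CARD))  # None
    print(read_pipeline_tag(NEW_CARD))  # text-generation
```

Under this sketch, the old card yields no pipeline tag because `text_generation` only appears as a free-form list item, while the new card exposes `text-generation` as an explicit field.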