# Diffusion Text Generation

This directory contains implementations for Diffusion LLMs (DLLMs).

More Info:
- https://github.com/ggml-org/llama.cpp/pull/14644
- https://github.com/ggml-org/llama.cpp/pull/14771

## Parameters

The diffusion CLI supports various parameters to control the generation process:

### Core Diffusion Parameters

- `--diffusion-steps`: Number of diffusion steps (default: 256)
- `--diffusion-algorithm`: Algorithm for token selection
  - `0`: ORIGIN - Tokens are generated in a purely random order (from https://arxiv.org/abs/2107.03006)
  - `1`: ENTROPY_BASED - Entropy-based selection
  - `2`: MARGIN_BASED - Margin-based selection
  - `3`: RANDOM - Random selection
  - `4`: CONFIDENCE_BASED - Confidence-based selection (default; see the sketch after this list)
  - More documentation: https://github.com/DreamLM/Dream
- `--diffusion-visual`: Enable live visualization during generation
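
To build intuition for the default strategy, here is a minimal, hypothetical sketch of confidence-based selection (not the actual implementation; all names are made up): at each step, the masked position whose most probable token has the highest softmax probability is unmasked first.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of confidence-based selection. logits[i] holds the
// vocabulary logits the model predicted for the i-th masked position.
// Returns the (position, token) pair with the highest top-token probability.
static std::pair<size_t, int> pick_most_confident(const std::vector<std::vector<float>> & logits) {
    size_t best_pos   = 0;
    int    best_token = 0;
    float  best_conf  = -1.0f;
    for (size_t i = 0; i < logits.size(); ++i) {
        // find the argmax logit at this position
        size_t arg_max = 0;
        for (size_t v = 1; v < logits[i].size(); ++v) {
            if (logits[i][v] > logits[i][arg_max]) {
                arg_max = v;
            }
        }
        // softmax probability of the argmax token: exp(0) / sum exp(l - max)
        float sum = 0.0f;
        for (float l : logits[i]) {
            sum += std::exp(l - logits[i][arg_max]);
        }
        const float conf = 1.0f / sum;
        if (conf > best_conf) {
            best_conf  = conf;
            best_pos   = i;
            best_token = (int) arg_max;
        }
    }
    return {best_pos, best_token};
}
```

Roughly speaking, entropy-based and margin-based selection follow the same pattern but rank positions by the (negative) entropy of the distribution or by the gap between the top two token probabilities, respectively.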

### Scheduling Parameters

Choose one of the following scheduling methods:

**Timestep-based scheduling:**
- `--diffusion-eps`: Epsilon value for timestep scheduling (e.g., 0.001); a schedule sketch follows below

**Block-based scheduling:**
- `--diffusion-block-length`: Block size for block-based scheduling (e.g., 32)
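
For intuition, one common way to build a timestep schedule (a sketch under assumptions, not necessarily the CLI's exact formula) is to space timesteps linearly from 1 down to `eps`, revealing a proportional share of the still-masked tokens at each step:

```cpp
#include <vector>

// Hypothetical linear schedule: steps + 1 values evenly spaced from 1.0 down
// to eps (e.g. 0.001). At step i, roughly a (1 - t[i+1] / t[i]) fraction of
// the currently masked tokens would be unmasked.
static std::vector<float> make_timesteps(int steps, float eps) {
    std::vector<float> t(steps + 1);
    for (int i = 0; i <= steps; ++i) {
        t[i] = 1.0f - (float) i / (float) steps * (1.0f - eps);
    }
    return t;
}
```

Block-based scheduling instead splits the sequence into fixed-size blocks that are denoised one after another, as in the LLaDA example below.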

### Sampling Parameters

- `--temp`: Temperature for sampling (0.0 = greedy/deterministic, higher = more random); see the sketch after this list
- `--top-k`: Top-k filtering for sampling
- `--top-p`: Top-p (nucleus) filtering for sampling
- `--seed`: Random seed for reproducibility
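
To make the `--temp` behaviour concrete, here is an illustrative sketch (not the CLI's actual code) of greedy versus temperature sampling over one position's logits:

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Illustrative sketch: temp == 0 selects the argmax token (deterministic);
// otherwise logits are divided by the temperature before the softmax, so
// larger values flatten the distribution and increase randomness.
static int sample_token(std::vector<float> logits, float temp, std::mt19937 & rng) {
    if (temp <= 0.0f) {
        return (int) (std::max_element(logits.begin(), logits.end()) - logits.begin());
    }
    for (float & l : logits) {
        l /= temp;
    }
    // softmax weights; std::discrete_distribution normalizes them itself
    const float max_l = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    for (size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_l);
    }
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return dist(rng);
}
```

`--top-k` and `--top-p` would additionally restrict the candidate set before drawing: to the k most probable tokens, or to the smallest set whose cumulative probability exceeds p.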

### Model Parameters

- `-m`: Path to the GGUF model file
- `-p`: Input prompt text
- `-ub`: Maximum sequence length (ubatch size)
- `-c`: Context size
- `-b`: Batch size

### Examples

#### Dream architecture:

```
llama-diffusion-cli -m dream7b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-eps 0.001 --diffusion-algorithm 3 --diffusion-steps 256 --diffusion-visual
```

#### LLaDA architecture:

```
llama-diffusion-cli -m llada-8b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-block-length 32 --diffusion-steps 256 --diffusion-visual
```

#### RND1 architecture:

```
llama-diffusion-cli -m RND1-Base-0910.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-algorithm 1 --diffusion-steps 256 --diffusion-visual --temp 0.5 --diffusion-eps 0.001
```