# Diffusion Text Generation

This directory contains implementations for Diffusion LLMs (DLLMs).

More Info:
- https://github.com/ggml-org/llama.cpp/pull/14644
- https://github.com/ggml-org/llama.cpp/pull/14771

## Parameters

The diffusion CLI supports various parameters to control the generation process:

### Core Diffusion Parameters

- `--diffusion-steps`: Number of diffusion steps (default: 256)
- `--diffusion-algorithm`: Algorithm for token selection
  - `0`: ORIGIN - Tokens are generated in a purely random order (from https://arxiv.org/abs/2107.03006)
  - `1`: ENTROPY_BASED - Entropy-based selection
  - `2`: MARGIN_BASED - Margin-based selection
  - `3`: RANDOM - Random selection
  - `4`: CONFIDENCE_BASED - Confidence-based selection (default; see the sketch after this list)
  - More documentation: https://github.com/DreamLM/Dream
- `--diffusion-visual`: Enable live visualization during generation
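
To build intuition for the default strategy, here is a minimal, hypothetical sketch of confidence-based selection (not the actual implementation; all names are made up): at each step, the masked position whose most probable token has the highest softmax probability is unmasked first.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of confidence-based selection. logits[i] holds the
// vocabulary logits the model predicted for the i-th masked position.
// Returns the (position, token) pair with the highest top-token probability.
static std::pair<size_t, int> pick_most_confident(const std::vector<std::vector<float>> & logits) {
    size_t best_pos   = 0;
    int    best_token = 0;
    float  best_conf  = -1.0f;
    for (size_t i = 0; i < logits.size(); ++i) {
        // find the argmax logit at this position
        size_t arg_max = 0;
        for (size_t v = 1; v < logits[i].size(); ++v) {
            if (logits[i][v] > logits[i][arg_max]) {
                arg_max = v;
            }
        }
        // softmax probability of the argmax token: exp(0) / sum exp(l - max)
        float sum = 0.0f;
        for (float l : logits[i]) {
            sum += std::exp(l - logits[i][arg_max]);
        }
        const float conf = 1.0f / sum;
        if (conf > best_conf) {
            best_conf  = conf;
            best_pos   = i;
            best_token = (int) arg_max;
        }
    }
    return {best_pos, best_token};
}
```

Roughly speaking, entropy-based and margin-based selection follow the same pattern but rank positions by the (negative) entropy of the distribution or by the gap between the top two token probabilities, respectively.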

### Scheduling Parameters

Choose one of the following scheduling methods:

**Timestep-based scheduling:**
- `--diffusion-eps`: Epsilon value for timestep scheduling (e.g., 0.001); a schedule sketch follows below

**Block-based scheduling:**
- `--diffusion-block-length`: Block size for block-based scheduling (e.g., 32)
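
For intuition, one common way to build a timestep schedule (a sketch under assumptions, not necessarily the CLI's exact formula) is to space timesteps linearly from 1 down to `eps`, revealing a proportional share of the still-masked tokens at each step:

```cpp
#include <vector>

// Hypothetical linear schedule: steps + 1 values evenly spaced from 1.0 down
// to eps (e.g. 0.001). At step i, roughly a (1 - t[i+1] / t[i]) fraction of
// the currently masked tokens would be unmasked.
static std::vector<float> make_timesteps(int steps, float eps) {
    std::vector<float> t(steps + 1);
    for (int i = 0; i <= steps; ++i) {
        t[i] = 1.0f - (float) i / (float) steps * (1.0f - eps);
    }
    return t;
}
```

Block-based scheduling instead splits the sequence into fixed-size blocks that are denoised one after another, as in the LLaDA example below.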

### Sampling Parameters

- `--temp`: Temperature for sampling (0.0 = greedy/deterministic, higher = more random); see the sketch after this list
- `--top-k`: Top-k filtering for sampling
- `--top-p`: Top-p (nucleus) filtering for sampling
- `--seed`: Random seed for reproducibility
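
To make the `--temp` behaviour concrete, here is an illustrative sketch (not the CLI's actual code) of greedy versus temperature sampling over one position's logits:

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Illustrative sketch: temp == 0 selects the argmax token (deterministic);
// otherwise logits are divided by the temperature before the softmax, so
// larger values flatten the distribution and increase randomness.
static int sample_token(std::vector<float> logits, float temp, std::mt19937 & rng) {
    if (temp <= 0.0f) {
        return (int) (std::max_element(logits.begin(), logits.end()) - logits.begin());
    }
    for (float & l : logits) {
        l /= temp;
    }
    // softmax weights; std::discrete_distribution normalizes them itself
    const float max_l = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    for (size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_l);
    }
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return dist(rng);
}
```

`--top-k` and `--top-p` would additionally restrict the candidate set before drawing: to the k most probable tokens, or to the smallest set whose cumulative probability exceeds p.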

### Model Parameters

- `-m`: Path to the GGUF model file
- `-p`: Input prompt text
- `-ub`: Maximum sequence length (ubatch size)
- `-c`: Context size
- `-b`: Batch size

### Examples

#### Dream architecture:

```
llama-diffusion-cli -m dream7b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-eps 0.001 --diffusion-algorithm 3 --diffusion-steps 256 --diffusion-visual
```

#### LLaDA architecture:

```
llama-diffusion-cli -m llada-8b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-block-length 32 --diffusion-steps 256 --diffusion-visual
```

#### RND1 architecture:

```
llama-diffusion-cli -m RND1-Base-0910.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-algorithm 1 --diffusion-steps 256 --diffusion-visual --temp 0.5 --diffusion-eps 0.001
```