hf-doc-build/doc-dev / transformers.js /pr_1665 /en /api /generation /streamers.md

HuggingFaceDocBuilder

22 days ago

	# generation/streamers

	Streamers for surfacing generated tokens as they are produced.

	Pass a `TextStreamer` (or `WhisperTextStreamer` for audio transcription) via
	the `streamer` argument of `generate()` to receive decoded text as tokens
	are emitted — useful for chat UIs and incremental transcription.

	## Classes

	### BaseStreamer

	Abstract base class for output streamers.

	#### `BaseStreamer.put(value)`

	Called by `.generate()` to push new tokens to the streamer.

	Parameters

	- `value` (`bigint[][]`)

	#### `BaseStreamer.end()`

	Called by `.generate()` to signal the end of generation.
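
The `put()`/`end()` contract described above can be illustrated with a minimal sketch. `CollectingStreamer` is a hypothetical name, and the loop that feeds it is a stand-in for `.generate()`. In a real project the class would presumably extend `BaseStreamer` from the library; it is written standalone here so the sketch runs without dependencies.

```javascript
// Hypothetical streamer that follows the put()/end() contract.
// Not part of the library: it is standalone so the sketch has no dependencies.
class CollectingStreamer {
  constructor() {
    this.tokens = [];      // every token id received so far
    this.finished = false; // set once end() has been called
  }

  // Receives a batch of token ids (`bigint[][]`): one inner array per sequence.
  put(value) {
    if (value.length > 1) {
      throw new Error("CollectingStreamer only supports batch size 1");
    }
    this.tokens.push(...value[0]);
  }

  // Signals that no further tokens will arrive.
  end() {
    this.finished = true;
  }
}

// Simulated generation loop (an assumption standing in for `.generate()`).
// The token ids are arbitrary illustrative values.
const collector = new CollectingStreamer();
for (const step of [[[101n, 7592n]], [[2088n]], [[102n]]]) {
  collector.put(step);
}
collector.end();

console.log(collector.tokens.length); // number of token ids collected
console.log(collector.finished);
```

Rejecting batches larger than one keeps the sketch simple; a streamer that needs to handle several sequences at once would have to track one buffer per sequence.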

	### TextStreamer

	Simple text streamer that prints the token(s) to stdout as soon as entire words are formed.

	#### `TextStreamer.constructor(tokenizer, options)`

	Parameters

	- `tokenizer` ([`PreTrainedTokenizer`](../tokenizers#module_tokenizers.PreTrainedTokenizer))
	- `options` (`Object`)
	  - `skip_prompt` (`boolean`) _optional_ — defaults to `false` — Whether to skip the prompt tokens
	  - `skip_special_tokens` (`boolean`) _optional_ — defaults to `true` — Whether to skip special tokens when decoding
	  - `callback_function` (`function(string): void`) _optional_ — defaults to `null` — Function to call when a piece of text is ready to display
	  - `token_callback_function` (`function(bigint[]): void`) _optional_ — defaults to `null` — Function to call when a new token is generated
	  - `decode_kwargs` (`Object`) _optional_ — defaults to `{}` — Additional keyword arguments to pass to the tokenizer's decode method

	#### `TextStreamer.put(value)`

	Receives tokens, decodes them, and prints them to stdout as soon as they form entire words.

	Parameters

	- `value` (`bigint[][]`)

	#### `TextStreamer.end()`

	Flushes any remaining cache and prints a newline to stdout.

	#### `TextStreamer.on_finalized_text(text, stream_end)`

	Prints the new text to stdout. If the stream is ending, also prints a newline.

	Parameters

	- `text` (`string`)
	- `stream_end` (`boolean`)
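
The "emit text once entire words are formed" behaviour can be sketched roughly as follows. This is a simplified illustration, not the library's implementation, which may handle additional cases. `WordBufferingStreamer`, `stubTokenizer`, and the tiny vocabulary are hypothetical names invented for the example; only the method names `put`, `end`, and `on_finalized_text` and the `callback_function` option come from the reference above.

```javascript
// Hypothetical stand-in tokenizer: maps token ids to text pieces.
const vocab = new Map([
  [1n, "Hel"], [2n, "lo"], [3n, " wor"], [4n, "ld"], [5n, "!"],
]);
const stubTokenizer = {
  decode(ids) {
    return ids.map((id) => vocab.get(id) ?? "").join("");
  },
};

// Simplified sketch: buffers tokens and only releases text up to the last space.
class WordBufferingStreamer {
  constructor(tokenizer, { callback_function = null } = {}) {
    this.tokenizer = tokenizer;
    this.callback_function = callback_function ?? ((text) => console.log(text));
    this.token_cache = []; // token ids not yet fully emitted
    this.print_len = 0;    // number of decoded characters already emitted
  }

  put(value) {
    if (value.length > 1) {
      throw new Error("WordBufferingStreamer only supports batch size 1");
    }
    this.token_cache.push(...value[0]);
    const text = this.tokenizer.decode(this.token_cache);
    const cut = text.lastIndexOf(" ") + 1; // end of the last complete word
    if (cut > this.print_len) {
      const printable = text.slice(this.print_len, cut);
      this.print_len = cut;
      this.on_finalized_text(printable, false);
    }
  }

  end() {
    // Flush whatever is still buffered, then reset the state.
    const text = this.tokenizer.decode(this.token_cache);
    const rest = text.slice(this.print_len);
    this.token_cache = [];
    this.print_len = 0;
    this.on_finalized_text(rest, true);
  }

  on_finalized_text(text, stream_end) {
    if (text.length > 0) this.callback_function(text);
  }
}

// Usage: collect emitted pieces instead of printing them.
const pieces = [];
const wordStreamer = new WordBufferingStreamer(stubTokenizer, {
  callback_function: (text) => pieces.push(text),
});
for (const id of [1n, 2n, 3n, 4n, 5n]) wordStreamer.put([[id]]);
wordStreamer.end();
console.log(pieces); // [ 'Hello ', 'world!' ]
```

Re-decoding the whole cache on every call keeps the sketch short and avoids splitting a word whose pieces span several tokens, at the cost of repeated work on long outputs.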

	### WhisperTextStreamer

	Utility class to handle streaming of tokens generated by Whisper speech-to-text models.
	Callback functions are invoked when each of the following events occurs:

	- A new chunk starts (`on_chunk_start`)
	- A new token is generated (`callback_function`)
	- A chunk ends (`on_chunk_end`)
	- The stream is finalized (`on_finalize`)

	#### `WhisperTextStreamer.constructor(tokenizer, options)`

	Parameters

	- `tokenizer` (`WhisperTokenizer`)
	- `options` (`Object`)
	  - `skip_prompt` (`boolean`) _optional_ — defaults to `false` — Whether to skip the prompt tokens
	  - `callback_function` (`function(string): void`) _optional_ — defaults to `null` — Function to call when a piece of text is ready to display
	  - `token_callback_function` (`function(bigint[]): void`) _optional_ — defaults to `null` — Function to call when a new token is generated
	  - `on_chunk_start` (`function(number): void`) _optional_ — defaults to `null` — Function to call when a new chunk starts
	  - `on_chunk_end` (`function(number): void`) _optional_ — defaults to `null` — Function to call when a chunk ends
	  - `on_finalize` (`function(): void`) _optional_ — defaults to `null` — Function to call when the stream is finalized
	  - `time_precision` (`number`) _optional_ — defaults to `0.02` — Precision of the timestamps
	  - `skip_special_tokens` (`boolean`) _optional_ — defaults to `true` — Whether to skip special tokens when decoding
	  - `decode_kwargs` (`Object`) _optional_ — defaults to `{}` — Additional keyword arguments to pass to the tokenizer's decode method
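
One way to use the callbacks listed above is to assemble a list of transcript chunks. The sketch below builds an options object and then drives it by hand. `createChunkCollector` is a hypothetical helper, the hand-written call sequence stands in for what `WhisperTextStreamer` would do during `generate()`, and treating the `number` argument of `on_chunk_start` and `on_chunk_end` as a time in seconds is an assumption that the reference above does not state.

```javascript
// Hypothetical helper: returns an options object whose callbacks collect
// one { start, end, text } entry per chunk.
function createChunkCollector() {
  const chunks = [];
  let current = null;    // chunk currently being filled, if any
  let finalized = false; // set when on_finalize fires

  const options = {
    // Assumption: `time` is the chunk boundary in seconds.
    on_chunk_start: (time) => {
      current = { start: time, end: null, text: "" };
    },
    callback_function: (text) => {
      if (current) current.text += text;
    },
    on_chunk_end: (time) => {
      if (current) {
        current.end = time;
        chunks.push(current);
        current = null;
      }
    },
    on_finalize: () => {
      finalized = true;
    },
  };

  return { options, chunks, isFinalized: () => finalized };
}

// Simulated event sequence (an assumption; normally the streamer makes these calls).
const transcript = createChunkCollector();
transcript.options.on_chunk_start(0.0);
transcript.options.callback_function(" Hello");
transcript.options.callback_function(" world.");
transcript.options.on_chunk_end(2.5);
transcript.options.on_finalize();

console.log(transcript.chunks);
console.log(transcript.isFinalized());
```

In real use, `transcript.options` would be passed as the second constructor argument, alongside a `WhisperTokenizer` as the first.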

	#### `WhisperTextStreamer.put(value)`

	Parameters

	- `value` (`bigint[][]`)
