
generation/streamers

Streamers for surfacing generated tokens as they are produced.

Pass a TextStreamer (or WhisperTextStreamer for audio transcription) via the streamer argument of generate() to receive decoded text as tokens are emitted — useful for chat UIs and incremental transcription.

Classes

BaseStreamer

Abstract base class for output streamers.

`BaseStreamer.put(value)`

Called by generate() to push new tokens to the streamer.

Parameters

value (bigint[][])

`BaseStreamer.end()`

Called by generate() to signal the end of generation.
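As a rough illustration of this contract, the sketch below defines a streamer-shaped class that records token IDs. `TokenCollector` is a hypothetical name. In an application it would extend the library's `BaseStreamer`; the import is left out here so the snippet runs on its own. It assumes, as the parameter type suggests, that `value` holds one array of `bigint` token IDs per sequence in the batch.

```javascript
// Hypothetical stand-in: in an application this would be
// `class TokenCollector extends BaseStreamer`.
class TokenCollector {
  constructor() {
    this.tokens = []; // token IDs received so far (single sequence)
    this.finished = false;
  }

  // Receives a bigint[][]: one array of token IDs per batch item.
  put(value) {
    if (value.length > 1) {
      throw new Error("TokenCollector only supports batch size 1");
    }
    this.tokens.push(...value[0]);
  }

  // Invoked once when generation is complete.
  end() {
    this.finished = true;
  }
}

// Simulate the calls generate() would make: push tokens, then signal the end.
const tokenCollector = new TokenCollector();
tokenCollector.put([[101n, 2023n]]);
tokenCollector.put([[2003n]]);
tokenCollector.end();
console.log(tokenCollector.tokens, tokenCollector.finished);
```

The token IDs above are arbitrary values chosen for the simulation, not output from a real model.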

TextStreamer

Simple text streamer that prints the token(s) to stdout as soon as entire words are formed.

`TextStreamer.constructor(tokenizer, options)`

Parameters

tokenizer (PreTrainedTokenizer)
options (Object)
- skip_prompt (boolean) optional — defaults to false — Whether to skip the prompt tokens
- skip_special_tokens (boolean) optional — defaults to true — Whether to skip special tokens when decoding
- callback_function (function(string): void) optional — defaults to null — Function to call when a piece of text is ready to display
- token_callback_function (function(bigint[]): void) optional — defaults to null — Function to call when a new token is generated
- decode_kwargs (Object) optional — defaults to {} — Additional keyword arguments to pass to the tokenizer's decode method
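A minimal sketch of how these options might be assembled for a chat UI. `makeTextCollector` is a hypothetical helper, not part of the library; it only builds the options object. The commented lines show where `TextStreamer` and `generate()` would come in, with `tokenizer`, `model`, and `inputs` as placeholders. The callbacks are invoked by hand to stand in for a real generation run.

```javascript
// Hypothetical helper: builds TextStreamer options that accumulate output.
function makeTextCollector() {
  const state = { text: "", tokenCount: 0 };
  const options = {
    skip_prompt: true, // do not echo the prompt back
    skip_special_tokens: true,
    // Receives each piece of decoded text once it is ready to display.
    callback_function: (piece) => {
      state.text += piece;
    },
    // Receives the raw token IDs (bigint[]) whenever new tokens arrive.
    token_callback_function: (tokens) => {
      state.tokenCount += tokens.length;
    },
  };
  return { options, state };
}

const { options: streamerOptions, state: streamerState } = makeTextCollector();

// With the library loaded, the options would be used roughly like this:
//   const streamer = new TextStreamer(tokenizer, streamerOptions);
//   await model.generate({ ...inputs, streamer });

// Simulated callbacks with arbitrary token IDs:
streamerOptions.token_callback_function([9906n]);
streamerOptions.callback_function("Hello");
streamerOptions.token_callback_function([1917n]);
streamerOptions.callback_function(" world");
console.log(streamerState.text, streamerState.tokenCount);
```

Keeping the accumulated text in a plain object makes it easy to hand the same state to whatever renders the UI.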

`TextStreamer.put(value)`

Receives tokens, decodes them, and prints them to stdout as soon as they form entire words.

Parameters

value (bigint[][])

`TextStreamer.end()`

Flushes any remaining cache and prints a newline to stdout.
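The word-buffering behaviour of put() and end() can be illustrated with a simplified stand-in. This is a sketch of the general idea, not the library's implementation: `WordBufferSketch` and the toy `toyDecode` function are hypothetical, and the real class may treat some cases (for example newlines) differently.

```javascript
class WordBufferSketch {
  constructor(decode, onText) {
    this.decode = decode; // (bigint[]) => string
    this.onText = onText; // (string) => void
    this.cache = []; // all tokens received since the last flush
    this.emittedLength = 0; // characters of the decoded cache already emitted
  }

  put(value) {
    this.cache.push(...value[0]);
    const text = this.decode(this.cache);
    // Emit only up to the last space: the trailing word may still grow.
    const cut = text.lastIndexOf(" ") + 1;
    if (cut > this.emittedLength) {
      this.onText(text.slice(this.emittedLength, cut));
      this.emittedLength = cut;
    }
  }

  end() {
    // Flush whatever is still held back, then reset.
    const text = this.decode(this.cache);
    if (text.length > this.emittedLength) {
      this.onText(text.slice(this.emittedLength));
    }
    this.cache = [];
    this.emittedLength = 0;
  }
}

// Toy vocabulary standing in for a real tokenizer.
const toyVocab = new Map([[1n, "Stream"], [2n, "ing"], [3n, " works"], [4n, "!"]]);
const toyDecode = (ids) => ids.map((id) => toyVocab.get(id) ?? "").join("");

const emittedPieces = [];
const wordBuffer = new WordBufferSketch(toyDecode, (t) => emittedPieces.push(t));
for (const id of [1n, 2n, 3n, 4n]) wordBuffer.put([[id]]);
wordBuffer.end();
console.log(emittedPieces); // [ 'Streaming ', 'works!' ]
```

Nothing is emitted for the first two tokens because "Streaming" could still be extended; the space that arrives with the third token is what releases it.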

`TextStreamer.on_finalized_text(text, stream_end)`

Prints the new text to stdout. If the stream is ending, also prints a newline.

Parameters

text (string)
stream_end (boolean)

WhisperTextStreamer

Utility class to handle streaming of tokens generated by Whisper speech-to-text models. Callback functions are invoked when each of the following events occurs:

- A new chunk starts (on_chunk_start)
- A new token is generated (callback_function)
- A chunk ends (on_chunk_end)
- The stream is finalized (on_finalize)

`WhisperTextStreamer.constructor(tokenizer, options)`

Parameters

tokenizer (WhisperTokenizer)
options (Object)
- skip_prompt (boolean) optional — defaults to false — Whether to skip the prompt tokens
- callback_function (function(string): void) optional — defaults to null — Function to call when a piece of text is ready to display
- token_callback_function (function(bigint[]): void) optional — defaults to null — Function to call when a new token is generated
- on_chunk_start (function(number): void) optional — defaults to null — Function to call when a new chunk starts
- on_chunk_end (function(number): void) optional — defaults to null — Function to call when a chunk ends
- on_finalize (function(): void) optional — defaults to null — Function to call when the stream is finalized
- time_precision (number) optional — defaults to 0.02 — Precision of the timestamps
- skip_special_tokens (boolean) optional — defaults to true — Whether to skip special tokens when decoding
- decode_kwargs (Object) optional — defaults to {} — Additional keyword arguments to pass to the tokenizer's decode method
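One way to combine these callbacks is to group streamed text into timestamped segments. The sketch below is an assumption about typical usage: `makeSegmentCollector` is a hypothetical helper, and it assumes the `number` passed to `on_chunk_start` and `on_chunk_end` is a time in seconds. It also assumes text may arrive slightly after a chunk has ended and attaches such text to the most recent segment. The callbacks are invoked manually in place of a real transcription run.

```javascript
// Hypothetical helper: builds WhisperTextStreamer options that collect segments.
function makeSegmentCollector() {
  const segments = []; // completed { start, end, text } records
  let current = null; // segment currently being filled, if any
  let finalized = false;
  const whisperOptions = {
    on_chunk_start: (time) => {
      current = { start: time, end: null, text: "" };
    },
    callback_function: (piece) => {
      if (current) {
        current.text += piece;
      } else if (segments.length > 0) {
        // Assumption: late text belongs to the segment that just closed.
        segments[segments.length - 1].text += piece;
      }
    },
    on_chunk_end: (time) => {
      if (!current) return;
      current.end = time;
      segments.push(current);
      current = null;
    },
    on_finalize: () => {
      finalized = true;
    },
  };
  return { whisperOptions, segments, isFinalized: () => finalized };
}

const segmentCollector = makeSegmentCollector();
// With the library loaded:
//   new WhisperTextStreamer(tokenizer, segmentCollector.whisperOptions)
const handlers = segmentCollector.whisperOptions;
handlers.on_chunk_start(0);
handlers.callback_function(" Hello");
handlers.on_chunk_end(2.5);
handlers.callback_function(" there."); // arrives after the chunk closed
handlers.on_finalize();
console.log(segmentCollector.segments, segmentCollector.isFinalized());
```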

`WhisperTextStreamer.put(value)`

Parameters

value (bigint[][])
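To show how time_precision relates token IDs to chunk times, here is a small sketch of the arithmetic. It relies on the fact that Whisper's timestamp tokens are consecutive and each step represents a fixed time increment. `timestampTokenToSeconds` is a hypothetical helper, and the ID of the first timestamp token is tokenizer-specific, so the value used below is purely illustrative.

```javascript
// Hypothetical helper: converts a token ID to seconds, or null for text tokens.
function timestampTokenToSeconds(tokenId, timestampBegin, timePrecision = 0.02) {
  const offset = Number(tokenId) - timestampBegin;
  return offset >= 0 ? offset * timePrecision : null;
}

// Assumed ID of the first timestamp token; look this up from the tokenizer in practice.
const ILLUSTRATIVE_TIMESTAMP_BEGIN = 50364;

// 125 steps past the first timestamp token, at 0.02 s per step.
const chunkSeconds = timestampTokenToSeconds(50489n, ILLUSTRATIVE_TIMESTAMP_BEGIN);
// An ID below the timestamp range is treated as an ordinary text token.
const textTokenResult = timestampTokenToSeconds(1234n, ILLUSTRATIVE_TIMESTAMP_BEGIN);
console.log(chunkSeconds, textTokenResult);
```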
