# generation/streamers
Streamers for surfacing generated tokens as they are produced.
Pass a `TextStreamer` (or `WhisperTextStreamer` for audio transcription) via
the `streamer` argument of `generate()` to receive decoded text as tokens
are emitted — useful for chat UIs and incremental transcription.
## Classes
### BaseStreamer
Abstract base class for output streamers.
#### `BaseStreamer.put(value)`
Called by `generate()` to push newly generated tokens to the streamer.
**Parameters**
- `value` (`bigint[][]`)
#### `BaseStreamer.end()`
Called by `generate()` to signal the end of generation.
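The `put`/`end` contract above can be illustrated with a minimal sketch. `TokenCollector` is a hypothetical, duck-typed class written for this example and is not part of the library; a real custom streamer would more likely extend `BaseStreamer`. The sketch also assumes that each inner array of `value` holds the new token IDs for one sequence in the batch, which the `bigint[][]` type suggests but this page does not state.

```javascript
// Hypothetical streamer for illustration only: it implements just the two
// methods documented above, put() and end(), and does not use the library.
class TokenCollector {
  constructor() {
    this.tokens = [];      // flat list of every token ID received so far
    this.finished = false; // set to true once end() has been called
  }

  // value is bigint[][]; assumption: one inner array per sequence in the batch.
  put(value) {
    for (const sequence of value) {
      this.tokens.push(...sequence);
    }
  }

  // Marks the stream as complete.
  end() {
    this.finished = true;
  }
}

// Simulate the calls that generate() would make during decoding.
const tokenCollector = new TokenCollector();
tokenCollector.put([[101n, 2023n]]);
tokenCollector.put([[2003n]]);
tokenCollector.end();

console.log(tokenCollector.tokens.length); // 3
console.log(tokenCollector.finished);      // true
```

Collecting raw IDs like this is mainly useful for logging or testing; turning them into text requires a tokenizer, which is what `TextStreamer` adds.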
### TextStreamer
Simple text streamer that prints the token(s) to stdout as soon as entire words are formed.
#### `TextStreamer.constructor(tokenizer, options)`
**Parameters**
- `tokenizer` ([`PreTrainedTokenizer`](../tokenizers#module_tokenizers.PreTrainedTokenizer))
- `options` (`Object`)
- `skip_prompt` (`boolean`) _optional_ — defaults to `false` — Whether to skip the prompt tokens
- `skip_special_tokens` (`boolean`) _optional_ — defaults to `true` — Whether to skip special tokens when decoding
- `callback_function` (`function(string): void`) _optional_ — defaults to `null` — Function to call when a piece of text is ready to display
- `token_callback_function` (`function(bigint[]): void`) _optional_ — defaults to `null` — Function to call when a new token is generated
- `decode_kwargs` (`Object`) _optional_ — defaults to `{}` — Additional keyword arguments to pass to the tokenizer's decode method
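As a rough sketch of how `callback_function` might be used, the helper below builds an options object whose callback accumulates the streamed text instead of printing it. `createTextCollector` and `getText` are hypothetical names introduced for this example; only the option names come from the list above. The callback is invoked by hand here so that the snippet runs without the library.

```javascript
// Hypothetical helper (not a library function): returns an options object
// for TextStreamer plus an accessor for the text collected so far.
function createTextCollector() {
  const pieces = []; // text fragments in the order they were delivered
  return {
    options: {
      skip_prompt: true,         // do not stream the prompt back
      skip_special_tokens: true, // drop special tokens when decoding
      callback_function: (text) => {
        pieces.push(text);
      },
    },
    getText: () => pieces.join(""),
  };
}

// Simulate the streamer delivering two pieces of finalized text.
const textCollector = createTextCollector();
textCollector.options.callback_function("Hello");
textCollector.options.callback_function(" world");

console.log(textCollector.getText()); // "Hello world"
```

In real use, `textCollector.options` would be passed as the second argument of the `TextStreamer` constructor, and the resulting streamer as the `streamer` argument of `generate()`.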
#### `TextStreamer.put(value)`
Receives tokens, decodes them, and prints them to stdout as soon as they form entire words.
**Parameters**
- `value` (`bigint[][]`)
#### `TextStreamer.end()`
Flushes any remaining cache and prints a newline to stdout.
#### `TextStreamer.on_finalized_text(text, stream_end)`
Prints the new text to stdout. If the stream is ending, also prints a newline.
**Parameters**
- `text` (`string`)
- `stream_end` (`boolean`)
### WhisperTextStreamer
Utility class for streaming tokens generated by Whisper speech-to-text models.
Callback functions are invoked when each of the following events occurs:
- A new chunk starts (`on_chunk_start`)
- A new token is generated (`callback_function`)
- A chunk ends (`on_chunk_end`)
- The stream is finalized (`on_finalize`)
#### `WhisperTextStreamer.constructor(tokenizer, options)`
**Parameters**
- `tokenizer` (`WhisperTokenizer`)
- `options` (`Object`)
- `skip_prompt` (`boolean`) _optional_ — defaults to `false` — Whether to skip the prompt tokens
- `callback_function` (`function(string): void`) _optional_ — defaults to `null` — Function to call when a piece of text is ready to display
- `token_callback_function` (`function(bigint[]): void`) _optional_ — defaults to `null` — Function to call when a new token is generated
- `on_chunk_start` (`function(number): void`) _optional_ — defaults to `null` — Function to call when a new chunk starts
- `on_chunk_end` (`function(number): void`) _optional_ — defaults to `null` — Function to call when a chunk ends
- `on_finalize` (`function(): void`) _optional_ — defaults to `null` — Function to call when the stream is finalized
- `time_precision` (`number`) _optional_ — defaults to `0.02` — Precision of the timestamps
- `skip_special_tokens` (`boolean`) _optional_ — defaults to `true` — Whether to skip special tokens when decoding
- `decode_kwargs` (`Object`) _optional_ — defaults to `{}` — Additional keyword arguments to pass to the tokenizer's decode method
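The chunk callbacks above could be combined to assemble timestamped segments, as in the sketch below. `createChunkCollector` is a hypothetical helper written for this example. It assumes that the number passed to `on_chunk_start` and `on_chunk_end` is the chunk's timestamp in seconds; this page only gives the type as `number`. The callbacks are invoked manually so that the snippet runs without a model.

```javascript
// Hypothetical helper (not a library function): builds WhisperTextStreamer
// options that group streamed text into { start, end, text } chunks.
function createChunkCollector() {
  const chunks = [];  // completed chunks
  let current = null; // chunk currently being filled, if any
  let finalized = false;

  return {
    options: {
      // Assumption: `time` is a timestamp in seconds.
      on_chunk_start: (time) => {
        current = { start: time, end: null, text: "" };
      },
      callback_function: (text) => {
        // Text may arrive before any chunk start; open an untimed chunk then.
        if (current === null) {
          current = { start: null, end: null, text: "" };
        }
        current.text += text;
      },
      on_chunk_end: (time) => {
        if (current !== null) {
          current.end = time;
          chunks.push(current);
          current = null;
        }
      },
      on_finalize: () => {
        // Keep any chunk that never received an explicit end.
        if (current !== null) {
          chunks.push(current);
          current = null;
        }
        finalized = true;
      },
    },
    getChunks: () => chunks,
    isFinalized: () => finalized,
  };
}

// Simulate one chunk being streamed and the stream being finalized.
const chunkCollector = createChunkCollector();
chunkCollector.options.on_chunk_start(0);
chunkCollector.options.callback_function(" Hello");
chunkCollector.options.callback_function(" there.");
chunkCollector.options.on_chunk_end(1.5);
chunkCollector.options.on_finalize();

console.log(chunkCollector.getChunks().length); // 1
```

Keeping the in-progress chunk separate from the completed list means a UI can render finished segments immediately while the current one is still growing.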
#### `WhisperTextStreamer.put(value)`
**Parameters**
- `value` (`bigint[][]`)
