# generation/streamers

Streamers for surfacing generated tokens as they are produced.
Pass a `TextStreamer` (or a `WhisperTextStreamer` for audio transcription) via
the `streamer` argument of `generate()` to receive decoded text as tokens
are emitted. This is useful for chat UIs and incremental transcription.

## Classes

### BaseStreamer

Abstract base class for output streamers.

#### `BaseStreamer.put(value)`

Called by `generate()` to push new tokens.

**Parameters**

- `value` (`bigint[][]`)

#### `BaseStreamer.end()`

Called by `generate()` to signal the end of generation.
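
The `put()`/`end()` contract can be illustrated with a small stand-in. This is a minimal sketch, not the library's code: `CollectingStreamer` and `fakeGenerate` are hypothetical names, and the assumption that the outer array of `value` is the batch dimension is inferred from the `bigint[][]` type rather than stated above.

```javascript
// Hypothetical streamer that follows the put()/end() contract described above.
// With the real library, a class like this would extend BaseStreamer instead.
class CollectingStreamer {
  constructor() {
    this.tokens = [];      // every token id received so far
    this.finished = false; // set once end() has been called
  }

  // value: bigint[][]; assumed here to hold one inner array per sequence in the batch.
  put(value) {
    if (value.length !== 1) {
      throw new Error("This sketch only handles a batch size of 1");
    }
    this.tokens.push(...value[0]);
  }

  end() {
    this.finished = true;
  }
}

// Stand-in for generate(): pushes each step's tokens, then signals completion.
function fakeGenerate(steps, streamer) {
  for (const step of steps) {
    streamer.put([step]);
  }
  streamer.end();
}

const collector = new CollectingStreamer();
fakeGenerate([[101n, 7592n], [2088n], [102n]], collector);
console.log(collector.tokens, collector.finished);
```

The token ids are arbitrary placeholders; only the call order (`put()` repeatedly, then `end()` once) is the point of the example.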

### TextStreamer

Simple text streamer that prints tokens to stdout as soon as entire words are formed.

#### `TextStreamer.constructor(tokenizer, options)`

**Parameters**

- `tokenizer` ([`PreTrainedTokenizer`](../tokenizers#module_tokenizers.PreTrainedTokenizer))
- `options` (`Object`)
  - `skip_prompt` (`boolean`) _optional_, defaults to `false`: Whether to skip the prompt tokens.
  - `skip_special_tokens` (`boolean`) _optional_, defaults to `true`: Whether to skip special tokens when decoding.
  - `callback_function` (`function(string): void`) _optional_, defaults to `null`: Function to call when a piece of text is ready to display.
  - `token_callback_function` (`function(bigint[]): void`) _optional_, defaults to `null`: Function to call when a new token is generated.
  - `decode_kwargs` (`Object`) _optional_, defaults to `{}`: Additional keyword arguments to pass to the tokenizer's decode method.
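
As an illustration of the two callbacks, the sketch below builds an options object and then calls the callbacks by hand. The option names are taken from the parameter list; the token ids and text pieces are made up, and no library code runs here.

```javascript
// Collect streamed text and token ids instead of printing to stdout.
const chunks = [];
const tokenIds = [];

// Option names come from the parameter list above; the values are illustrative.
const options = {
  skip_prompt: true,
  skip_special_tokens: true,
  callback_function: (text) => chunks.push(text),          // function(string): void
  token_callback_function: (ids) => tokenIds.push(...ids), // function(bigint[]): void
};

// With the library loaded, this object would be passed as
// `new TextStreamer(tokenizer, options)`. Here the callbacks are invoked by
// hand, with made-up values, to show the shapes they receive.
options.token_callback_function([9906n]);
options.callback_function("Hello ");
options.token_callback_function([1917n]);
options.callback_function("world");

console.log(chunks.join("")); // prints "Hello world"
```

Routing text through `callback_function` rather than stdout is what makes the streamer usable in a UI, where each piece is typically appended to an element as it arrives.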

#### `TextStreamer.put(value)`

Receives tokens, decodes them, and prints them to stdout as soon as they form entire words.

**Parameters**

- `value` (`bigint[][]`)

#### `TextStreamer.end()`

Flushes any remaining cache and prints a newline to stdout.

#### `TextStreamer.on_finalized_text(text, stream_end)`

Prints the new text to stdout. If the stream is ending, also prints a newline.

**Parameters**

- `text` (`string`)
- `stream_end` (`boolean`)
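
One plausible way to emit text only once whole words have formed is to re-decode a token cache on each `put()` and hold back everything after the last space. The sketch below shows that idea under stated assumptions: `WordBufferingStreamer`, `toyDecode`, and the vocabulary are hypothetical, and this is not claimed to be how `TextStreamer` is implemented.

```javascript
// Hypothetical word-buffering streamer; `decode` stands in for a tokenizer's
// decode method and maps an array of token ids to a string.
class WordBufferingStreamer {
  constructor(decode, onFinalizedText) {
    this.decode = decode;
    this.onFinalizedText = onFinalizedText; // (text, streamEnd) => void
    this.tokenCache = []; // tokens whose text has not been fully emitted
    this.printLen = 0;    // characters of the decoded cache already emitted
  }

  put(value) {
    this.tokenCache.push(...value[0]);
    const text = this.decode(this.tokenCache);
    let printable;
    if (text.endsWith("\n")) {
      // A newline completes the line: emit everything and reset the cache.
      printable = text.slice(this.printLen);
      this.tokenCache = [];
      this.printLen = 0;
    } else {
      // Otherwise emit only up to the last space, so partial words stay buffered.
      const end = text.lastIndexOf(" ") + 1;
      printable = text.slice(this.printLen, Math.max(end, this.printLen));
      this.printLen += printable.length;
    }
    if (printable.length > 0) this.onFinalizedText(printable, false);
  }

  end() {
    // Flush whatever is still buffered and mark the end of the stream.
    const text = this.tokenCache.length > 0 ? this.decode(this.tokenCache) : "";
    const rest = text.slice(this.printLen);
    this.tokenCache = [];
    this.printLen = 0;
    this.onFinalizedText(rest, true);
  }
}

// Toy vocabulary: each id decodes to a word fragment.
const vocab = new Map([[1n, "Stream"], [2n, "ing"], [3n, " text"], [4n, " works"]]);
const toyDecode = (ids) => ids.map((id) => vocab.get(id)).join("");

const emitted = [];
const wordStreamer = new WordBufferingStreamer(toyDecode, (text, streamEnd) => {
  emitted.push([text, streamEnd]);
});
for (const id of [1n, 2n, 3n, 4n]) wordStreamer.put([[id]]);
wordStreamer.end();
console.log(emitted);
```

In this run the fragments `Stream` and `ing` produce no output on their own; `Streaming ` is emitted only once the following token introduces a space, and the final word is flushed by `end()` with `streamEnd` set to `true`.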

### WhisperTextStreamer

Utility class to handle streaming of tokens generated by Whisper speech-to-text models.
Callback functions are invoked when each of the following events occurs:

- A new chunk starts (`on_chunk_start`)
- A new token is generated (`callback_function`)
- A chunk ends (`on_chunk_end`)
- The stream is finalized (`on_finalize`)

#### `WhisperTextStreamer.constructor(tokenizer, options)`

**Parameters**

- `tokenizer` (`WhisperTokenizer`)
- `options` (`Object`)
  - `skip_prompt` (`boolean`) _optional_, defaults to `false`: Whether to skip the prompt tokens.
  - `callback_function` (`function(string): void`) _optional_, defaults to `null`: Function to call when a piece of text is ready to display.
  - `token_callback_function` (`function(bigint[]): void`) _optional_, defaults to `null`: Function to call when a new token is generated.
  - `on_chunk_start` (`function(number): void`) _optional_, defaults to `null`: Function to call when a new chunk starts.
  - `on_chunk_end` (`function(number): void`) _optional_, defaults to `null`: Function to call when a chunk ends.
  - `on_finalize` (`function(): void`) _optional_, defaults to `null`: Function to call when the stream is finalized.
  - `time_precision` (`number`) _optional_, defaults to `0.02`: Precision of the timestamps.
  - `skip_special_tokens` (`boolean`) _optional_, defaults to `true`: Whether to skip special tokens when decoding.
  - `decode_kwargs` (`Object`) _optional_, defaults to `{}`: Additional keyword arguments to pass to the tokenizer's decode method.
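
The chunk callbacks can be combined to assemble timestamped segments. The following is a rough sketch: the option names come from the parameter list, but the handler bodies, the event sequence, and the assumption that the `number` arguments are times in seconds are illustrative only.

```javascript
// Build timestamped segments from the four callbacks listed above.
const segments = [];
let current = null;
let finalized = false;

// Option names come from the parameter list; the handler bodies are illustrative.
const whisperOptions = {
  on_chunk_start: (start) => { current = { start, end: null, text: "" }; },
  callback_function: (text) => { if (current) current.text += text; },
  on_chunk_end: (end) => {
    if (current) {
      current.end = end;
      segments.push(current);
      current = null;
    }
  },
  on_finalize: () => { finalized = true; },
};

// With the library loaded, this object would be passed as
// `new WhisperTextStreamer(tokenizer, whisperOptions)`. The calls below replay
// a made-up event sequence by hand, assuming the numbers are times in seconds.
whisperOptions.on_chunk_start(0.0);
whisperOptions.callback_function(" Hello");
whisperOptions.callback_function(" there.");
whisperOptions.on_chunk_end(1.5);
whisperOptions.on_chunk_start(1.5);
whisperOptions.callback_function(" Goodbye.");
whisperOptions.on_chunk_end(2.75);
whisperOptions.on_finalize();

console.log(segments, finalized);
```

Keeping the in-progress segment in `current` lets a UI render partial text for the active chunk while completed segments stay fixed.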

#### `WhisperTextStreamer.put(value)`

**Parameters**

- `value` (`bigint[][]`)