generation/streamers

Streamers for surfacing generated tokens as they are produced.

Pass a TextStreamer (or WhisperTextStreamer for audio transcription) via the streamer argument of generate() to receive decoded text as tokens are emitted. This is useful for chat UIs and incremental transcription.

Classes

BaseStreamer

Abstract base class for output streamers.

BaseStreamer.put(value)

Called by generate() to push newly generated tokens to the streamer.

Parameters

  • value (bigint[][])

BaseStreamer.end()

Called by generate() to signal the end of generation.
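
The put()/end() contract can be illustrated with a minimal sketch. The class below is a stand-in that does not import the library, so CollectingStreamer and fakeGenerate are hypothetical names used only for illustration; with the real library the class would presumably extend BaseStreamer, and generate() would make the calls.

```javascript
// Stand-in for a BaseStreamer subclass (hypothetical): it only needs put() and end().
class CollectingStreamer {
  constructor() {
    this.tokens = [];      // flattened token ids received so far
    this.finished = false; // set once end() has been called
  }

  // value is a batch of token-id sequences (bigint[][]).
  put(value) {
    for (const sequence of value) {
      this.tokens.push(...sequence);
    }
  }

  // Called once, after the last put().
  end() {
    this.finished = true;
  }
}

// Hypothetical driver that mimics how generate() might feed a streamer.
function fakeGenerate(steps, streamer) {
  for (const step of steps) streamer.put(step);
  streamer.end();
}

const collecting = new CollectingStreamer();
fakeGenerate([[[1n, 2n]], [[3n]], [[4n]]], collecting);
console.log(collecting.tokens, collecting.finished);
```

Collecting raw token ids like this can be handy for logging or testing. Decoding them to text is what TextStreamer adds on top.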

TextStreamer

Simple text streamer that prints decoded text to stdout as soon as complete words are formed.

TextStreamer.constructor(tokenizer, options)

Parameters

  • tokenizer (PreTrainedTokenizer)
  • options (Object)
    • skip_prompt (boolean) optional — defaults to false — Whether to skip the prompt tokens
    • skip_special_tokens (boolean) optional — defaults to true — Whether to skip special tokens when decoding
    • callback_function (function(string): void) optional — defaults to null — Function to call when a piece of text is ready to display
    • token_callback_function (function(bigint[]): void) optional — defaults to null — Function to call when a new token is generated
    • decode_kwargs (Object) optional — defaults to {} — Additional keyword arguments to pass to the tokenizer's decode method

TextStreamer.put(value)

Receives tokens, decodes them, and prints them to stdout as soon as they form entire words.

Parameters

  • value (bigint[][])

TextStreamer.end()

Flushes any remaining cache and prints a newline to stdout.

TextStreamer.on_finalized_text(text, stream_end)

Prints the new text to stdout. If the stream is ending, also prints a newline.

Parameters

  • text (string)
  • stream_end (boolean)
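
The word-by-word behaviour described above can be sketched without the library. What follows is a simplified illustration, not the library's implementation: MiniTextStreamer, toyTokenizer, and VOCAB are hypothetical, and the sketch assumes that text is held back until a space shows that the current word is complete, and that the first put() carries the prompt.

```javascript
// Toy tokenizer (hypothetical): each token id indexes into a fixed vocabulary.
const VOCAB = ["Hel", "lo", " wor", "ld"];
const toyTokenizer = {
  decode: (ids) => ids.map((id) => VOCAB[Number(id)]).join(""),
};

// Simplified stand-in for TextStreamer; not the library's implementation.
class MiniTextStreamer {
  constructor(tokenizer, { skip_prompt = false, callback_function = null } = {}) {
    this.tokenizer = tokenizer;
    this.skip_prompt = skip_prompt;
    this.usesStdout = callback_function === null;
    this.callback_function =
      callback_function ?? ((text) => process.stdout.write(text));
    this.tokenCache = [];     // tokens received since the last flush
    this.printedLength = 0;   // characters of the decoded cache already emitted
    this.nextIsPrompt = true; // assumption: the first put() carries the prompt
  }

  put(value) {
    if (value.length > 1) throw new Error("this sketch supports batch size 1 only");
    if (this.nextIsPrompt) {
      this.nextIsPrompt = false;
      if (this.skip_prompt) return;
    }
    this.tokenCache.push(...value[0]);
    const text = this.tokenizer.decode(this.tokenCache);
    // Emit only up to the last space, so a partial word stays buffered.
    const printable = text.slice(this.printedLength, text.lastIndexOf(" ") + 1);
    this.printedLength += printable.length;
    this.on_finalized_text(printable, false);
  }

  end() {
    // Flush whatever is still buffered, then reset for the next generation.
    const rest = this.tokenizer.decode(this.tokenCache).slice(this.printedLength);
    this.tokenCache = [];
    this.printedLength = 0;
    this.nextIsPrompt = true;
    this.on_finalized_text(rest, true);
  }

  on_finalized_text(text, stream_end) {
    if (text.length > 0) this.callback_function(text);
    if (stream_end && this.usesStdout) process.stdout.write("\n");
  }
}

const pieces = [];
const mini = new MiniTextStreamer(toyTokenizer, {
  skip_prompt: true,
  callback_function: (text) => pieces.push(text),
});
mini.put([[0n, 1n]]); // prompt tokens: skipped because skip_prompt is true
mini.put([[0n]]);     // cache decodes to "Hel": nothing emitted yet
mini.put([[1n]]);     // "Hello": still no space, nothing emitted
mini.put([[2n]]);     // "Hello wor": emits "Hello "
mini.put([[3n]]);     // "Hello world": "world" stays buffered
mini.end();           // flushes "world"
```

With the library itself, the same callback would go into the documented options, as in new TextStreamer(tokenizer, { skip_prompt: true, callback_function }), and the streamer would be handed to generate() through its streamer argument.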

WhisperTextStreamer

Utility class to handle streaming of tokens generated by Whisper speech-to-text models. Callback functions are invoked when each of the following events occurs:

  • A new chunk starts (on_chunk_start)
  • A new token is generated (callback_function)
  • A chunk ends (on_chunk_end)
  • The stream is finalized (on_finalize)

WhisperTextStreamer.constructor(tokenizer, options)

Parameters

  • tokenizer (WhisperTokenizer)
  • options (Object)
    • skip_prompt (boolean) optional — defaults to false — Whether to skip the prompt tokens
    • callback_function (function(string): void) optional — defaults to null — Function to call when a piece of text is ready to display
    • token_callback_function (function(bigint[]): void) optional — defaults to null — Function to call when a new token is generated
    • on_chunk_start (function(number): void) optional — defaults to null — Function to call when a new chunk starts
    • on_chunk_end (function(number): void) optional — defaults to null — Function to call when a chunk ends
    • on_finalize (function(): void) optional — defaults to null — Function to call when the stream is finalized
    • time_precision (number) optional — defaults to 0.02 — Precision of the timestamps
    • skip_special_tokens (boolean) optional — defaults to true — Whether to skip special tokens when decoding
    • decode_kwargs (Object) optional — defaults to {} — Additional keyword arguments to pass to the tokenizer's decode method
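
One way these callbacks might be combined is to assemble timestamped transcript chunks. The sketch below is self-contained and only simulates the event sequence. createChunkCollector is a hypothetical helper, and it assumes the number passed to on_chunk_start and on_chunk_end is a timestamp in seconds, which the parameter list above does not state explicitly.

```javascript
// Collects { start, end, text } records from WhisperTextStreamer-style callbacks.
function createChunkCollector() {
  const chunks = [];
  let current = null;
  let finalized = false;
  return {
    chunks,
    isFinalized: () => finalized,
    // This object has the shape of the documented options argument.
    options: {
      on_chunk_start: (time) => {
        current = { start: time, end: null, text: "" };
      },
      callback_function: (text) => {
        if (current) current.text += text;
      },
      on_chunk_end: (time) => {
        if (!current) return;
        current.end = time;
        chunks.push(current);
        current = null;
      },
      on_finalize: () => {
        finalized = true;
      },
    },
  };
}

// Simulated event sequence; a real streamer would fire these during generate().
const collector = createChunkCollector();
collector.options.on_chunk_start(0.0);
collector.options.callback_function("Hello ");
collector.options.callback_function("world.");
collector.options.on_chunk_end(1.5);
collector.options.on_finalize();
console.log(collector.chunks);
```

In real use, collector.options would be passed as the second argument to the WhisperTextStreamer constructor, possibly merged with other options such as skip_prompt.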

WhisperTextStreamer.put(value)

Parameters

  • value (bigint[][])
