# utils/data-structures
Custom data structures.
These are only used internally, meaning an end-user shouldn't
need to access anything here.
* [utils/data-structures](#module_utils/data-structures)
    * _static_
        * [.PriorityQueue](#module_utils/data-structures.PriorityQueue)
            * [`new PriorityQueue(comparator)`](#new_module_utils/data-structures.PriorityQueue_new)
            * [`.size`](#module_utils/data-structures.PriorityQueue+size)
            * [`.isEmpty()`](#module_utils/data-structures.PriorityQueue+isEmpty) ⇒ `boolean`
            * [`.peek()`](#module_utils/data-structures.PriorityQueue+peek) ⇒ `any`
            * [`.push(...values)`](#module_utils/data-structures.PriorityQueue+push) ⇒ `number`
            * [`.extend(values)`](#module_utils/data-structures.PriorityQueue+extend) ⇒ `number`
            * [`.pop()`](#module_utils/data-structures.PriorityQueue+pop) ⇒ `any`
            * [`.replace(value)`](#module_utils/data-structures.PriorityQueue+replace) ⇒ `*`
            * [`._siftUpFrom(node)`](#module_utils/data-structures.PriorityQueue+_siftUpFrom)
        * [.CharTrie](#module_utils/data-structures.CharTrie)
            * [`.extend(texts)`](#module_utils/data-structures.CharTrie+extend)
            * [`.push(text)`](#module_utils/data-structures.CharTrie+push)
            * [`.commonPrefixSearch(text)`](#module_utils/data-structures.CharTrie+commonPrefixSearch)
        * [.TokenLattice](#module_utils/data-structures.TokenLattice)
            * [`new TokenLattice(sentence, bosTokenId, eosTokenId)`](#new_module_utils/data-structures.TokenLattice_new)
            * [`.insert(pos, length, score, tokenId)`](#module_utils/data-structures.TokenLattice+insert)
            * [`.viterbi()`](#module_utils/data-structures.TokenLattice+viterbi) ⇒ `Array.<TokenLatticeNode>`
            * [`.piece(node)`](#module_utils/data-structures.TokenLattice+piece) ⇒ `string`
            * [`.tokens()`](#module_utils/data-structures.TokenLattice+tokens) ⇒ `Array.<string>`
            * [`.tokenIds()`](#module_utils/data-structures.TokenLattice+tokenIds) ⇒ `Array.<number>`
        * [.DictionarySplitter](#module_utils/data-structures.DictionarySplitter)
            * [`new DictionarySplitter(dictionary)`](#new_module_utils/data-structures.DictionarySplitter_new)
            * [`.split(text)`](#module_utils/data-structures.DictionarySplitter+split) ⇒ `Array.<string>`
        * [.LRUCache](#module_utils/data-structures.LRUCache)
            * [`new LRUCache(capacity)`](#new_module_utils/data-structures.LRUCache_new)
            * [`.get(key)`](#module_utils/data-structures.LRUCache+get) ⇒ `any`
            * [`.put(key, value)`](#module_utils/data-structures.LRUCache+put)
            * [`.clear()`](#module_utils/data-structures.LRUCache+clear)
    * _inner_
        * [~CharTrieNode](#module_utils/data-structures..CharTrieNode)
            * [`new CharTrieNode(isLeaf, children)`](#new_module_utils/data-structures..CharTrieNode_new)
            * [`.default()`](#module_utils/data-structures..CharTrieNode.default) ⇒ `CharTrieNode`
        * [~TokenLatticeNode](#module_utils/data-structures..TokenLatticeNode)
            * [`new TokenLatticeNode(tokenId, nodeId, pos, length, score)`](#new_module_utils/data-structures..TokenLatticeNode_new)
            * [`.clone()`](#module_utils/data-structures..TokenLatticeNode+clone) ⇒ `TokenLatticeNode`
* * *
## utils/data-structures.PriorityQueue
Efficient heap-based implementation of a priority queue.
It uses an array-based binary heap, where the root is at index `0`, and the
children of node `i` are located at indices `2i + 1` and `2i + 2`, respectively.
Adapted from the following sources:
- https://stackoverflow.com/a/42919752/13989043 (original)
- https://github.com/belladoreai/llama-tokenizer-js (minor improvements)
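
The array layout described above can be sketched as follows. This is a minimal illustration, not the library's code: `SketchHeap` is a hypothetical name, and the comparator convention (return `true` when the first argument has higher priority, with `a > b` as the max-heap default) is an assumption made for this example.

```javascript
// Illustrative sketch only: `SketchHeap` is a hypothetical stand-in, not the library's class.
class SketchHeap {
  // Assumption: comparator(a, b) returns true when `a` has higher priority than `b`.
  // The default (a > b) gives max-heap behaviour.
  constructor(comparator = (a, b) => a > b) {
    this.heap = [];
    this.comparator = comparator;
  }

  get size() {
    return this.heap.length;
  }

  peek() {
    return this.heap[0];
  }

  push(value) {
    this.heap.push(value);
    // Sift up: the parent of node i lives at floor((i - 1) / 2).
    let i = this.heap.length - 1;
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (!this.comparator(this.heap[i], this.heap[parent])) break;
      [this.heap[i], this.heap[parent]] = [this.heap[parent], this.heap[i]];
      i = parent;
    }
    return this.size;
  }

  pop() {
    const top = this.heap[0];
    const last = this.heap.pop();
    if (this.heap.length > 0) {
      // Move the last element to the root, then sift it down.
      this.heap[0] = last;
      let i = 0;
      while (true) {
        const left = 2 * i + 1;
        const right = 2 * i + 2;
        let best = i;
        if (left < this.size && this.comparator(this.heap[left], this.heap[best])) best = left;
        if (right < this.size && this.comparator(this.heap[right], this.heap[best])) best = right;
        if (best === i) break;
        [this.heap[i], this.heap[best]] = [this.heap[best], this.heap[i]];
        i = best;
      }
    }
    return top;
  }
}

// Default comparator: elements come out largest first.
const maxHeap = new SketchHeap();
for (const v of [3, 10, 7, 1]) maxHeap.push(v);
const drained = [];
while (maxHeap.size > 0) drained.push(maxHeap.pop());
console.log(drained); // [10, 7, 3, 1]

// A reversed comparator turns the same structure into a min-heap.
const minHeap = new SketchHeap((a, b) => a < b);
for (const v of [3, 10, 7, 1]) minHeap.push(v);
console.log(minHeap.peek()); // 1
```
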
**Kind**: static class of [utils/data-structures](#module_utils/data-structures)
* [.PriorityQueue](#module_utils/data-structures.PriorityQueue)
    * [`new PriorityQueue(comparator)`](#new_module_utils/data-structures.PriorityQueue_new)
    * [`.size`](#module_utils/data-structures.PriorityQueue+size)
    * [`.isEmpty()`](#module_utils/data-structures.PriorityQueue+isEmpty) ⇒ `boolean`
    * [`.peek()`](#module_utils/data-structures.PriorityQueue+peek) ⇒ `any`
    * [`.push(...values)`](#module_utils/data-structures.PriorityQueue+push) ⇒ `number`
    * [`.extend(values)`](#module_utils/data-structures.PriorityQueue+extend) ⇒ `number`
    * [`.pop()`](#module_utils/data-structures.PriorityQueue+pop) ⇒ `any`
    * [`.replace(value)`](#module_utils/data-structures.PriorityQueue+replace) ⇒ `*`
    * [`._siftUpFrom(node)`](#module_utils/data-structures.PriorityQueue+_siftUpFrom)
* * *
### `new PriorityQueue(comparator)`
Create a new PriorityQueue.

| Param | Type | Description |
| --- | --- | --- |
| comparator | `function` | Comparator function to determine priority. Defaults to a MaxHeap. |
* * *
### `priorityQueue.size`
The number of elements in the queue.
**Kind**: instance property of [PriorityQueue](#module_utils/data-structures.PriorityQueue)
* * *
### `priorityQueue.isEmpty()` ⇒ boolean
Check if the queue is empty.
**Kind**: instance method of [PriorityQueue](#module_utils/data-structures.PriorityQueue)
**Returns**: boolean - `true` if the queue is empty, `false` otherwise.
* * *
### `priorityQueue.peek()` ⇒ any
Return the element with the highest priority in the queue.
**Kind**: instance method of [PriorityQueue](#module_utils/data-structures.PriorityQueue)
**Returns**: any - The highest priority element in the queue.
* * *
### `priorityQueue.push(...values)` ⇒ number
Add one or more elements to the queue.
**Kind**: instance method of [PriorityQueue](#module_utils/data-structures.PriorityQueue)
**Returns**: number - The new size of the queue.

| Param | Type | Description |
| --- | --- | --- |
| ...values | `any` | The values to push into the queue. |
* * *
### `priorityQueue.extend(values)` ⇒ number
Add multiple elements to the queue.
**Kind**: instance method of [PriorityQueue](#module_utils/data-structures.PriorityQueue)
**Returns**: number - The new size of the queue.

| Param | Type | Description |
| --- | --- | --- |
| values | `Array.<any>` | The values to push into the queue. |
* * *
### `priorityQueue.pop()` ⇒ any
Remove and return the element with the highest priority in the queue.
**Kind**: instance method of [PriorityQueue](#module_utils/data-structures.PriorityQueue)
**Returns**: any - The element with the highest priority in the queue.
* * *
### `priorityQueue.replace(value)` ⇒ *
Replace the element with the highest priority in the queue with a new value.
**Kind**: instance method of [PriorityQueue](#module_utils/data-structures.PriorityQueue)
**Returns**: * - The replaced value.

| Param | Type | Description |
| --- | --- | --- |
| value | `*` | The new value. |
* * *
### `priorityQueue._siftUpFrom(node)`
Helper function to sift up from a given node.
**Kind**: instance method of [PriorityQueue](#module_utils/data-structures.PriorityQueue)

| Param | Type | Description |
| --- | --- | --- |
| node | `number` | The index of the node to start sifting up from. |
* * *
## utils/data-structures.CharTrie
A trie structure to efficiently store and search for strings.
**Kind**: static class of [utils/data-structures](#module_utils/data-structures)
* [.CharTrie](#module_utils/data-structures.CharTrie)
    * [`.extend(texts)`](#module_utils/data-structures.CharTrie+extend)
    * [`.push(text)`](#module_utils/data-structures.CharTrie+push)
    * [`.commonPrefixSearch(text)`](#module_utils/data-structures.CharTrie+commonPrefixSearch)
* * *
### `charTrie.extend(texts)`
Adds one or more `texts` to the trie.
**Kind**: instance method of [CharTrie](#module_utils/data-structures.CharTrie)

| Param | Type | Description |
| --- | --- | --- |
| texts | `Array.<string>` | The strings to add to the trie. |
* * *
### `charTrie.push(text)`
Adds text to the trie.
**Kind**: instance method of [CharTrie](#module_utils/data-structures.CharTrie)

| Param | Type | Description |
| --- | --- | --- |
| text | `string` | The string to add to the trie. |
* * *
### `charTrie.commonPrefixSearch(text)`
Searches the trie for all stored strings that are a prefix of `text`.
**Kind**: instance method of [CharTrie](#module_utils/data-structures.CharTrie)

| Param | Type | Description |
| --- | --- | --- |
| text | `string` | The text whose prefixes are looked up in the trie. |
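
A minimal sketch of how a prefix search over a character trie can work. `SketchTrie` is a hypothetical stand-in written for this example, and both the generator-based return value and the exact matching rule (yield every stored string that is a prefix of the input) are assumptions rather than a statement of the library's implementation.

```javascript
// Illustrative sketch only: `SketchTrie` is a hypothetical stand-in, not the library's class.
class SketchTrie {
  constructor() {
    // Each node records whether a stored string ends here, plus its children.
    this.root = { isLeaf: false, children: new Map() };
  }

  push(text) {
    let node = this.root;
    for (const ch of text) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { isLeaf: false, children: new Map() });
      }
      node = node.children.get(ch);
    }
    node.isLeaf = true;
  }

  // Assumed behaviour: yield every stored string that is a prefix of `text`,
  // shortest first, stopping as soon as the trie has no matching child.
  *commonPrefixSearch(text) {
    let node = this.root;
    let prefix = "";
    for (const ch of text) {
      node = node.children.get(ch);
      if (node === undefined) return;
      prefix += ch;
      if (node.isLeaf) yield prefix;
    }
  }
}

const trie = new SketchTrie();
for (const word of ["a", "ab", "abc", "b"]) trie.push(word);
const prefixes = [...trie.commonPrefixSearch("abd")];
console.log(prefixes); // ["a", "ab"]
```

The walk costs one map lookup per character of the input, regardless of how many strings the trie holds.
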
* * *
## utils/data-structures.TokenLattice
A lattice data structure to be used for tokenization.
**Kind**: static class of [utils/data-structures](#module_utils/data-structures)
* [.TokenLattice](#module_utils/data-structures.TokenLattice)
    * [`new TokenLattice(sentence, bosTokenId, eosTokenId)`](#new_module_utils/data-structures.TokenLattice_new)
    * [`.insert(pos, length, score, tokenId)`](#module_utils/data-structures.TokenLattice+insert)
    * [`.viterbi()`](#module_utils/data-structures.TokenLattice+viterbi) ⇒ `Array.<TokenLatticeNode>`
    * [`.piece(node)`](#module_utils/data-structures.TokenLattice+piece) ⇒ `string`
    * [`.tokens()`](#module_utils/data-structures.TokenLattice+tokens) ⇒ `Array.<string>`
    * [`.tokenIds()`](#module_utils/data-structures.TokenLattice+tokenIds) ⇒ `Array.<number>`
* * *
### `new TokenLattice(sentence, bosTokenId, eosTokenId)`
Creates a new TokenLattice instance.

| Param | Type | Description |
| --- | --- | --- |
| sentence | `string` | The input sentence to be tokenized. |
| bosTokenId | `number` | The beginning-of-sequence token ID. |
| eosTokenId | `number` | The end-of-sequence token ID. |
* * *
### `tokenLattice.insert(pos, length, score, tokenId)`
Inserts a new token node into the token lattice.
**Kind**: instance method of [TokenLattice](#module_utils/data-structures.TokenLattice)

| Param | Type | Description |
| --- | --- | --- |
| pos | `number` | The starting position of the token. |
| length | `number` | The length of the token. |
| score | `number` | The score of the token. |
| tokenId | `number` | The token ID of the token. |
* * *
### `tokenLattice.viterbi()` ⇒ `Array.<TokenLatticeNode>`
Implements the Viterbi algorithm to compute the most likely sequence of tokens.
**Kind**: instance method of [TokenLattice](#module_utils/data-structures.TokenLattice)
**Returns**: `Array.<TokenLatticeNode>` - The most likely sequence of tokens.
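
A minimal sketch of a Viterbi pass over candidate tokens, to show the idea behind the method. The function name `viterbiSketch` and the plain-object candidates are hypothetical, and the example assumes that scores are log-probabilities that add along a path, so the highest total wins. It is not the library's implementation, which works on `TokenLatticeNode` instances.

```javascript
// Illustrative sketch only: `viterbiSketch` and the candidate shape are hypothetical.
// Each candidate covers sentence[pos, pos + length) and carries an additive score
// (assumed here to be a log-probability).
function viterbiSketch(sentence, candidates) {
  const n = sentence.length;
  // bestScore[i]: best total score of any token path covering sentence[0, i).
  const bestScore = new Array(n + 1).fill(-Infinity);
  // bestToken[i]: the last token on that best path.
  const bestToken = new Array(n + 1).fill(null);
  bestScore[0] = 0;

  // Visit tokens by start position, so bestScore[token.pos] is final when it is read.
  const sorted = [...candidates].sort((a, b) => a.pos - b.pos);
  for (const token of sorted) {
    const end = token.pos + token.length;
    const total = bestScore[token.pos] + token.score;
    if (total > bestScore[end]) {
      bestScore[end] = total;
      bestToken[end] = token;
    }
  }

  // No path covers the whole sentence.
  if (bestScore[n] === -Infinity) return [];

  // Backtrack from the end of the sentence to position 0.
  const path = [];
  for (let i = n; i > 0; i = bestToken[i].pos) {
    path.push(bestToken[i]);
  }
  return path.reverse();
}

const sentence = "abc";
const candidates = [
  { pos: 0, length: 1, score: -1.0 }, // "a"
  { pos: 1, length: 1, score: -1.0 }, // "b"
  { pos: 2, length: 1, score: -1.0 }, // "c"
  { pos: 0, length: 2, score: -1.5 }, // "ab"
  { pos: 1, length: 2, score: -2.5 }, // "bc"
];
const best = viterbiSketch(sentence, candidates);
const pieces = best.map((t) => sentence.slice(t.pos, t.pos + t.length));
console.log(pieces); // ["ab", "c"]
```

Here "ab" + "c" totals -2.5, which beats "a" + "b" + "c" (-3.0) and "a" + "bc" (-3.5).
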
* * *
### `tokenLattice.piece(node)` ⇒ string
**Kind**: instance method of [TokenLattice](#module_utils/data-structures.TokenLattice)
**Returns**: `string` - The piece of the sentence covered by `node`.

| Param | Type |
| --- | --- |
| node | `TokenLatticeNode` |
* * *
### `tokenLattice.tokens()` ⇒ `Array.<string>`
**Kind**: instance method of [TokenLattice](#module_utils/data-structures.TokenLattice)
**Returns**: `Array.<string>` - The most likely sequence of tokens.
* * *
### `tokenLattice.tokenIds()` ⇒ `Array.<number>`
**Kind**: instance method of [TokenLattice](#module_utils/data-structures.TokenLattice)
**Returns**: `Array.<number>` - The most likely sequence of token ids.
* * *
## utils/data-structures.DictionarySplitter
A data structure which uses a trie to split a string into tokens based on a dictionary.
It can also use a regular expression to preprocess the input text before splitting.
NOTE: To ensure multi-byte characters are handled correctly, we operate at byte-level instead of character-level.
**Kind**: static class of [utils/data-structures](#module_utils/data-structures)
* [.DictionarySplitter](#module_utils/data-structures.DictionarySplitter)
    * [`new DictionarySplitter(dictionary)`](#new_module_utils/data-structures.DictionarySplitter_new)
    * [`.split(text)`](#module_utils/data-structures.DictionarySplitter+split) ⇒ `Array.<string>`
* * *
### `new DictionarySplitter(dictionary)`

| Param | Type | Description |
| --- | --- | --- |
| dictionary | `Array.<string>` | The dictionary of words to use for splitting. |
* * *
### `dictionarySplitter.split(text)` ⇒ `Array.<string>`
Splits the input text into tokens based on the dictionary.
**Kind**: instance method of [DictionarySplitter](#module_utils/data-structures.DictionarySplitter)
**Returns**: `Array.<string>` - An array of tokens.

| Param | Type | Description |
| --- | --- | --- |
| text | `string` | The input text to split. |
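
A minimal sketch of trie-based dictionary splitting. `splitWithDictionary` is a hypothetical helper, and two behaviours are assumptions made for this example: the longest dictionary word wins at each position, and text that matches no word is kept as its own segment. For brevity the sketch indexes UTF-16 code units and omits any regular-expression preprocessing.

```javascript
// Illustrative sketch only: `splitWithDictionary` is a hypothetical helper.
// Assumptions: longest match wins at each position, and unmatched text is
// returned as its own segment. Works on UTF-16 code units for simplicity.
function splitWithDictionary(dictionary, text) {
  // Build a trie of nested Maps; the END key marks where a dictionary word stops.
  const END = Symbol("end");
  const root = new Map();
  for (const word of dictionary) {
    let node = root;
    for (let k = 0; k < word.length; ++k) {
      if (!node.has(word[k])) node.set(word[k], new Map());
      node = node.get(word[k]);
    }
    node.set(END, word);
  }

  const result = [];
  let start = 0; // beginning of the current unmatched run
  let i = 0;
  while (i < text.length) {
    // Walk the trie from position i, remembering the longest word seen so far.
    let node = root;
    let match = null;
    for (let j = i; j < text.length; ++j) {
      node = node.get(text[j]);
      if (node === undefined) break;
      if (node.has(END)) match = node.get(END);
    }
    if (match !== null) {
      // Flush any unmatched text before the match, then emit the match itself.
      if (i > start) result.push(text.slice(start, i));
      result.push(match);
      i += match.length;
      start = i;
    } else {
      ++i;
    }
  }
  // Flush trailing unmatched text.
  if (start < text.length) result.push(text.slice(start));
  return result;
}

const parts = splitWithDictionary(["<s>", "</s>", "<mask>"], "<s>hello<mask></s>");
console.log(parts); // ["<s>", "hello", "<mask>", "</s>"]
```
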
* * *
## utils/data-structures.LRUCache
A simple Least Recently Used (LRU) cache implementation in JavaScript.
This cache stores key-value pairs and evicts the least recently used item
when the capacity is exceeded.
**Kind**: static class of [utils/data-structures](#module_utils/data-structures)
* [.LRUCache](#module_utils/data-structures.LRUCache)
    * [`new LRUCache(capacity)`](#new_module_utils/data-structures.LRUCache_new)
    * [`.get(key)`](#module_utils/data-structures.LRUCache+get) ⇒ `any`
    * [`.put(key, value)`](#module_utils/data-structures.LRUCache+put)
    * [`.clear()`](#module_utils/data-structures.LRUCache+clear)
* * *
### `new LRUCache(capacity)`
Creates an LRUCache instance.

| Param | Type | Description |
| --- | --- | --- |
| capacity | `number` | The maximum number of items the cache can hold. |
* * *
### `lruCache.get(key)` ⇒ any
Retrieves the value associated with the given key and marks the key as recently used.
**Kind**: instance method of [LRUCache](#module_utils/data-structures.LRUCache)
**Returns**: any - The value associated with the key, or undefined if the key does not exist.

| Param | Type | Description |
| --- | --- | --- |
| key | `any` | The key to retrieve. |
* * *
### `lruCache.put(key, value)`
Inserts or updates the key-value pair in the cache.
If the key already exists, it is updated and marked as recently used.
If the cache exceeds its capacity, the least recently used item is evicted.
**Kind**: instance method of [LRUCache](#module_utils/data-structures.LRUCache)

| Param | Type | Description |
| --- | --- | --- |
| key | `any` | The key to add or update. |
| value | `any` | The value to associate with the key. |
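
A minimal sketch of the get/put/evict behaviour described above, built on the fact that a JavaScript `Map` iterates its keys in insertion order. `SketchLRU` is a hypothetical stand-in for illustration; whether the library's class is built the same way is an assumption.

```javascript
// Illustrative sketch only: `SketchLRU` is a hypothetical stand-in, not the library's class.
// It relies on Map iterating keys in insertion order, so the first key is the oldest.
class SketchLRU {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map();
  }

  get(key) {
    if (!this.cache.has(key)) return undefined;
    const value = this.cache.get(key);
    // Re-insert to move the key to the most recently used position.
    this.cache.delete(key);
    this.cache.set(key, value);
    return value;
  }

  put(key, value) {
    // Delete first so that updating an existing key also refreshes its recency.
    this.cache.delete(key);
    this.cache.set(key, value);
    if (this.cache.size > this.capacity) {
      // Evict the least recently used entry (the first key in iteration order).
      const oldest = this.cache.keys().next().value;
      this.cache.delete(oldest);
    }
  }

  clear() {
    this.cache.clear();
  }
}

const lru = new SketchLRU(2);
lru.put("a", 1);
lru.put("b", 2);
lru.get("a");    // "a" becomes the most recently used key
lru.put("c", 3); // capacity exceeded, so "b" is evicted
console.log(lru.get("b")); // undefined
console.log(lru.get("a")); // 1
```

Both operations stay constant time on average, because deleting and re-inserting a key is all that is needed to update recency.
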
* * *
### `lruCache.clear()`
Clears the cache.
**Kind**: instance method of [LRUCache](#module_utils/data-structures.LRUCache)
* * *
## utils/data-structures~CharTrieNode
Represents a node in a character trie.
**Kind**: inner class of [utils/data-structures](#module_utils/data-structures)
* [~CharTrieNode](#module_utils/data-structures..CharTrieNode)
    * [`new CharTrieNode(isLeaf, children)`](#new_module_utils/data-structures..CharTrieNode_new)
    * [`.default()`](#module_utils/data-structures..CharTrieNode.default) ⇒ `CharTrieNode`
* * *
### `new CharTrieNode(isLeaf, children)`
Create a new CharTrieNode.

| Param | Type | Description |
| --- | --- | --- |
| isLeaf | `boolean` | Whether the node is a leaf node or not. |
| children | `Map.<string, CharTrieNode>` | A map containing the node's children, where the key is a character and the value is a CharTrieNode. |
* * *
### `CharTrieNode.default()` ⇒ CharTrieNode
Returns a new `CharTrieNode` instance with default values.
**Kind**: static method of [CharTrieNode](#module_utils/data-structures..CharTrieNode)
**Returns**: CharTrieNode - A new `CharTrieNode` instance with `isLeaf` set to `false` and an empty `children` map.
* * *
## utils/data-structures~TokenLatticeNode
**Kind**: inner class of [utils/data-structures](#module_utils/data-structures)
* [~TokenLatticeNode](#module_utils/data-structures..TokenLatticeNode)
    * [`new TokenLatticeNode(tokenId, nodeId, pos, length, score)`](#new_module_utils/data-structures..TokenLatticeNode_new)
    * [`.clone()`](#module_utils/data-structures..TokenLatticeNode+clone) ⇒ `TokenLatticeNode`
* * *
### `new TokenLatticeNode(tokenId, nodeId, pos, length, score)`
Represents a node in a token lattice for a given sentence.

| Param | Type | Description |
| --- | --- | --- |
| tokenId | `number` | The ID of the token associated with this node. |
| nodeId | `number` | The ID of this node. |
| pos | `number` | The starting position of the token in the sentence. |
| length | `number` | The length of the token. |
| score | `number` | The score associated with the token. |
* * *
### `tokenLatticeNode.clone()` ⇒ TokenLatticeNode
Returns a clone of this node.
**Kind**: instance method of [TokenLatticeNode](#module_utils/data-structures..TokenLatticeNode)
**Returns**: TokenLatticeNode - A clone of this node.
* * *
