utils/data-structures
Custom data structures.
These are only used internally, meaning an end-user shouldn't need to access anything here.
- utils/data-structures
  - static
    - .PriorityQueue
      - new PriorityQueue(comparator)
      - .size
      - .isEmpty() ⇒ boolean
      - .peek() ⇒ any
      - .push(...values) ⇒ number
      - .extend(values) ⇒ number
      - .pop() ⇒ any
      - .replace(value) ⇒ *
      - ._siftUpFrom(node)
    - .CharTrie
    - .TokenLattice
      - new TokenLattice(sentence, bosTokenId, eosTokenId)
      - .insert(pos, length, score, tokenId)
      - .viterbi() ⇒ Array.<TokenLatticeNode>
      - .piece(node) ⇒ string
      - .tokens() ⇒ Array.<string>
      - .tokenIds() ⇒ Array.<number>
    - .DictionarySplitter
      - new DictionarySplitter(dictionary)
      - .split(text) ⇒ Array.<string>
    - .LRUCache
  - inner
    - ~CharTrieNode
      - new CharTrieNode(isLeaf, children)
      - .default() ⇒ CharTrieNode
    - ~TokenLatticeNode
      - new TokenLatticeNode(tokenId, nodeId, pos, length, score)
      - .clone() ⇒ TokenLatticeNode
utils/data-structures.PriorityQueue
Efficient Heap-based Implementation of a Priority Queue.
It uses an array-based binary heap, where the root is at index 0, and the
children of node i are located at indices 2i + 1 and 2i + 2, respectively.
Adapted from the following sources:
- https://stackoverflow.com/a/42919752/13989043 (original)
- https://github.com/belladoreai/llama-tokenizer-js (minor improvements)
Kind: static class of utils/data-structures
- .PriorityQueue
  - new PriorityQueue(comparator)
  - .size
  - .isEmpty() ⇒ boolean
  - .peek() ⇒ any
  - .push(...values) ⇒ number
  - .extend(values) ⇒ number
  - .pop() ⇒ any
  - .replace(value) ⇒ *
  - ._siftUpFrom(node)
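The array layout described above can be illustrated with a small, self-contained sketch. This is not the library's code: `TinyHeap`, `parent`, `left`, `right`, `siftUp`, and `siftDown` are hypothetical names chosen for the example, and the comparator convention (return true when the first argument has higher priority) is an assumption that yields a max-heap by default.

```javascript
// Index arithmetic for a binary heap stored in a flat array (root at index 0).
const parent = (i) => ((i + 1) >>> 1) - 1; // parent of node i
const left = (i) => 2 * i + 1;             // first child of node i
const right = (i) => 2 * i + 2;            // second child of node i

// Hypothetical minimal heap, for illustration only.
class TinyHeap {
  // comparator(a, b) returns true when a has higher priority than b.
  constructor(comparator = (a, b) => a > b) {
    this.heap = [];
    this.comparator = comparator;
  }
  get size() { return this.heap.length; }
  peek() { return this.heap[0]; }
  push(value) {
    this.heap.push(value);
    this.siftUp(this.size - 1);
    return this.size;
  }
  pop() {
    const top = this.peek();
    const last = this.heap.pop();
    if (this.size > 0) {
      // Move the last element to the root, then restore the heap order.
      this.heap[0] = last;
      this.siftDown(0);
    }
    return top;
  }
  siftUp(node) {
    while (node > 0 && this.comparator(this.heap[node], this.heap[parent(node)])) {
      this.swap(node, parent(node));
      node = parent(node);
    }
  }
  siftDown(node) {
    for (;;) {
      let best = node;
      for (const child of [left(node), right(node)]) {
        if (child < this.size && this.comparator(this.heap[child], this.heap[best])) {
          best = child;
        }
      }
      if (best === node) return;
      this.swap(node, best);
      node = best;
    }
  }
  swap(i, j) { [this.heap[i], this.heap[j]] = [this.heap[j], this.heap[i]]; }
}

const heap = new TinyHeap();
for (const v of [3, 9, 1, 7]) heap.push(v);
const drained = [];
while (heap.size > 0) drained.push(heap.pop());
console.log(drained); // [ 9, 7, 3, 1 ]
```

Passing `(a, b) => a < b` as the comparator would turn the same sketch into a min-heap.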
new PriorityQueue(comparator)
Create a new PriorityQueue.
| Param | Type | Description |
| --- | --- | --- |
| comparator | function | Comparator function to determine priority. Defaults to a MaxHeap. |
priorityQueue.size
The number of elements in the queue.
Kind: instance property of PriorityQueue
priorityQueue.isEmpty() ⇒ boolean
Check if the queue is empty.
Kind: instance method of PriorityQueue
Returns: boolean - true if the queue is empty, false otherwise.
priorityQueue.peek() ⇒ any
Return the element with the highest priority in the queue.
Kind: instance method of PriorityQueue
Returns: any - The highest priority element in the queue.
priorityQueue.push(...values) ⇒ number
Add one or more elements to the queue.
Kind: instance method of PriorityQueue
Returns: number - The new size of the queue.
| Param | Type | Description |
| --- | --- | --- |
| ...values | any | The values to push into the queue. |
priorityQueue.extend(values) ⇒ number
Add multiple elements to the queue.
Kind: instance method of PriorityQueue
Returns: number - The new size of the queue.
| Param | Type | Description |
| --- | --- | --- |
| values | Array.<any> | The values to push into the queue. |
priorityQueue.pop() ⇒ any
Remove and return the element with the highest priority in the queue.
Kind: instance method of PriorityQueue
Returns: any - The element with the highest priority in the queue.
priorityQueue.replace(value) ⇒ *
Replace the element with the highest priority in the queue with a new value.
Kind: instance method of PriorityQueue
Returns: * - The replaced value.
| Param | Type | Description |
| --- | --- | --- |
| value | * | The new value. |
priorityQueue._siftUpFrom(node)
Helper function to sift up from a given node.
Kind: instance method of PriorityQueue
| Param | Type | Description |
| --- | --- | --- |
| node | number | The index of the node to start sifting up from. |
utils/data-structures.CharTrie
A trie structure to efficiently store and search for strings.
Kind: static class of utils/data-structures
charTrie.extend(texts)
Adds one or more texts to the trie.
Kind: instance method of CharTrie
| Param | Type | Description |
| --- | --- | --- |
| texts | Array.<string> | The strings to add to the trie. |
charTrie.push(text)
Adds text to the trie.
Kind: instance method of CharTrie
| Param | Type | Description |
| --- | --- | --- |
| text | string | The string to add to the trie. |
charTrie.commonPrefixSearch(text)
Searches the trie for all stored strings that are a prefix of text.
Kind: instance method of CharTrie
| Param | Type | Description |
| --- | --- | --- |
| text | string | The common prefix to search for. |
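A minimal sketch of how such a trie can work is shown below. It is not the library's implementation: `TinyTrie` is a hypothetical name, nodes are plain objects rather than CharTrieNode instances, and the sketch assumes the prefix search yields every stored string that is a prefix of the input, shortest first.

```javascript
// Hypothetical minimal character trie, for illustration only.
class TinyTrie {
  constructor() {
    this.root = { isLeaf: false, children: new Map() };
  }
  // Adds a single string, creating one node per character as needed.
  push(text) {
    let node = this.root;
    for (const ch of text) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { isLeaf: false, children: new Map() });
      }
      node = node.children.get(ch);
    }
    node.isLeaf = true; // marks the end of a stored string
  }
  // Adds several strings.
  extend(texts) {
    for (const text of texts) this.push(text);
  }
  // Yields every stored string that is a prefix of `text`.
  *commonPrefixSearch(text) {
    let node = this.root;
    let prefix = "";
    for (const ch of text) {
      node = node.children.get(ch);
      if (node === undefined) return; // no stored string continues this way
      prefix += ch;
      if (node.isLeaf) yield prefix;
    }
  }
}

const trie = new TinyTrie();
trie.extend(["a", "ab", "abc", "b"]);
const matches = [...trie.commonPrefixSearch("abd")];
console.log(matches); // [ 'a', 'ab' ]
```

Because the search walks a single path from the root, its cost grows with the length of the input rather than with the number of stored strings.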
utils/data-structures.TokenLattice
A lattice data structure to be used for tokenization.
Kind: static class of utils/data-structures
- .TokenLattice
  - new TokenLattice(sentence, bosTokenId, eosTokenId)
  - .insert(pos, length, score, tokenId)
  - .viterbi() ⇒ Array.<TokenLatticeNode>
  - .piece(node) ⇒ string
  - .tokens() ⇒ Array.<string>
  - .tokenIds() ⇒ Array.<number>
new TokenLattice(sentence, bosTokenId, eosTokenId)
Creates a new TokenLattice instance.
| Param | Type | Description |
| --- | --- | --- |
| sentence | string | The input sentence to be tokenized. |
| bosTokenId | number | The beginning-of-sequence token ID. |
| eosTokenId | number | The end-of-sequence token ID. |
tokenLattice.insert(pos, length, score, tokenId)
Inserts a new token node into the token lattice.
Kind: instance method of TokenLattice
| Param | Type | Description |
| --- | --- | --- |
| pos | number | The starting position of the token. |
| length | number | The length of the token. |
| score | number | The score of the token. |
| tokenId | number | The token ID of the token. |
tokenLattice.viterbi() ⇒ Array.<TokenLatticeNode>
Implements the Viterbi algorithm to compute the most likely sequence of tokens.
Kind: instance method of TokenLattice
Returns: Array.<TokenLatticeNode> - The most likely sequence of tokens.
tokenLattice.piece(node) ⇒ string
Kind: instance method of TokenLattice
Returns: string - The piece of the sentence spanned by the given node.
| Param | Type |
| --- | --- |
| node | TokenLatticeNode |
tokenLattice.tokens() ⇒ Array.<string>
Kind: instance method of TokenLattice
Returns: Array.<string> - The most likely sequence of tokens.
tokenLattice.tokenIds() ⇒ Array.<number>
Kind: instance method of TokenLattice
Returns: Array.<number> - The most likely sequence of token ids.
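The interplay of insert, viterbi, piece, and tokens can be sketched with a small stand-alone lattice. This is an illustrative sketch under stated assumptions, not the library's implementation: `TinyLattice` is a hypothetical name, token IDs are omitted, nodes are plain objects, scores are treated as additive (for example log-probabilities), and an unreachable end of sentence produces an empty result.

```javascript
// Hypothetical minimal token lattice, for illustration only.
class TinyLattice {
  constructor(sentence) {
    this.chars = Array.from(sentence);
    this.len = this.chars.length;
    // beginNodes[p]: nodes starting at position p; endNodes[p]: nodes ending at p.
    this.beginNodes = Array.from({ length: this.len + 1 }, () => []);
    this.endNodes = Array.from({ length: this.len + 1 }, () => []);
    // Zero-length sentinel nodes for the start and end of the sentence.
    this.bos = { pos: 0, length: 0, score: 0, best: 0, prev: null };
    this.eos = { pos: this.len, length: 0, score: 0, best: -Infinity, prev: null };
    this.endNodes[0].push(this.bos);
    this.beginNodes[this.len].push(this.eos);
  }
  // Registers a candidate token covering chars[pos, pos + length).
  insert(pos, length, score) {
    const node = { pos, length, score, best: -Infinity, prev: null };
    this.beginNodes[pos].push(node);
    this.endNodes[pos + length].push(node);
  }
  // Left-to-right dynamic programming: each node keeps its best predecessor.
  viterbi() {
    for (let pos = 0; pos <= this.len; ++pos) {
      for (const node of this.beginNodes[pos]) {
        node.best = -Infinity;
        node.prev = null;
        for (const left of this.endNodes[pos]) {
          const total = left.best + node.score;
          if (total > node.best) {
            node.best = total;
            node.prev = left;
          }
        }
      }
    }
    // Walk back from the end sentinel to recover the best path.
    const path = [];
    for (let node = this.eos.prev; node !== null && node !== this.bos; node = node.prev) {
      path.push(node);
    }
    return path.reverse();
  }
  piece(node) {
    return this.chars.slice(node.pos, node.pos + node.length).join("");
  }
  tokens() {
    return this.viterbi().map((node) => this.piece(node));
  }
}

const lattice = new TinyLattice("abc");
lattice.insert(0, 1, -1.0); // "a"
lattice.insert(1, 1, -1.0); // "b"
lattice.insert(2, 1, -1.0); // "c"
lattice.insert(0, 2, -1.5); // "ab"
lattice.insert(1, 2, -1.2); // "bc"
const best = lattice.tokens();
console.log(best); // [ 'a', 'bc' ]
```

In this example the path "a" + "bc" scores -2.2, which beats "ab" + "c" (-2.5) and "a" + "b" + "c" (-3.0).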
utils/data-structures.DictionarySplitter
A data structure which uses a trie to split a string into tokens based on a dictionary. It can also use a regular expression to preprocess the input text before splitting.
NOTE: To ensure multi-byte characters are handled correctly, we operate at byte-level instead of character-level.
Kind: static class of utils/data-structures
- .DictionarySplitter
  - new DictionarySplitter(dictionary)
  - .split(text) ⇒ Array.<string>
new DictionarySplitter(dictionary)
| Param | Type | Description |
| --- | --- | --- |
| dictionary | Array.<string> | The dictionary of words to use for splitting. |
dictionarySplitter.split(text) ⇒ Array.<string>
Splits the input text into tokens based on the dictionary.
Kind: instance method of DictionarySplitter
Returns: Array.<string> - An array of tokens.
| Param | Type | Description |
| --- | --- | --- |
| text | string | The input text to split. |
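One way to picture trie-based splitting is a greedy longest-match scan, sketched below. This is an assumption about the general technique rather than a description of the library's code: `splitByDictionary` is a hypothetical helper, it works on code points instead of bytes for brevity, and it assumes that text not covered by any dictionary entry is returned as its own chunk.

```javascript
// Hypothetical greedy longest-match splitter, for illustration only.
function splitByDictionary(dictionary, text) {
  // Build a trie keyed by character; `word` marks the end of a dictionary entry.
  const root = { children: new Map(), word: null };
  for (const word of dictionary) {
    let node = root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), word: null });
      }
      node = node.children.get(ch);
    }
    node.word = word;
  }

  const chars = Array.from(text);
  const result = [];
  let pending = ""; // text not covered by any dictionary entry
  let i = 0;
  while (i < chars.length) {
    // Walk the trie from position i and remember the longest entry seen.
    let node = root;
    let match = null;
    let matchEnd = i;
    for (let j = i; j < chars.length; ++j) {
      node = node.children.get(chars[j]);
      if (node === undefined) break;
      if (node.word !== null) {
        match = node.word;
        matchEnd = j + 1;
      }
    }
    if (match === null) {
      pending += chars[i];
      ++i;
    } else {
      if (pending) {
        result.push(pending);
        pending = "";
      }
      result.push(match);
      i = matchEnd;
    }
  }
  if (pending) result.push(pending);
  return result;
}

const pieces = splitByDictionary(["<s>", "</s>", "<mask>"], "<s>hello <mask></s>");
console.log(pieces); // [ '<s>', 'hello ', '<mask>', '</s>' ]
```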
utils/data-structures.LRUCache
A simple Least Recently Used (LRU) cache implementation in JavaScript. This cache stores key-value pairs and evicts the least recently used item when the capacity is exceeded.
Kind: static class of utils/data-structures
new LRUCache(capacity)
Creates an LRUCache instance.
| Param | Type | Description |
| --- | --- | --- |
| capacity | number | The maximum number of items the cache can hold. |
lruCache.get(key) ⇒ any
Retrieves the value associated with the given key and marks the key as recently used.
Kind: instance method of LRUCache
Returns: any - The value associated with the key, or undefined if the key does not exist.
| Param | Type | Description |
| --- | --- | --- |
| key | any | The key to retrieve. |
lruCache.put(key, value)
Inserts or updates the key-value pair in the cache. If the key already exists, it is updated and marked as recently used. If the cache exceeds its capacity, the least recently used item is evicted.
Kind: instance method of LRUCache
| Param | Type | Description |
| --- | --- | --- |
| key | any | The key to add or update. |
| value | any | The value to associate with the key. |
lruCache.clear()
Clears the cache.
Kind: instance method of LRUCache
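The eviction behavior described for get and put can be sketched with a JavaScript Map, which iterates its keys in insertion order. The class below is a hypothetical stand-in named `TinyLRU`, written only to illustrate the idea; it is not claimed to match the library's internals.

```javascript
// Hypothetical minimal LRU cache, for illustration only.
class TinyLRU {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map(); // iteration order: least recently used key first
  }
  get(key) {
    if (!this.cache.has(key)) return undefined;
    const value = this.cache.get(key);
    // Re-insert the entry so it moves to the most recently used position.
    this.cache.delete(key);
    this.cache.set(key, value);
    return value;
  }
  put(key, value) {
    if (this.cache.has(key)) this.cache.delete(key);
    this.cache.set(key, value);
    if (this.cache.size > this.capacity) {
      // The first key in iteration order is the least recently used one.
      this.cache.delete(this.cache.keys().next().value);
    }
  }
  clear() {
    this.cache.clear();
  }
}

const lru = new TinyLRU(2);
lru.put("a", 1);
lru.put("b", 2);
lru.get("a");    // "a" becomes the most recently used key
lru.put("c", 3); // capacity exceeded, so "b" is evicted
console.log([...lru.cache.keys()]); // [ 'a', 'c' ]
```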
utils/data-structures~CharTrieNode
Represents a node in a character trie.
Kind: inner class of utils/data-structures
- ~CharTrieNode
  - new CharTrieNode(isLeaf, children)
  - .default() ⇒ CharTrieNode
new CharTrieNode(isLeaf, children)
Create a new CharTrieNode.
| Param | Type | Description |
| --- | --- | --- |
| isLeaf | boolean | Whether the node is a leaf node or not. |
| children | Map.<string, CharTrieNode> | A map containing the node's children, where the key is a character and the value is a CharTrieNode. |
CharTrieNode.default() ⇒ CharTrieNode
Returns a new CharTrieNode instance with default values.
Kind: static method of CharTrieNode
Returns: CharTrieNode - A new CharTrieNode instance with isLeaf set to false and an empty children map.
utils/data-structures~TokenLatticeNode
Kind: inner class of utils/data-structures
- ~TokenLatticeNode
  - new TokenLatticeNode(tokenId, nodeId, pos, length, score)
  - .clone() ⇒ TokenLatticeNode
new TokenLatticeNode(tokenId, nodeId, pos, length, score)
Represents a node in a token lattice for a given sentence.
| Param | Type | Description |
| --- | --- | --- |
| tokenId | number | The ID of the token associated with this node. |
| nodeId | number | The ID of this node. |
| pos | number | The starting position of the token in the sentence. |
| length | number | The length of the token. |
| score | number | The score associated with the token. |
tokenLatticeNode.clone() ⇒ TokenLatticeNode
Returns a clone of this node.
Kind: instance method of TokenLatticeNode
Returns: TokenLatticeNode - A clone of this node.