/*
* Engram CUDA hash kernel for O(1) N-gram context lookup.
*
* Phase 1 (current): Python-level hashing in EngramModule._hash_context().
* Phase 2 (planned): custom CUDA kernel for batched hash computation.
*
* Hash function: h = token[t] ^ (token[t-1] * prime_1) ^ (token[t-2] * prime_2)
* Output: h % n_columns (table index)
*
* Once implemented, the kernel will parallelize over the (batch, sequence) dimensions.
*/
// Stub: the Phase 2 kernel is not implemented yet.
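
/*
* Illustrative sketch only: one way the Phase 2 kernel described in the header
* comment might look. It is not the project's implementation, and every name
* and constant below is a hypothetical placeholder. Assumptions:
*   - tokens are int32, stored row-major as [batch, seq_len];
*   - context positions before the start of a sequence contribute token 0;
*   - the two multipliers are arbitrary primes, not the values used by
*     EngramModule._hash_context();
*   - the hash is computed in unsigned 64-bit arithmetic so that the final
*     modulo is never negative;
*   - n_columns > 0.
* A grid-stride loop is used so that any grid size covers all positions.
*/
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical helper: hash of a 3-token context, reduced to a table index.
// Callable from host and device so the formula can be checked on the CPU.
__host__ __device__ inline int64_t engram_hash(int32_t t0, int32_t t1, int32_t t2,
                                               int64_t n_columns)
{
    constexpr uint64_t kPrime1 = 1000003ULL;    // placeholder multiplier
    constexpr uint64_t kPrime2 = 998244353ULL;  // placeholder multiplier
    const uint64_t a = static_cast<uint32_t>(t0);
    const uint64_t b = static_cast<uint32_t>(t1);
    const uint64_t c = static_cast<uint32_t>(t2);
    const uint64_t h = a ^ (b * kPrime1) ^ (c * kPrime2);
    return static_cast<int64_t>(h % static_cast<uint64_t>(n_columns));
}

// Hypothetical kernel: one hash per (batch, position) pair.
__global__ void engram_hash_kernel(const int32_t* __restrict__ tokens,
                                   int64_t* __restrict__ indices,
                                   int batch, int seq_len, int64_t n_columns)
{
    const int64_t total  = static_cast<int64_t>(batch) * seq_len;
    const int64_t stride = static_cast<int64_t>(gridDim.x) * blockDim.x;
    for (int64_t i = static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
         i < total; i += stride) {
        const int t = static_cast<int>(i % seq_len);  // position within its sequence
        const int32_t t0 = tokens[i];
        const int32_t t1 = (t >= 1) ? tokens[i - 1] : 0;  // never crosses a row boundary
        const int32_t t2 = (t >= 2) ? tokens[i - 2] : 0;
        indices[i] = engram_hash(t0, t1, t2, n_columns);
    }
}

// Hypothetical host-side launcher. d_tokens and d_indices are device pointers
// holding batch * seq_len elements each.
inline cudaError_t launch_engram_hash(const int32_t* d_tokens, int64_t* d_indices,
                                      int batch, int seq_len, int64_t n_columns,
                                      cudaStream_t stream = 0)
{
    const int64_t total = static_cast<int64_t>(batch) * seq_len;
    if (total == 0) return cudaSuccess;
    const int threads = 256;
    const int64_t wanted = (total + threads - 1) / threads;
    const int blocks = static_cast<int>(wanted > 65535 ? 65535 : wanted);
    engram_hash_kernel<<<blocks, threads, 0, stream>>>(d_tokens, d_indices,
                                                       batch, seq_len, n_columns);
    return cudaGetLastError();
}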