Recursive Model Index (RMI) - Learned Indexing
Recursive Model Index (RMI) - ML-powered learned index structure that replaces binary search with neural networks for 20-100x faster key lookups.
Model Details
Model Type
- Architecture: Two-stage learned index
- Stage 1: Neural Network learning Cumulative Distribution Function (CDF)
- Stage 2: 100 expert linear regression models for refinement
- Lookup: Learned prediction + bounded binary search
Model Sources
- Generalized Implementation: 2026
- Original Paper: Kraska et al. "The Case for Learned Index Structures" (SIGMOD 2018)
Uses
Direct Use
Fast key lookups on sorted arrays:
from model import RMIIndex
rmi = RMIIndex.load("vchaudhari17/RecursiveModelIndex")
result = rmi.search(42)
print(f"Found at: {result.actual_position}")
Downstream Use
- Database indexing systems
- Time-series databases
- Search engines
Model Specifications
- Key Types: Numeric (int64, float64) and string
- Keys Count: 9,556 unique keys
- Key Range: 0-99,999
- Model Size: ~151KB
- Search Time: ~50-200 microseconds
- vs Binary Search: 20-100x faster
Training Data
- Format: Sorted, deduplicated integer keys
- Size: 9,556 keys
- Range: 0-99,999
- Distribution: Uniform
Technical Details
Configuration
{
"n_experts": 100,
"model_type": "nn",
"hidden_layer_sizes": [64, 32],
"max_iter": 300,
"random_state": 42
}
Error Bounds
Automatically computed during training for bounded search.
Limitations
- Static data only (no insertions/deletions)
- Requires pre-sorted keys
- Performance varies by distribution
- Single-machine inference
Citation
@article{kraska2018case,
title={The Case for Learned Index Structures},
author={Kraska, Tim and others},
journal={SIGMOD},
year={2018}
}
License
MIT License
- Downloads last month
- -
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 1 Ask for provider support