RecursiveModelIndex / README.md
Vchaudhari17's picture
Upload folder using huggingface_hub
7170cb8 verified
|
Raw
History Blame Contribute Delete
2.04 kB
---
license: mit
language:
- en
tags:
- learned-index
- rmi
- search
- indexing
- machine-learning
---
# Recursive Model Index (RMI) - Learned Indexing
Recursive Model Index (RMI) - ML-powered learned index structure that replaces binary search with neural networks for 20-100x faster key lookups.
## Model Details
### Model Type
- **Architecture**: Two-stage learned index
- **Stage 1**: Neural Network learning Cumulative Distribution Function (CDF)
- **Stage 2**: 100 expert linear regression models for refinement
- **Lookup**: Learned prediction + bounded binary search
### Model Sources
- **Generalized Implementation**: 2026
- **Original Paper**: Kraska et al. "The Case for Learned Index Structures" (SIGMOD 2018)
## Uses
### Direct Use
Fast key lookups on sorted arrays:
```python
from model import RMIIndex
rmi = RMIIndex.load("vchaudhari17/RecursiveModelIndex")
result = rmi.search(42)
print(f"Found at: {result.actual_position}")
```
### Downstream Use
- Database indexing systems
- Time-series databases
- Search engines
## Model Specifications
- **Key Types**: Numeric (int64, float64) and string
- **Keys Count**: 9,556 unique keys
- **Key Range**: 0-99,999
- **Model Size**: ~151KB
- **Search Time**: ~50-200 microseconds
- **vs Binary Search**: 20-100x faster
## Training Data
- **Format**: Sorted, deduplicated integer keys
- **Size**: 9,556 keys
- **Range**: 0-99,999
- **Distribution**: Uniform
## Technical Details
### Configuration
```json
{
"n_experts": 100,
"model_type": "nn",
"hidden_layer_sizes": [64, 32],
"max_iter": 300,
"random_state": 42
}
```
### Error Bounds
Automatically computed during training for bounded search.
## Limitations
- Static data only (no insertions/deletions)
- Requires pre-sorted keys
- Performance varies by distribution
- Single-machine inference
## Citation
```bibtex
@article{kraska2018case,
title={The Case for Learned Index Structures},
author={Kraska, Tim and others},
journal={SIGMOD},
year={2018}
}
```
## License
MIT License