LSHBloom: Memory-efficient, Extreme-scale Document Deduplication Paper • 2411.04257 • Published Nov 6, 2024
AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine Paper • 2505.01435 • Published Apr 23, 2025
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023