| Name | Size | Uploaded | Xet hash |
|---|---|---|---|
| README.md | 1.9 kB xet | de957cf1 | |
| aibrix_kvcache_storage.py | 5.57 kB xet | b3b66d17 | |
| unit_test.py | 3.42 kB xet | dd5cd8bc | |
AIBrix KVCache as L3 KV Cache
This document gives brief instructions for setting up an AIBrix KVCache storage backend, AIBrix KVCache, and the SGLang runtime from scratch, and describes how to use AIBrix KVCache as the L3 KV cache for SGLang. The process consists of three main steps:
Step 1: Install AIBrix KVCache
Refer to the AIBrix KVCache documentation to install AIBrix KVCache.
Step 2: Deploy AIBrix Distributed KVCache Storage
AIBrix KVCache currently supports multiple distributed KVCache backends, including ByteDance's open-source Infinistore and PrisKV, which is not yet open source and is incubated by ByteDance's PrisDB & IAAS & DMI team.
For the Infinistore installation process, please refer to this link.
PrisKV for AIBrix KVCache is currently in the open-source preparation stage, and no public documentation is available yet.
Step 3: Deploy Model Serving
For information on configuring a distributed KVCache backend for AIBrix KVCache, please refer to this link.
Using PrisKV as an example, the startup command is as follows:
export AIBRIX_KV_CACHE_OL_L1_CACHE_ENABLED="0"
export AIBRIX_KV_CACHE_OL_L2_CACHE_BACKEND="PRIS"
export AIBRIX_KV_CACHE_OL_PRIS_REMOTE_ADDR="127.0.0.1"
export AIBRIX_KV_CACHE_OL_PRIS_REMOTE_PORT="6379"
export AIBRIX_KV_CACHE_OL_PRIS_PASSWORD="kvcache-redis"
MODEL_LENGTH=32768 NCCL_MIN_NCHANNELS=24 NCCL_IB_QPS_PER_CONNECTION=8 NCCL_DEBUG=INFO \
python3 -m sglang.launch_server \
--model-path /path/to/models/Qwen3-32B \
--host 0.0.0.0 --port 8080 \
--enable-hierarchical-cache \
--hicache-storage-backend aibrix \
--page-size 16 \
--hicache-write-policy write_back \
--enable-metrics --hicache-ratio=2
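If the server is started from a script rather than an interactive shell, the same configuration can be expressed in Python. The following is a minimal sketch, not part of the AIBrix or SGLang tooling: the helper names `build_launch` and `launch` are hypothetical, and the values are copied from the PrisKV example above. It passes every variable explicitly in the child process environment, so nothing depends on what the calling shell has exported.

```python
import os
import subprocess

# Environment variables copied from the PrisKV example above.
AIBRIX_ENV = {
    "AIBRIX_KV_CACHE_OL_L1_CACHE_ENABLED": "0",
    "AIBRIX_KV_CACHE_OL_L2_CACHE_BACKEND": "PRIS",
    "AIBRIX_KV_CACHE_OL_PRIS_REMOTE_ADDR": "127.0.0.1",
    "AIBRIX_KV_CACHE_OL_PRIS_REMOTE_PORT": "6379",
    "AIBRIX_KV_CACHE_OL_PRIS_PASSWORD": "kvcache-redis",
    "MODEL_LENGTH": "32768",
    "NCCL_MIN_NCHANNELS": "24",
    "NCCL_IB_QPS_PER_CONNECTION": "8",
    "NCCL_DEBUG": "INFO",
}


def build_launch(model_path, host="0.0.0.0", port=8080, base_env=None):
    """Return (argv, env) for the launch command (hypothetical helper)."""
    # Start from the caller's environment unless an explicit base is given,
    # then overlay the AIBrix and NCCL settings.
    env = dict(os.environ if base_env is None else base_env)
    env.update(AIBRIX_ENV)
    cmd = [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--host", host, "--port", str(port),
        "--enable-hierarchical-cache",
        "--hicache-storage-backend", "aibrix",
        "--page-size", "16",
        "--hicache-write-policy", "write_back",
        "--enable-metrics",
        "--hicache-ratio", "2",
    ]
    return cmd, env


def launch(model_path, **kwargs):
    """Start the server as a child process and return the Popen handle."""
    cmd, env = build_launch(model_path, **kwargs)
    return subprocess.Popen(cmd, env=env)
```

Calling `launch("/path/to/models/Qwen3-32B")` would start the server with the settings shown above. `build_launch` can be used on its own to inspect the command and environment before anything is executed.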