
R-Fork

R-Fork (Tensor Remote Fork) is a novel weight-loading method that uses an efficient inter-node GPU-to-GPU data transfer path to load tensors from a running SGLang instance into a new instance with zero-copy. It can significantly shorten SGLang instance boot-up time by reducing model weight loading from several minutes to mere seconds.

To learn more about R-Fork, see the R-Fork blog.

Usage

Argument: Usage
load-format: Set to remote_instance to enable R-Fork.
remote-instance-weight-loader-backend: nccl or transfer_engine. The default is nccl.
remote-instance-weight-loader-seed-instance-ip: IP address of the seed instance that provides the model weights.
remote-instance-weight-loader-seed-instance-service-port: The port that the seed instance's HTTP server is listening on.
remote-instance-weight-loader-send-weights-group-ports: The list of available ports on the seed instance used to build NCCL communication groups between the seed and client instances. Only needed by the nccl backend.
remote-instance-weight-loader-start-seed-via-transfer-engine: Set this flag to start a seed service that supports TransferEngine as the backend. Required on seed instances when using the transfer_engine backend.
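The argument rules above can be sketched as a small helper that assembles the client-side flags for either backend. This is only an illustrative sketch: `build_client_args` is a hypothetical name, not part of SGLang, and the ports value is passed through as an opaque string because its exact format is not specified here.

```python
from typing import List, Optional

# Backends named in the argument list above.
VALID_BACKENDS = ("nccl", "transfer_engine")


def build_client_args(
    seed_ip: str,
    seed_service_port: int,
    backend: str = "nccl",
    send_weights_group_ports: Optional[str] = None,
) -> List[str]:
    """Assemble R-Fork client flags (hypothetical helper, not an SGLang API)."""
    if backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}, got {backend!r}")

    args = [
        "--load-format", "remote_instance",
        "--remote-instance-weight-loader-seed-instance-ip", seed_ip,
        "--remote-instance-weight-loader-seed-instance-service-port", str(seed_service_port),
        "--remote-instance-weight-loader-backend", backend,
    ]

    # The group-ports argument is only needed by the nccl backend.
    if backend == "nccl":
        if not send_weights_group_ports:
            raise ValueError("the nccl backend needs send_weights_group_ports")
        # Passed through unchanged: the expected list format is an assumption
        # left to the caller.
        args += [
            "--remote-instance-weight-loader-send-weights-group-ports",
            send_weights_group_ports,
        ]
    return args
```

The returned list can be appended to `["python", "-m", "sglang.launch_server", ...]` along with the usual model arguments.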

NCCL as backend

seed instance:

python -m sglang.launch_server [args]

client instance:

python -m sglang.launch_server [args] \
  --load-format remote_instance \
  --remote-instance-weight-loader-seed-instance-ip [seed_instance_ip] \
  --remote-instance-weight-loader-seed-instance-service-port [seed_instance_service_port] \
  --remote-instance-weight-loader-send-weights-group-ports [send_weights_nccl_group_ports_list] \
  --remote-instance-weight-loader-backend nccl

TransferEngine as backend

seed instance:

python -m sglang.launch_server [args] \
  --remote-instance-weight-loader-start-seed-via-transfer-engine

client instance:

python -m sglang.launch_server [args] \
  --load-format remote_instance \
  --remote-instance-weight-loader-seed-instance-ip [seed_instance_ip] \
  --remote-instance-weight-loader-seed-instance-service-port [seed_instance_service_port] \
  --remote-instance-weight-loader-backend transfer_engine
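
With either backend, the client is pointed at the seed instance's IP and HTTP service port, so it can be useful to confirm that the port accepts connections before launching the client. The following is a minimal sketch using only the Python standard library. `wait_for_seed` and its timeout values are hypothetical, and a successful TCP connect only shows that the port is open, not that the seed has finished loading its weights.

```python
import socket
import time


def wait_for_seed(host: str, port: int, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or timeout_s elapses.

    Hypothetical helper: it checks reachability only, not seed readiness.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection to the seed's service port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A launcher script could call `wait_for_seed(seed_ip, seed_service_port)` and start the client only when it returns `True`.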