FasterDFlash
/

Hanrui

Model card Files Files and versions

Hanrui / sglang /docs /advanced_features /observability.md

Lekr0's picture

Add files using upload-large-folder tool

a227c91 verified 29 days ago

|

history blame contribute delete

1.55 kB

	# Observability

	## Production Metrics
	SGLang exposes the following metrics via Prometheus. You can enable them by adding `--enable-metrics` when launching the server.
	You can query them by:
	```
	curl http://localhost:30000/metrics
	```

	See [Production Metrics](../references/production_metrics.md) and [Production Request Tracing](../references/production_request_trace.md) for more details.

	## Logging

	By default, SGLang does not log any request contents. You can log them by using `--log-requests`.
	You can control the verbosity by using `--log-request-level`.
	See [Logging](server_arguments.md#logging) for more details.

	## Request Dump and Replay

	You can dump all requests and replay them later for benchmarking or other purposes.

	To start dumping, use the following command to send a request to a server:
	```
	python3 -m sglang.srt.managers.configure_logging --url http://localhost:30000 --dump-requests-folder /tmp/sglang_request_dump --dump-requests-threshold 100
	```
	The server will dump the requests into a pickle file for every 100 requests.

	To replay the request dump, use `scripts/playground/replay_request_dump.py`.

	## Crash Dump and Replay
	Sometimes the server might crash, and you may want to debug the cause of the crash.
	SGLang supports crash dumping, which will dump all requests from the 5 minutes before the crash, allowing you to replay the requests and debug the reason later.

	To enable crash dumping, use `--crash-dump-folder /tmp/crash_dump`.
	To replay the crash dump, use `scripts/playground/replay_request_dump.py`.