# SGLang Performance Dashboard
A web-based dashboard for visualizing SGLang nightly test performance metrics.
## Features
- **Performance Trends**: View throughput, latency, and TTFT trends over time
- **Model Comparison**: Compare performance across different models and configurations
- **Filtering**: Filter by GPU configuration, model, variant, and batch size
- **Interactive Charts**: Zoom, pan, and hover for detailed metrics
- **Run History**: View recent benchmark runs with links to GitHub Actions
## Quick Start
### Option 1: Run with Local Server (Recommended)
For live data from GitHub Actions artifacts:
```bash
# Install requirements
pip install requests
# Run the server
python server.py --fetch-on-start
# Visit http://localhost:8000
```
The server provides:
- Automatic fetching of metrics from GitHub
- Caching to reduce API calls
- `/api/metrics` endpoint for the frontend
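The `/api/metrics` endpoint can also be read from a script rather than the browser. The following is a minimal sketch, assuming the server listens on `http://localhost:8000` and that the endpoint returns a JSON body; the helper names `metrics_url` and `load_metrics` are hypothetical and not part of `server.py`.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def metrics_url(base_url: str = "http://localhost:8000") -> str:
    """Build the URL of the /api/metrics endpoint from the server base URL."""
    return urljoin(base_url.rstrip("/") + "/", "api/metrics")


def load_metrics(base_url: str = "http://localhost:8000", opener=urlopen, timeout: float = 10.0):
    """Fetch and decode the metrics payload.

    Assumption: the endpoint responds with JSON. `opener` is injectable so the
    function can be exercised without a running server.
    """
    with opener(metrics_url(base_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `load_metrics()` should return the decoded payload; its exact shape depends on what `server.py` serves.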
### Option 2: Fetch Data Manually
Use the fetch script to download metrics data:
```bash
# Fetch last 30 days of metrics
python fetch_metrics.py --output metrics_data.json
# Fetch a specific run
python fetch_metrics.py --run-id 21338741812 --output single_run.json
# Fetch only scheduled (nightly) runs
python fetch_metrics.py --scheduled-only --days 7
```
## GitHub Token
To download artifacts from GitHub, you need authentication:
1. **Using `gh` CLI** (recommended):
```bash
gh auth login
```
2. **Using environment variable**:
```bash
export GITHUB_TOKEN=your_token_here
```
Without a token, the dashboard will show run metadata but not detailed benchmark results.
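For illustration, the two authentication options above could be resolved in code roughly as follows. This is a sketch, not the logic used by `fetch_metrics.py` or `server.py`: the function names are hypothetical, and the lookup order (environment variable first, then `gh auth token`) is an assumption.

```python
import os
import shutil
import subprocess


def resolve_github_token(env=None):
    """Return a GitHub token, or None if no credentials are available.

    Assumed order: GITHUB_TOKEN environment variable, then the gh CLI.
    """
    env = os.environ if env is None else env
    token = env.get("GITHUB_TOKEN", "").strip()
    if token:
        return token
    # `gh auth token` prints the token of the currently logged-in account.
    if shutil.which("gh"):
        result = subprocess.run(
            ["gh", "auth", "token"], capture_output=True, text=True, check=False
        )
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip()
    return None


def auth_headers(token):
    """Build request headers for the GitHub REST API; omit auth when no token."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```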
## Data Structure
The metrics JSON has this structure:
```json
{
"run_id": "21338741812",
"run_date": "2026-01-25T22:24:02.090218+00:00",
"commit_sha": "5cdb391...",
"branch": "main",
"results": [
{
"gpu_config": "8-gpu-h200",
"partition": 0,
"model": "deepseek-ai/DeepSeek-V3.1",
"variant": "TP8+MTP",
"benchmarks": [
{
"batch_size": 1,
"input_len": 4096,
"output_len": 512,
"latency_ms": 2400.72,
"input_throughput": 21408.64,
"output_throughput": 231.74,
"overall_throughput": 1919.43,
"ttft_ms": 191.32,
"acc_length": 3.19
}
]
}
]
}
```
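Because benchmarks are nested two levels deep (`results` then `benchmarks`), it is often convenient to flatten a run into one row per benchmark before charting or filtering. The sketch below assumes the structure shown above; `flatten_run` is a hypothetical helper, and the sample is an abbreviated copy of the example document.

```python
import json

# Abbreviated copy of the example above.
SAMPLE = """
{
  "run_id": "21338741812",
  "branch": "main",
  "results": [
    {
      "gpu_config": "8-gpu-h200",
      "model": "deepseek-ai/DeepSeek-V3.1",
      "variant": "TP8+MTP",
      "benchmarks": [
        {"batch_size": 1, "input_len": 4096, "output_len": 512,
         "latency_ms": 2400.72, "ttft_ms": 191.32}
      ]
    }
  ]
}
"""


def flatten_run(run):
    """Return one flat dict per benchmark, tagged with its run and config."""
    rows = []
    for result in run.get("results", []):
        for bench in result.get("benchmarks", []):
            rows.append({
                "run_id": run.get("run_id"),
                "gpu_config": result.get("gpu_config"),
                "model": result.get("model"),
                "variant": result.get("variant"),
                **bench,
            })
    return rows


rows = flatten_run(json.loads(SAMPLE))
```

Each row then carries the fields the dashboard filters on (`gpu_config`, `model`, `variant`, `batch_size`) alongside the measurements.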
## Deployment
### GitHub Pages
The dashboard can be deployed to GitHub Pages for public access:
1. Copy the dashboard files to `docs/performance_dashboard/`
2. Enable GitHub Pages in repository settings
3. Set up a GitHub Action to periodically update metrics data
### Self-Hosted
For a self-hosted deployment with live data:
1. Set up a server running `server.py`
2. Configure a cron job or systemd timer to refresh data
3. Optionally put it behind a reverse proxy such as nginx or Caddy for TLS termination
## Metrics Explained
- **Overall Throughput**: Total tokens (input + output) processed per second
- **Input Throughput**: Input tokens processed per second (prefill speed)
- **Output Throughput**: Output tokens generated per second (decode speed)
- **Latency**: End-to-end time to complete the request, in milliseconds (`latency_ms`)
- **TTFT**: Time to First Token, the time until the first output token is produced, in milliseconds (`ttft_ms`)
- **Acc Length**: Acceptance length for speculative decoding (MTP variants)
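As a worked example, the throughput figures in the sample benchmark entry from the Data Structure section can be approximately reproduced from its token counts and timings. The formulas below are an assumption that happens to match the sample values to within rounding; they are not taken from the benchmark code, and `derived_throughputs` is a hypothetical helper.

```python
import math

# Values copied from the sample entry in the Data Structure section.
bench = {
    "batch_size": 1,
    "input_len": 4096,
    "output_len": 512,
    "latency_ms": 2400.72,
    "ttft_ms": 191.32,
    "input_throughput": 21408.64,
    "output_throughput": 231.74,
    "overall_throughput": 1919.43,
}


def derived_throughputs(b):
    """Recompute throughputs (tokens/s) under the assumed definitions."""
    latency_s = b["latency_ms"] / 1000.0
    ttft_s = b["ttft_ms"] / 1000.0
    tokens_in = b["batch_size"] * b["input_len"]
    tokens_out = b["batch_size"] * b["output_len"]
    return {
        # Prefill: input tokens over time to first token.
        "input_throughput": tokens_in / ttft_s,
        # Decode: output tokens over the time remaining after the first token.
        "output_throughput": tokens_out / (latency_s - ttft_s),
        # Overall: all tokens over end-to-end latency.
        "overall_throughput": (tokens_in + tokens_out) / latency_s,
    }


derived = derived_throughputs(bench)
matches = {k: math.isclose(v, bench[k], rel_tol=1e-3) for k, v in derived.items()}
```

For this entry, overall throughput works out to (4096 + 512) tokens / 2.40072 s, which is about 1919.4 tokens per second.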
## Contributing
To add support for new metrics or visualizations:
1. Update `fetch_metrics.py` if data collection needs changes
2. Modify `app.js` to add new chart types or filters
3. Update `index.html` for UI changes
## Troubleshooting
**No data displayed**
- Check browser console for errors
- Verify that the GitHub API is accessible
- Try running with `server.py --fetch-on-start`
**API rate limits**
- Use a GitHub token for higher limits
- The server caches data for 5 minutes
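The 5-minute cache can be pictured as a simple time-to-live (TTL) wrapper around the fetch. The sketch below is illustrative only and is not the implementation in `server.py`; the `TTLCache` class and its parameters are hypothetical.

```python
import time


class TTLCache:
    """Cache a single loaded value and reload it once the TTL has elapsed."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader          # callable that fetches fresh data
        self._ttl = ttl_seconds        # 300 s = 5 minutes
        self._clock = clock            # injectable for testing
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # First call or stale entry: reload and reset the expiry.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value
```

With this pattern, repeated requests within the TTL window reuse the cached value, so the GitHub API is called at most once per window.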
**Charts not rendering**
- Ensure Chart.js is loading from the CDN
- Check for JavaScript errors in console