# SGLang Performance Dashboard

A web-based dashboard for visualizing SGLang nightly test performance metrics.

## Features

- **Performance Trends**: View throughput, latency, and TTFT trends over time
- **Model Comparison**: Compare performance across different models and configurations
- **Filtering**: Filter by GPU configuration, model, variant, and batch size
- **Interactive Charts**: Zoom, pan, and hover for detailed metrics
- **Run History**: View recent benchmark runs with links to GitHub Actions

## Quick Start

### Option 1: Run with Local Server (Recommended)

For live data from GitHub Actions artifacts:

```bash
# Install requirements
pip install requests

# Run the server
python server.py --fetch-on-start

# Visit http://localhost:8000
```

The server provides:
- Automatic fetching of metrics from GitHub
- Caching to reduce API calls
- `/api/metrics` endpoint for the frontend

### Option 2: Fetch Data Manually

Use the fetch script to download metrics data:

```bash
# Fetch last 30 days of metrics
python fetch_metrics.py --output metrics_data.json

# Fetch a specific run
python fetch_metrics.py --run-id 21338741812 --output single_run.json

# Fetch only scheduled (nightly) runs
python fetch_metrics.py --scheduled-only --days 7
```

## GitHub Token

To download artifacts from GitHub, you need authentication:

1. **Using `gh` CLI** (recommended):
   ```bash
   gh auth login
   ```

2. **Using environment variable**:
   ```bash
   export GITHUB_TOKEN=your_token_here
   ```

Without a token, the dashboard will show run metadata but not detailed benchmark results.
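
A minimal sketch of how a script could pick up a token from either source described above. The function name `resolve_github_token` is hypothetical and not taken from `fetch_metrics.py` or `server.py`, and the precedence (environment variable first, then `gh auth token`) is an assumption. The actual scripts may resolve credentials differently.

```python
import os
import shutil
import subprocess
from typing import Optional


def resolve_github_token() -> Optional[str]:
    """Return a GitHub token, or None if neither source yields one.

    Hypothetical helper: checks GITHUB_TOKEN first, then asks the gh CLI.
    """
    token = os.environ.get("GITHUB_TOKEN", "").strip()
    if token:
        return token

    # Fall back to the gh CLI only if it is installed.
    if shutil.which("gh") is None:
        return None
    try:
        result = subprocess.run(
            ["gh", "auth", "token"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None

    token = result.stdout.strip()
    return token if result.returncode == 0 and token else None
```

Returning `None` rather than raising lets a caller fall back to the metadata-only mode mentioned above.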

## Data Structure

The metrics JSON has this structure:

```json
{
  "run_id": "21338741812",
  "run_date": "2026-01-25T22:24:02.090218+00:00",
  "commit_sha": "5cdb391...",
  "branch": "main",
  "results": [
    {
      "gpu_config": "8-gpu-h200",
      "partition": 0,
      "model": "deepseek-ai/DeepSeek-V3.1",
      "variant": "TP8+MTP",
      "benchmarks": [
        {
          "batch_size": 1,
          "input_len": 4096,
          "output_len": 512,
          "latency_ms": 2400.72,
          "input_throughput": 21408.64,
          "output_throughput": 231.74,
          "overall_throughput": 1919.43,
          "ttft_ms": 191.32,
          "acc_length": 3.19
        }
      ]
    }
  ]
}
```
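
As an illustration of how this nesting can be consumed, the sketch below flattens one run object into one row per benchmark entry, which is a convenient shape for charting or filtering. The helper name `iter_benchmark_rows` is hypothetical, and the sketch assumes a single run object as shown above. If the file produced by `fetch_metrics.py` holds a list of runs, apply the helper to each element.

```python
import json
from typing import Any, Dict, Iterator


def iter_benchmark_rows(run: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat dict per benchmark entry, tagged with run and result context."""
    for result in run.get("results", []):
        context = {
            "run_id": run.get("run_id"),
            "run_date": run.get("run_date"),
            "gpu_config": result.get("gpu_config"),
            "model": result.get("model"),
            "variant": result.get("variant"),
        }
        for bench in result.get("benchmarks", []):
            yield {**context, **bench}


# Trimmed copy of the sample record above.
sample = json.loads("""
{
  "run_id": "21338741812",
  "run_date": "2026-01-25T22:24:02.090218+00:00",
  "results": [
    {
      "gpu_config": "8-gpu-h200",
      "model": "deepseek-ai/DeepSeek-V3.1",
      "variant": "TP8+MTP",
      "benchmarks": [
        {"batch_size": 1, "output_throughput": 231.74, "ttft_ms": 191.32}
      ]
    }
  ]
}
""")

rows = list(iter_benchmark_rows(sample))
print(rows[0]["model"], rows[0]["batch_size"], rows[0]["output_throughput"])
```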

## Deployment

### GitHub Pages

The dashboard can be deployed to GitHub Pages for public access:

1. Copy the dashboard files to `docs/performance_dashboard/`
2. Enable GitHub Pages in repository settings
3. Set up a GitHub Action to periodically update metrics data

### Self-Hosted

For a self-hosted deployment with live data:

1. Set up a server running `server.py`
2. Configure a cron job or systemd timer to refresh data
3. Optionally put it behind a reverse proxy such as nginx or Caddy for TLS

## Metrics Explained

- **Overall Throughput**: Total tokens (input + output) processed per second
- **Input Throughput**: Input tokens processed per second (prefill speed)
- **Output Throughput**: Output tokens generated per second (decode speed)
- **Latency**: End-to-end time to complete the request (reported as `latency_ms`)
- **TTFT**: Time to First Token, the time until the first output token is produced (reported as `ttft_ms`)
- **Acc Length**: Acceptance length for speculative decoding (MTP variants)
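
The sample record in the Data Structure section is numerically consistent with the relations sketched below for a single request (`batch_size` of 1). This is inferred from that one sample and is not a confirmed description of how the benchmark computes these fields, so treat the formulas as assumptions.

```python
# Values copied from the sample record in the Data Structure section.
input_len, output_len = 4096, 512
latency_s = 2400.72 / 1000.0  # latency_ms converted to seconds
ttft_s = 191.32 / 1000.0      # ttft_ms converted to seconds

# Assumed relations (inferred from the sample, batch_size == 1):
overall = (input_len + output_len) / latency_s  # all tokens / end-to-end time
prefill = input_len / ttft_s                    # input tokens / time to first token
decode = output_len / (latency_s - ttft_s)      # output tokens / time after first token

print(f"overall={overall:.2f} prefill={prefill:.2f} decode={decode:.2f}")
```

Under these assumptions the computed values land close to the sample's `overall_throughput`, `input_throughput`, and `output_throughput`, with small differences that are plausibly due to rounding of the reported timings.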

## Contributing

To add support for new metrics or visualizations:

1. Update `fetch_metrics.py` if data collection needs changes
2. Modify `app.js` to add new chart types or filters
3. Update `index.html` for UI changes

## Troubleshooting

**No data displayed**
- Check browser console for errors
- Verify that the GitHub API is accessible
- Try running with `server.py --fetch-on-start`

**API rate limits**
- Use a GitHub token for higher limits
- The server caches data for 5 minutes
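
To illustrate what a 5-minute cache means for API usage, here is a minimal time-to-live cache sketch. The class name `TTLCache` and its parameters are hypothetical and are not taken from `server.py`, whose caching may be implemented differently.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache a single value and refetch it once it is older than ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 300.0,  # 5 minutes, matching the note above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self) -> Any:
        """Return the cached value, calling fetch() only when the entry is missing or stale."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

With this pattern, repeated page loads within the TTL window trigger at most one upstream fetch, which is why a cache helps with rate limits. The injectable `clock` exists only to make the expiry logic easy to test.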

**Charts not rendering**
- Ensure Chart.js is loading from the CDN
- Check for JavaScript errors in console