# Profiling Quick Start
## 1. Set Environment Variables
```bash
export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
```
Or add them to a `.env` file in the project root.
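The `.env` alternative could look like this — a sketch assuming a standard `KEY=value` dotenv format (some loaders also accept an `export` prefix; check which one your setup uses):

```bash
# .env (project root) — hypothetical contents; the key values are placeholders
LANGFUSE_PUBLIC_KEY="pk-..."
LANGFUSE_SECRET_KEY="sk-..."
```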
## 2. Run an Experiment
```bash
# Run default experiment (fast vs balanced)
./system_tests/profiling/run_experiment.sh
# Compare different providers
./system_tests/profiling/run_experiment.sh --config providers_comparison.yaml
# Compare all modes for one provider
./system_tests/profiling/run_experiment.sh --config fast_vs_accurate.yaml
# Full matrix: providers × modes
./system_tests/profiling/run_experiment.sh --config full_matrix_comparison.yaml
```
## 3. View Results
```bash
# Start HTTP server and open browser
./system_tests/profiling/serve.sh --open
# Or just start server (visit http://localhost:8080/comparison.html)
./system_tests/profiling/serve.sh
```
## Comparison Types
### Mode Comparison (Same Provider)
Compare fast vs balanced vs accurate modes using the same LLM provider.
Example output files: `fast_20250930.json`, `balanced_20250930.json`, `accurate_20250930.json`
### Provider Comparison (Same Mode)
Compare OpenAI vs Azure vs WatsonX using the same mode (e.g., balanced).
Example output files: `openai_balanced_20250930.json`, `azure_balanced_20250930.json`, `watsonx_balanced_20250930.json`
### Full Matrix Comparison
Compare all combinations of providers and modes (2 providers × 2 modes = 4 experiments).
Example output files: `openai_fast_20250930.json`, `openai_balanced_20250930.json`, `azure_fast_20250930.json`, `azure_balanced_20250930.json`
## Available Scripts
| Script | Purpose |
|--------|---------|
| `run_experiment.sh` | Run profiling experiments with YAML config |
| `serve.sh` | Start HTTP server to view results |
| `bin/run_profiling.sh` | Lower-level profiling script with CLI args |
| `bin/profile_digital_sales_tasks.py` | Core Python profiling tool |
## Configuration Files
Located in `config/`:
- `default_experiment.yaml` - Fast vs Balanced comparison
- `fast_vs_accurate.yaml` - Fast vs Accurate comparison
- `providers_comparison.yaml` - OpenAI vs Azure vs WatsonX (same mode)
- `full_matrix_comparison.yaml` - Full provider × mode matrix
- `.secrets.yaml` - Your Langfuse credentials (git-ignored)
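A possible shape for `.secrets.yaml`, reusing the Langfuse keys from step 1 — purely hypothetical field names, so the actual schema in your repo may differ:

```yaml
# .secrets.yaml — hypothetical layout; field names are assumptions
langfuse:
  public_key: "pk-..."
  secret_key: "sk-..."
```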
## Example: Provider Comparison
Create or use `config/providers_comparison.yaml`:
```yaml
experiment:
  name: "providers_comparison"
  runs:
    - name: "openai_balanced"
      test_id: "settings.openai.toml:balanced:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/openai_balanced_{{timestamp}}.json"
    - name: "azure_balanced"
      test_id: "settings.azure.toml:balanced:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/azure_balanced_{{timestamp}}.json"
```
Then run:
```bash
./system_tests/profiling/run_experiment.sh --config providers_comparison.yaml
./system_tests/profiling/serve.sh --open
```
## Color Coding in Charts
The comparison HTML automatically color-codes experiments:
**Modes:**
- Fast = Green 🟢
- Balanced = Blue 🔵
- Accurate = Orange 🟠
**Providers:**
- OpenAI = Teal 🟦
- Azure = Azure Blue 💙
- WatsonX = IBM Blue 🔵
**Combined Labels** (e.g., `openai_balanced`) get colors based on provider first, then mode.
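The precedence described above (provider match first, then mode) can be sketched as a simple pattern match — a hypothetical illustration of the rule, not the viewer's actual code:

```bash
# Provider patterns are checked before mode patterns, so a combined
# label like "openai_balanced" takes the provider's color.
label="openai_balanced"
case "$label" in
  openai*)   color="teal" ;;
  azure*)    color="azure-blue" ;;
  watsonx*)  color="ibm-blue" ;;
  fast*)     color="green" ;;
  balanced*) color="blue" ;;
  accurate*) color="orange" ;;
esac
echo "$color"  # prints "teal"
```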
## Directory Structure
```
system_tests/profiling/
β”œβ”€β”€ run_experiment.sh # Main entry point
β”œβ”€β”€ serve.sh # View results
β”œβ”€β”€ bin/ # Internal scripts
β”œβ”€β”€ config/ # YAML configurations
β”œβ”€β”€ experiments/ # Results + HTML viewer
└── reports/ # Individual reports
```
## Tips
- 💡 The HTML viewer auto-loads all JSON files in `experiments/`
- 💡 Naming format: `{provider}_{mode}_{timestamp}.json` or `{mode}_{timestamp}.json`
- 💡 CLI args override YAML config settings
- 💡 Use `{{timestamp}}` in output paths for unique files
- 💡 A retry mechanism handles Langfuse propagation delays
- 💡 Stop the server with Ctrl+C
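Assuming `{{timestamp}}` expands to a `YYYYMMDD` date stamp (an assumption based on sample filenames like `openai_balanced_20250930.json`), the substitution is roughly equivalent to:

```bash
# Hypothetical equivalent of the {{timestamp}} placeholder — the exact
# format is an assumption; only the shape of the output path is from the docs.
ts="$(date +%Y%m%d)"
output="experiments/openai_balanced_${ts}.json"
echo "$output"
```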
For full documentation, see `README.md`.