# [dev_260104_09] Cloud Testing UX - UI-Based LLM Selection
**Date:** 2026-01-04
**Type:** Feature
**Status:** Resolved
**Stage:** [Stage 5: Performance Optimization]
## Problem Description
Testing different LLM providers on HF Spaces requires manually changing environment variables in the Space settings and then waiting for a rebuild. Iteration is slow and the UX is poor.
---
## Key Decisions
- **UI dropdowns:** Add provider selection in both Test & Debug and Full Evaluation tabs
- **Environment override:** Set os.environ directly from UI selection (overrides .env and HF Space env vars)
- **Toggle fallback:** Checkbox to enable/disable fallback behavior
- **Default strategy:** Groq for testing, fallback enabled for production
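
The environment override can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the helper name `apply_llm_selection`, the `PROVIDER_ENV_VALUES` mapping, and the lowercase / `"true"`/`"false"` value formats are assumptions. Only the two variable names `LLM_PROVIDER` and `ENABLE_LLM_FALLBACK` come from this log.

```python
import os

# Assumption: the UI shows display names while the agent reads lowercase values.
# The actual values expected by the agent may differ.
PROVIDER_ENV_VALUES = {
    "Gemini": "gemini",
    "HuggingFace": "huggingface",
    "Groq": "groq",
    "Claude": "claude",
}


def apply_llm_selection(llm_provider: str, enable_fallback: bool) -> dict:
    """Write the UI selection into os.environ (hypothetical helper).

    Assigning to os.environ replaces whatever value the process inherited
    from .env or the Space settings, but only for the current process and
    only for code that reads the variable after this call.
    """
    if llm_provider not in PROVIDER_ENV_VALUES:
        raise ValueError(f"Unknown provider: {llm_provider!r}")

    os.environ["LLM_PROVIDER"] = PROVIDER_ENV_VALUES[llm_provider]
    os.environ["ENABLE_LLM_FALLBACK"] = "true" if enable_fallback else "false"

    # Return what was set so the caller can include it in diagnostics output.
    return {
        "LLM_PROVIDER": os.environ["LLM_PROVIDER"],
        "ENABLE_LLM_FALLBACK": os.environ["ENABLE_LLM_FALLBACK"],
    }
```

Because the assignment only affects later reads, the override has to run before the agent is initialized, and any component that cached the provider at import time would not pick up the change.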
---
## Outcome
Cloud testing is now much faster: all 4 providers can be tested directly from the HF Space UI without a rebuild.
**Deliverables:**
- `app.py` - Added UI dropdowns and checkboxes for LLM provider selection in both tabs
## Changelog
**What was changed:**
- **app.py** (~30 lines added/modified)
  - Updated `test_single_question()` function signature - Added `llm_provider` and `enable_fallback` parameters
    - Sets `os.environ["LLM_PROVIDER"]` from UI selection (overrides .env and HF Space env vars)
    - Sets `os.environ["ENABLE_LLM_FALLBACK"]` from UI checkbox
    - Adds provider info to diagnostics output
  - Updated `run_and_submit_all()` function signature - Added `llm_provider` and `enable_fallback` parameters
    - Reordered params: UI inputs first, profile last (optional)
    - Sets environment variables before agent initialization
  - Added UI components in "Test & Debug" tab:
    - `llm_provider_dropdown` - Select from: Gemini, HuggingFace, Groq, Claude (default: Groq)
    - `enable_fallback_checkbox` - Toggle fallback behavior (default: false for testing)
  - Added UI components in "Full Evaluation" tab:
    - `eval_llm_provider_dropdown` - Select LLM for all questions (default: Groq)
    - `eval_enable_fallback_checkbox` - Toggle fallback (default: true for production)
  - Updated button click handlers to pass new UI inputs to functions
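
The wiring described above might look roughly like the sketch below. It is an illustration under assumptions, not the actual `app.py`: `handle_test_click` and `handle_eval_click` are hypothetical stand-ins for `test_single_question()` and `run_and_submit_all()`, the labels and output textboxes are invented, the lowercase env value format is assumed, and the optional profile parameter is left out. The component names and defaults follow the changelog.

```python
import os

PROVIDERS = ["Gemini", "HuggingFace", "Groq", "Claude"]


def _set_llm_env(llm_provider: str, enable_fallback: bool) -> str:
    """Apply the UI selection to os.environ and return a diagnostics line."""
    # Assumption: the agent expects a lowercase provider name.
    os.environ["LLM_PROVIDER"] = llm_provider.lower()
    os.environ["ENABLE_LLM_FALLBACK"] = "true" if enable_fallback else "false"
    return f"Provider: {llm_provider} | Fallback: {enable_fallback}"


def handle_test_click(question: str, llm_provider: str, enable_fallback: bool) -> str:
    """Hypothetical stand-in for test_single_question()."""
    info = _set_llm_env(llm_provider, enable_fallback)
    # The real function would initialize the agent and answer the question here.
    return f"{info}\nQuestion: {question}"


def handle_eval_click(llm_provider: str, enable_fallback: bool) -> str:
    """Hypothetical stand-in for run_and_submit_all() (profile omitted)."""
    # Environment variables are set before any agent initialization.
    return _set_llm_env(llm_provider, enable_fallback)


def build_demo():
    # Imported lazily so the handlers above can be exercised without Gradio.
    import gradio as gr

    with gr.Blocks() as demo:
        with gr.Tab("Test & Debug"):
            question_input = gr.Textbox(label="Question")
            llm_provider_dropdown = gr.Dropdown(
                choices=PROVIDERS, value="Groq", label="LLM Provider"
            )
            enable_fallback_checkbox = gr.Checkbox(
                value=False, label="Enable fallback"
            )
            test_button = gr.Button("Run test")
            test_output = gr.Textbox(label="Diagnostics")
            # UI inputs are passed positionally in the order listed.
            test_button.click(
                fn=handle_test_click,
                inputs=[question_input, llm_provider_dropdown, enable_fallback_checkbox],
                outputs=test_output,
            )

        with gr.Tab("Full Evaluation"):
            eval_llm_provider_dropdown = gr.Dropdown(
                choices=PROVIDERS, value="Groq", label="LLM Provider"
            )
            eval_enable_fallback_checkbox = gr.Checkbox(
                value=True, label="Enable fallback"
            )
            eval_button = gr.Button("Run evaluation")
            eval_output = gr.Textbox(label="Status")
            eval_button.click(
                fn=handle_eval_click,
                inputs=[eval_llm_provider_dropdown, eval_enable_fallback_checkbox],
                outputs=eval_output,
            )
    return demo
```

Calling `build_demo().launch()` would start the app locally. Since `os.environ` is shared by the whole process, concurrent users of the same Space would overwrite each other's selection; that is likely acceptable for a single-tester workflow but worth keeping in mind.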