[dev_260104_09] Cloud Testing UX - UI-Based LLM Selection
Date: 2026-01-04 | Type: Feature | Status: Resolved | Stage: [Stage 5: Performance Optimization]
Problem Description
Testing different LLM providers on HF Spaces requires manually changing environment variables in the Space settings, then waiting for a rebuild. Iteration is slow and the UX is poor.
Key Decisions
- UI dropdowns: Add provider selection in both Test & Debug and Full Evaluation tabs
- Environment override: Set os.environ directly from UI selection (overrides .env and HF Space env vars)
- Toggle fallback: Checkbox to enable/disable fallback behavior
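- Default strategy: Groq as the default provider in both tabs; fallback disabled by default for testing (Test & Debug), enabled by default for production (Full Evaluation)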
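
A minimal sketch of the environment override, assuming the UI passes the dropdown label and checkbox state straight through. The helper name `apply_llm_settings`, the `PROVIDERS` tuple, and the "true"/"false" string encoding are illustrative assumptions; only the two environment variable names come from this log.

```python
import os

# Provider labels as listed in this log. Whether the agent expects these
# exact strings in LLM_PROVIDER is an assumption.
PROVIDERS = ("Gemini", "HuggingFace", "Groq", "Claude")


def apply_llm_settings(llm_provider: str, enable_fallback: bool) -> dict:
    """Hypothetical helper: write the UI selection into os.environ."""
    if llm_provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {llm_provider!r}")
    settings = {
        "LLM_PROVIDER": llm_provider,
        # os.environ only accepts strings, so the checkbox state is encoded
        # as "true"/"false" (assumed format).
        "ENABLE_LLM_FALLBACK": "true" if enable_fallback else "false",
    }
    # Assigning here replaces, for the current process, any value that was
    # loaded earlier from .env or from the Space settings.
    os.environ.update(settings)
    return settings
```

Calling `apply_llm_settings("Groq", False)` before the agent is constructed would mirror the Test & Debug defaults described above.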
Outcome
Cloud testing is now much faster: all 4 providers can be tested directly from the HF Space UI without a rebuild.
Deliverables:
- app.py - Added UI dropdowns and checkboxes for LLM provider selection in both tabs
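
The wiring for one tab might look roughly like the sketch below, assuming a Gradio Blocks app. The handler `handle_test_click` is a hypothetical stand-in for `test_single_question()`, whose real body and full signature are not shown in this log; `build_demo`, `question_box`, `test_button`, `diagnostics`, and all widget labels are also assumptions.

```python
import os

PROVIDERS = ["Gemini", "HuggingFace", "Groq", "Claude"]


def handle_test_click(question: str, llm_provider: str, enable_fallback: bool) -> str:
    """Hypothetical stand-in for test_single_question()."""
    # Set the environment before any agent would be initialized.
    os.environ["LLM_PROVIDER"] = llm_provider
    os.environ["ENABLE_LLM_FALLBACK"] = str(enable_fallback).lower()
    # Include the provider choice in the diagnostics text.
    return f"Provider: {llm_provider} | Fallback: {enable_fallback} | Question: {question}"


def build_demo():
    # Imported lazily so the handler can be exercised without Gradio installed.
    import gradio as gr

    with gr.Blocks() as demo:
        with gr.Tab("Test & Debug"):
            question_box = gr.Textbox(label="Question")
            llm_provider_dropdown = gr.Dropdown(
                choices=PROVIDERS, value="Groq", label="LLM Provider"
            )
            enable_fallback_checkbox = gr.Checkbox(
                value=False, label="Enable fallback"
            )
            test_button = gr.Button("Run test")
            diagnostics = gr.Textbox(label="Diagnostics")
            # The click handler receives the new UI inputs as extra arguments.
            test_button.click(
                fn=handle_test_click,
                inputs=[question_box, llm_provider_dropdown, enable_fallback_checkbox],
                outputs=diagnostics,
            )
    return demo
```

Running `build_demo().launch()` would start the app locally. A "Full Evaluation" tab would follow the same pattern with its own dropdown and a checkbox defaulting to `True`.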
Changelog
What was changed:
- app.py (~30 lines added/modified)
  - Updated test_single_question() function signature
    - Added llm_provider and enable_fallback parameters
    - Sets os.environ["LLM_PROVIDER"] from UI selection (overrides .env and HF Space env vars)
    - Sets os.environ["ENABLE_LLM_FALLBACK"] from UI checkbox
    - Adds provider info to diagnostics output
  - Updated run_and_submit_all() function signature
    - Added llm_provider and enable_fallback parameters
    - Reordered params: UI inputs first, profile last (optional)
    - Sets environment variables before agent initialization
  - Added UI components in "Test & Debug" tab:
    - llm_provider_dropdown - Select from: Gemini, HuggingFace, Groq, Claude (default: Groq)
    - enable_fallback_checkbox - Toggle fallback behavior (default: false for testing)
  - Added UI components in "Full Evaluation" tab:
    - eval_llm_provider_dropdown - Select LLM for all questions (default: Groq)
    - eval_enable_fallback_checkbox - Toggle fallback (default: true for production)
  - Updated button click handlers to pass new UI inputs to functions
- Updated