agentbee / dev /dev_260104_09_ui_based_llm_selection.md

mangubee

Update Dev

d93842c 2 months ago

[dev_260104_09] Cloud Testing UX - UI-Based LLM Selection

Date: 2026-01-04
Type: Feature
Status: Resolved
Stage: [Stage 5: Performance Optimization]

Problem Description

Testing different LLM providers in the HF Spaces cloud requires manually changing environment variables in the Space settings, then waiting for a rebuild. This makes iteration slow and the testing UX poor.

Key Decisions

- UI dropdowns: Add provider selection in both the Test & Debug and Full Evaluation tabs
- Environment override: Set os.environ directly from the UI selection (overrides .env and HF Space env vars)
- Toggle fallback: Checkbox to enable/disable fallback behavior
- Default strategy: Groq for testing, fallback enabled for production
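
The environment override decision can be sketched as a small helper. This is only an illustration, not the code in app.py: the helper name `apply_llm_selection`, the lowercase provider values, and the "true"/"false" strings are assumptions. Only the two environment variable names and the four provider labels come from this log.

```python
import os

# Provider labels as listed for the dropdown in this log.
PROVIDERS = ("Gemini", "HuggingFace", "Groq", "Claude")


def apply_llm_selection(llm_provider: str, enable_fallback: bool) -> dict:
    """Hypothetical helper: copy the UI selection into os.environ."""
    if llm_provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {llm_provider!r}")
    # Assumption: the agent expects a lowercase provider name
    # and the strings "true" / "false" for the fallback flag.
    os.environ["LLM_PROVIDER"] = llm_provider.lower()
    os.environ["ENABLE_LLM_FALLBACK"] = "true" if enable_fallback else "false"
    # Return what was written so the caller can show it in diagnostics.
    return {
        "LLM_PROVIDER": os.environ["LLM_PROVIDER"],
        "ENABLE_LLM_FALLBACK": os.environ["ENABLE_LLM_FALLBACK"],
    }
```

Because os.environ is process-wide, a value written this way replaces whatever the process started with, which is why no rebuild is needed. The same property means concurrent users of one Space would share the last selection made.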

Outcome

Cloud testing is now much faster: all 4 providers can be tested directly from the HF Space UI without a rebuild.

Deliverables:

app.py - Added UI dropdowns and checkboxes for LLM provider selection in both tabs

Changelog

What was changed:

app.py (~30 lines added/modified)
- Updated test_single_question() function signature - Added llm_provider and enable_fallback parameters
  - Sets os.environ["LLM_PROVIDER"] from UI selection (overrides .env and HF Space env vars)
  - Sets os.environ["ENABLE_LLM_FALLBACK"] from UI checkbox
  - Adds provider info to diagnostics output
- Updated run_and_submit_all() function signature - Added llm_provider and enable_fallback parameters
  - Reordered params: UI inputs first, profile last (optional)
  - Sets environment variables before agent initialization
- Added UI components in "Test & Debug" tab:
  - llm_provider_dropdown - Select from: Gemini, HuggingFace, Groq, Claude (default: Groq)
  - enable_fallback_checkbox - Toggle fallback behavior (default: false for testing)
- Added UI components in "Full Evaluation" tab:
  - eval_llm_provider_dropdown - Select LLM for all questions (default: Groq)
  - eval_enable_fallback_checkbox - Toggle fallback (default: true for production)
- Updated button click handlers to pass new UI inputs to functions
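
As a rough sketch of how the Full Evaluation changes above might fit together, the following shows a handler with the reordered parameters and the Gradio wiring that passes the new inputs to it. The function body, the labels, the button and output names, the environment value formats, and the `username` lookup on `profile` are assumptions made for illustration. The real app.py may differ, and the Test & Debug tab would follow the same pattern with its own defaults.

```python
import os


def run_and_submit_all(llm_provider: str, enable_fallback: bool, profile=None) -> str:
    """Sketch of the handler: UI inputs first, optional profile last."""
    # Set environment variables before the agent would be initialized.
    os.environ["LLM_PROVIDER"] = llm_provider.lower()  # assumed value format
    os.environ["ENABLE_LLM_FALLBACK"] = str(enable_fallback).lower()  # assumed "true"/"false"
    # Assumption: the profile object, when present, exposes a username.
    user = getattr(profile, "username", None) or "anonymous"
    # The real function would run the agent on all questions here.
    return f"Provider: {llm_provider} | Fallback: {enable_fallback} | User: {user}"


def build_ui():
    """Build the Full Evaluation tab; gradio is imported lazily."""
    import gradio as gr

    providers = ["Gemini", "HuggingFace", "Groq", "Claude"]
    with gr.Blocks() as demo:
        with gr.Tab("Full Evaluation"):
            eval_llm_provider_dropdown = gr.Dropdown(
                choices=providers, value="Groq", label="LLM Provider"
            )
            eval_enable_fallback_checkbox = gr.Checkbox(
                value=True, label="Enable fallback"
            )
            run_button = gr.Button("Run Evaluation")  # hypothetical name
            status_output = gr.Textbox(label="Status")  # hypothetical name
            # Pass the new UI inputs to the handler in signature order.
            run_button.click(
                fn=run_and_submit_all,
                inputs=[eval_llm_provider_dropdown, eval_enable_fallback_checkbox],
                outputs=status_output,
            )
    return demo
```

Putting the UI inputs first keeps the `inputs` list aligned with the positional parameters, while the trailing `profile` parameter can stay optional.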