sky2 / configs /README.md
JustinTX's picture
Add files using upload-large-folder tool
725b3aa verified

βš™οΈ Configuration

All settings are YAML. Environment variables can be referenced with ${VAR} syntax in any string value.


πŸ“ Available Config Files

File Search Description
default.yaml Top-K Minimal starting template β€” good for first experiments
adaevolve.yaml AdaEvolve Full multi-island config with adaptive intensity, migration, paradigm breakthroughs, and ablation flags
evox.yaml EvoX Co-evolving solution generation and search strategies
openevolve_native.yaml OpenEvolve Native Native port of OpenEvolve's island-based MAP-Elites search with ring migration
llm_judge.yaml - Demonstrates LLM-as-a-judge evaluation (uses gpt-4o-mini for both generation and judging)
human_in_the_loop.yaml Top-K Enables the live monitor dashboard and human-in-the-loop feedback

Each file is a ready-to-copy template. Fill in the system_message with your problem description and you're good to go.


πŸ”§ Parameter Reference

Top-level

max_iterations: 100        # total evolution iterations
checkpoint_interval: 10    # save checkpoint every N iterations
log_level: "INFO"          # DEBUG / INFO / WARNING
log_dir: null              # directory for logs (default: outputs/)
random_seed: 42
language: null             # auto-detected from initial program ("python", "cpp", "java", etc.)
                           # set to "image" for image-generation mode
file_suffix: ".py"         # output file extension, auto-set from initial program at runtime
diff_based_generation: true # LLM receives diffs instead of full programs
max_solution_length: 60000 # max characters in a program; longer programs are trimmed

llm

llm:
  temperature: 0.7
  top_p: 0.95
  max_tokens: 32000
  timeout: 600           # seconds per LLM request
  retries: 3
  retry_delay: 5         # seconds between retries
  random_seed: null
  reasoning_effort: null # "low" / "medium" / "high" for o-series models

Model specification β€” use provider/model or a bare name (auto-detected for known prefixes):

Provider Format API key env var
OpenAI gpt-5, o3-mini OPENAI_API_KEY
Gemini gemini/gemini-2.0-flash GEMINI_API_KEY or GOOGLE_API_KEY
Anthropic claude-sonnet-4-6 or anthropic/claude-sonnet-4-6 ANTHROPIC_API_KEY
DeepSeek deepseek-chat or deepseek/deepseek-chat DEEPSEEK_API_KEY
Mistral mistral-large or mistral/mistral-large MISTRAL_API_KEY
Ollama / vLLM ollama/llama3, vllm/my-model β€”
Single model, multi-model pool, separate pools, and API override examples

Single model (shorthand):

llm:
  primary_model: "gpt-5"
  primary_model_weight: 1.0

Multi-model pool (weighted sampling):

llm:
  models:
    - name: "gpt-5"
      weight: 0.8
    - name: "anthropic/claude-opus-4-6"
      weight: 0.2

Separate model pools β€” by default all pools share models; override individually:

llm:
  models:
    - name: "gpt-5"
  evaluator_models:   # used by LLM-as-a-judge (evaluator.use_llm_feedback)
    - name: "gpt-4o-mini"
  guide_models:       # used for paradigm breakthroughs and variation labels
    - name: "gpt-4o-mini"

Override API base:

llm:
  api_base: "https://my-proxy.example.com/v1"
  api_key: "${MY_API_KEY}"

You can also set OPENAI_API_BASE or OPENAI_BASE_URL env vars to override the config globally.

search

search:
  type: "adaevolve" # evox | openevolve_native | beam_search | best_of_n | topk
  num_context_programs: 4   # context programs shown to LLMs as examples 
topk β€” no extra settings

Always picks the single best program as parent and the next K as additional context programs.

adaevolve β€” full settings
search:
  type: "adaevolve"
  database:
    population_size: 20
    num_islands: 2

    # Adaptive intensity
    decay: 0.9              # EMA weight for accumulated signal G
    intensity_min: 0.15     # min intensity (exploitation)
    intensity_max: 0.5      # max intensity (exploration)

    # Migration
    migration_interval: 15  # migrate every N iterations
    migration_count: 5      # top programs to copy between islands

    # Archive diversity
    fitness_weight: 1.0     # fitness contribution to elite score
    novelty_weight: 0.0     # novelty contribution to elite score
    diversity_strategy: "code"  # "code" / "metric" / "hybrid"

    # Dynamic island spawning
    use_dynamic_islands: true
    max_islands: 5
    spawn_productivity_threshold: 0.015
    spawn_cooldown_iterations: 30

    # Paradigm breakthrough
    use_paradigm_breakthrough: true
    paradigm_window_size: 10
    paradigm_improvement_threshold: 0.12
    paradigm_num_to_generate: 3
    paradigm_max_uses: 2
    paradigm_max_tried: 10

    # Error retry
    enable_error_retry: true
    max_error_retries: 2

    # Ablation flags (set false to disable)
    use_adaptive_search: true   # G-based intensity; false β†’ use fixed_intensity
    use_ucb_selection: true     # UCB island selection; false β†’ round-robin
    use_migration: true
    use_unified_archive: true   # quality-diversity archive; false β†’ simple list
evox
search:
  type: "evox"
  database:
    auto_generate_variation_operators: true  # by default generate variation operator once 
beam_search
search:
  type: "beam_search"
  database:
    beam_width: 5
    beam_selection_strategy: "diversity_weighted"  # diversity_weighted / stochastic / round_robin / best
    beam_diversity_weight: 0.3
    beam_temperature: 1.0
    beam_depth_penalty: 0.0
best_of_n
search:
  type: "best_of_n"
  database:
    best_of_n: 5  # reuse the same parent for N iterations, then switch to current best
openevolve_native β€” MAP-Elites + island-based evolutionary search

Native port of OpenEvolve's search algorithm. Uses MAP-Elites quality-diversity grid per island with ring-topology migration.

search:
  type: "openevolve_native"
  num_context_programs: 5
  database:
    num_islands: 5
    population_size: 40
    archive_size: 100
    exploration_ratio: 0.2          # P(explore) β€” random from current island
    exploitation_ratio: 0.7         # P(exploit) β€” archive elite, prefer current island
    # remaining 0.1 = P(random)    β€” any program in population
    elite_selection_ratio: 0.1      # fraction of additional context programs from top elites
    feature_dimensions: ["complexity", "diversity"]
    feature_bins: 10
    diversity_reference_size: 20
    migration_interval: 10          # migrate every N island-generations
    migration_rate: 0.1             # fraction of island to migrate
    random_seed: 42

See skydiscover/search/openevolve_native/README.md for architecture details.

prompt

prompt:
  system_message: |
    You are an expert coder helping to improve programs through evolution.

  # system_message can also be a path to a .txt file (relative to the config):
  # system_message: "system_prompt.txt"

  evaluator_system_message: |   # system message for the LLM judge
    You are a strict code quality judge. ...  # only used when evaluator.use_llm_feedback: true

  suggest_simplification_after_chars: 500  # threshold for program labeling in prompts

evaluator

evaluator:
  timeout: 360            # seconds before killing evaluate() subprocess
  max_retries: 3

  # Cascade evaluation: skip expensive full eval on low-scoring programs
  cascade_evaluation: true
  cascade_thresholds: [0.3, 0.6]

  # Prepend evaluator source code (or instruction.md for Harbor tasks)
  # to the LLM system message so the model can see how solutions are scored.
  inject_evaluator_context: false  # default false

  # LLM-as-a-judge
  use_llm_feedback: false
  llm_feedback_weight: 1.0  # relative weight of LLM score in combined_score

agentic

Multi-turn agent that can read files and search the codebase before generating solutions. Enable via --agentic on the CLI or agentic=True in run_discovery(). The codebase root defaults to the initial program's directory; override it here if needed.

agentic:
  enabled: false
  codebase_root: null      # defaults to initial program's directory when omitted
  max_steps: 5             # max tool-call turns per iteration
  per_step_timeout: 60.0   # seconds per tool call
  overall_timeout: 300.0   # total seconds for one agentic generation
  max_context_chars: 400000
  max_file_chars: 50000
  max_files_read: 20
  max_search_results: 50

monitor

Live dashboard served over WebSocket.

monitor:
  enabled: false
  port: 8765
  host: "0.0.0.0"
  max_solution_length: 10000

  # AI-generated run summaries
  summary_model: "gpt-5-mini"
  summary_api_key: null    # falls back to OPENAI_API_KEY
  summary_top_k: 3
  summary_interval: 0      # auto-generate every N programs (0 = manual only)

human_feedback

human_feedback_enabled: false
human_feedback_file: null       # path to a file containing feedback text
human_feedback_mode: "append"   # "append" or "replace"

πŸš€ Getting Started

1. Pick a template β€” copy one of the config files above into your project directory:

cp configs/evox.yaml my_config.yaml

2. Fill in the system message β€” this is the most important field. Tell the LLM what problem it's solving:

prompt:
  system_message: |
    You are an expert at optimizing circle packing algorithms.
    Maximize the number of non-overlapping circles in a unit square.

3. Run with your config:

uv run skydiscover-run initial_program.py evaluator.py -c my_config.yaml -i 100

You can override any config value from the CLI β€” for example, switch the search algorithm or model without editing the YAML:

uv run skydiscover-run initial_program.py evaluator.py \
  -c my_config.yaml \
  --model gemini/gemini-3-pro-preview \
  -i 50