# Shinka Configuration Guide ⚙️

This guide covers Shinka's configuration system: its parameters, file structure, and advanced configuration patterns.

## Table of Contents

1. [Core Configuration Components](#core-configuration-components)
2. [Configuration Parameters](#configuration-parameters)
3. [Pre-configured Variants](#pre-configured-variants)
4. [Configuration Structure](#configuration-structure)
5. [Creating Custom Configurations](#creating-custom-configurations)
6. [Advanced Configuration Patterns](#advanced-configuration-patterns)
7. [Configuration Examples](#configuration-examples)
8. [Configuration Best Practices](#configuration-best-practices)

## Core Configuration Components

### 1. Evolution Config (`evo_config`)

Controls the core evolutionary algorithm parameters:

```yaml
evo_config:
  _target_: shinka.core.EvolutionConfig
  num_generations: 20          # Number of evolution generations
  max_parallel_jobs: 1         # Maximum parallel evaluations
  max_patch_attempts: 10       # Max attempts to generate valid patches

  # LLM Configuration
  llm_models:                  # List of LLM models for mutations
    - "azure-gpt-4.1"
  llm_dynamic_selection: null  # Dynamic model selection strategy
  embedding_model: "text-embedding-3-small"

  # Patch Configuration
  patch_types:                 # Types of code modifications
    - "diff"                   # Diff-based patches
    - "full"                   # Full code replacement
  patch_type_probs:            # Probabilities for each patch type
    - 0.5
    - 0.5

  # Task Configuration
  language: "python"           # Programming language
  init_program_path: "???"     # Path to initial program
  task_sys_msg: "???"          # System message for LLM
  job_type: "local"            # Job execution type
  results_dir: ${output_dir}   # Results directory
```
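The pairing of `patch_types` and `patch_type_probs` can be illustrated with a small sketch. It assumes the two lists are matched by position and that the probabilities sum to 1. The helper `pick_patch_type` is hypothetical and is not part of Shinka's API.

```python
import random


def pick_patch_type(patch_types, patch_type_probs, rng=random):
    """Sample one patch type according to its configured probability (illustrative helper)."""
    if len(patch_types) != len(patch_type_probs):
        raise ValueError("patch_types and patch_type_probs must have equal length")
    if abs(sum(patch_type_probs) - 1.0) > 1e-9:
        raise ValueError("patch_type_probs must sum to 1")
    # random.choices performs weighted sampling with replacement.
    return rng.choices(patch_types, weights=patch_type_probs, k=1)[0]


# Seeded generator so repeated runs draw the same sequence.
rng = random.Random(0)
samples = [pick_patch_type(["diff", "full"], [0.5, 0.5], rng) for _ in range(1000)]
print({name: samples.count(name) for name in ("diff", "full")})
```

With equal probabilities the two counts should come out roughly balanced. Shifting weight toward `"diff"` would favor small, local edits over full rewrites.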

### 2. Database Config (`db_config`)

Manages the evolutionary database and island topology:

```yaml
db_config:
  _target_: shinka.database.DatabaseConfig
  db_path: "evolution_db.sqlite"  # SQLite database path

  # Island Configuration
  num_islands: 2                  # Number of evolutionary islands
  island_elitism: true            # Enable elite preservation per island

  # Archive Configuration
  archive_size: 20                # Size of elite solution archive
  num_archive_inspirations: 4     # Solutions drawn from archive
  num_top_k_inspirations: 2       # Solutions from current generation

  # Selection and Migration
  exploitation_ratio: 0.2         # Exploitation vs exploration balance
  elite_selection_ratio: 0.3      # Fraction of elites for selection
  migration_interval: 10          # Generations between migrations
  migration_rate: 0.1             # Fraction of population migrated
```
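To build intuition for `migration_interval` and `migration_rate`, the sketch below shows one plausible reading of the two settings. Both helper functions are hypothetical, as is the rule of moving at least one program off a non-empty island. Shinka's actual scheduling may differ.

```python
def is_migration_generation(generation, migration_interval):
    """True on every `migration_interval`-th generation, never at generation 0."""
    return generation > 0 and generation % migration_interval == 0


def num_migrants(island_population, migration_rate):
    """Programs to move off an island; at least one when the island is non-empty."""
    if island_population == 0:
        return 0
    return max(1, int(island_population * migration_rate))


print([g for g in range(21) if is_migration_generation(g, 10)])  # → [10, 20]
print(num_migrants(30, 0.1))  # → 3
```

Under this reading, a shorter interval or a higher rate mixes islands more aggressively, which trades diversity for faster spread of good solutions.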

### 3. Job Config (`job_config`)

Defines the execution environment and resource requirements:

#### Local Execution
```yaml
job_config:
  _target_: shinka.launch.LocalJobConfig
  eval_program_path: "shinka/evaluate.py"
```

#### Slurm Cluster Execution
```yaml
job_config:
  _target_: shinka.launch.SlurmCondaJobConfig
  modules:                   # Environment modules
    - "cuda/12.4"
    - "cudnn/8.9.7"
    - "hpcx/2.20"
  eval_program_path: "shinka/utils/eval_hydra.py"
  conda_env: "shinka"        # Conda environment name
  time: "01:00:00"           # Maximum job runtime
  cpus: 4                    # CPU cores per job
  gpus: 1                    # GPUs per job
  mem: "16G"                 # Memory per job
```
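As a rough illustration of what the Slurm resource fields mean, the sketch below maps them onto standard `sbatch` options. The function `slurm_flags` is hypothetical, and the mapping is an assumption about how such fields are typically used, not a description of how Shinka builds its job scripts.

```python
def slurm_flags(time, cpus, gpus, mem):
    """Translate resource fields into standard sbatch options (illustrative mapping)."""
    flags = [f"--time={time}", f"--cpus-per-task={cpus}", f"--mem={mem}"]
    if gpus > 0:
        # Request GPUs only when needed, via Slurm's generic-resource option.
        flags.append(f"--gres=gpu:{gpus}")
    return flags


print(" ".join(slurm_flags("01:00:00", 4, 1, "16G")))
# → --time=01:00:00 --cpus-per-task=4 --mem=16G --gres=gpu:1
```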

### 4. Task Config

Defines problem-specific settings and evaluation functions:

```yaml
# Task-specific evaluation function
evaluate_function:
  _target_: examples.my_task.evaluate.main
  program_path: ???  # Filled by runner
  results_dir: ???   # Filled by runner

# Job configuration for this task
distributed_job_config:
  _target_: shinka.launch.SlurmCondaJobConfig
  # ... resource requirements ...

# Evolution settings specific to this task
evo_config:
  task_sys_msg: |
    You are an expert in [domain].
    Key insights: [domain knowledge]
  language: "python"
  init_program_path: "examples/my_task/initial.py"
  job_type: "slurm_conda"

exp_name: "shinka_my_task"
```
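In Hydra and OmegaConf, `???` marks a mandatory value that must be supplied before it is read. The sketch below uses plain dictionaries rather than the OmegaConf API to list which fields are still unset. `find_missing` is a hypothetical helper written for this illustration.

```python
def find_missing(cfg, prefix=""):
    """Return dotted paths of all values still set to the '???' marker."""
    missing = []
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections, extending the dotted path.
            missing.extend(find_missing(value, prefix=f"{path}."))
        elif value == "???":
            missing.append(path)
    return missing


task_cfg = {
    "evaluate_function": {
        "_target_": "examples.my_task.evaluate.main",
        "program_path": "???",
        "results_dir": "???",
    },
    "exp_name": "shinka_my_task",
}
print(find_missing(task_cfg))
# → ['evaluate_function.program_path', 'evaluate_function.results_dir']
```

A check like this can be handy before a long run, since an unset mandatory value otherwise only fails at the moment it is accessed.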

## Configuration Parameters

### Evolution Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `num_generations` | int | 20 | Number of evolutionary generations |
| `max_parallel_jobs` | int | 1 | Maximum concurrent evaluations |
| `max_patch_attempts` | int | 10 | Maximum attempts to generate valid patches |
| `llm_models` | list | `["azure-gpt-4.1"]` | LLM models for mutations |
| `patch_types` | list | `["diff", "full"]` | Types of code modifications |
| `patch_type_probs` | list | `[0.5, 0.5]` | Probabilities for patch types |
| `language` | str | `"python"` | Programming language |
| `embedding_model` | str | `"text-embedding-3-small"` | Model for code embeddings |

### Database Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `num_islands` | int | 2 | Number of evolutionary islands |
| `archive_size` | int | 20 | Size of elite solution archive |
| `num_archive_inspirations` | int | 4 | Solutions drawn from archive |
| `num_top_k_inspirations` | int | 2 | Solutions from current generation |
| `exploitation_ratio` | float | 0.2 | Balance between exploitation/exploration |
| `elite_selection_ratio` | float | 0.3 | Fraction of elites for selection |
| `migration_interval` | int | 10 | Generations between island migrations |
| `migration_rate` | float | 0.1 | Fraction of population migrated |
| `island_elitism` | bool | true | Preserve elites per island |

### Resource Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `time` | str | `"01:00:00"` | Maximum job runtime (HH:MM:SS) |
| `cpus` | int | 4 | CPU cores per job |
| `gpus` | int | 0 | GPUs per job |
| `mem` | str | `"8G"` | Memory per job |
| `conda_env` | str | `"shinka"` | Conda environment name |
| `modules` | list | `[]` | Environment modules to load |
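Because `time` and `mem` are strings, a typo in either tends to surface only when a job is submitted. The following sketch validates both formats up front. The helper names are hypothetical, and restricting `mem` to `M` and `G` suffixes is an assumption made for this example.

```python
import re


def time_to_seconds(value):
    """Convert an HH:MM:SS string such as "01:00:00" to seconds."""
    match = re.fullmatch(r"(\d+):([0-5]\d):([0-5]\d)", value)
    if match is None:
        raise ValueError(f"expected HH:MM:SS, got {value!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def mem_to_mib(value):
    """Convert a size such as "8G" or "512M" to mebibytes."""
    match = re.fullmatch(r"(\d+)([MG])", value)
    if match is None:
        raise ValueError(f"expected <int>M or <int>G, got {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * (1024 if unit == "G" else 1)


print(time_to_seconds("01:00:00"))  # → 3600
print(mem_to_mib("8G"))             # → 8192
```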

## Pre-configured Variants

Shinka uses [Hydra](https://hydra.cc/) for flexible, hierarchical configuration management. The system is designed around composable configuration files that can be mixed and matched to create different experimental setups.

Variants provide pre-configured combinations of settings for common use cases:

### Circle Packing Example
```yaml
# configs/variant/circle_packing_example.yaml
defaults:
  - override /database@_global_: island_large
  - override /evolution@_global_: large_budget
  - override /task@_global_: circle_packing
  - override /cluster@_global_: local
  - _self_

variant_suffix: "_example"
```

### Agent Design Example
```yaml
# configs/variant/agent_design_example.yaml
defaults:
  - override /database@_global_: island_medium
  - override /evolution@_global_: medium_budget
  - override /task@_global_: agent_design
  - override /cluster@_global_: local
  - _self_

evo_config:
  num_generations: 15

variant_suffix: "_agent_example"
```
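Hydra composes the entries of a `defaults` list in order, and `_self_` marks where the current file's own keys are merged. Placing `_self_` last, as the variants above do, lets the variant's values win over the selected groups. The sketch below imitates that with a plain dictionary merge. The contents of `medium_budget` are hypothetical placeholder values, not the real file.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where `override` wins on conflicting keys."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical contents of the selected evolution group (values are illustrative).
medium_budget = {"evo_config": {"num_generations": 50, "max_parallel_jobs": 2}}
# The variant's own keys, merged last because `_self_` closes the defaults list.
variant_self = {"evo_config": {"num_generations": 15}, "variant_suffix": "_agent_example"}

print(deep_merge(medium_budget, variant_self))
# → {'evo_config': {'num_generations': 15, 'max_parallel_jobs': 2}, 'variant_suffix': '_agent_example'}
```

Only the keys the variant sets are replaced. Everything else from the selected group is kept.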

## Configuration Structure

```
configs/
├── config.yaml              # Main config file with defaults
├── cluster/                 # Execution environments
│   ├── local.yaml           # Local execution
│   ├── gcp.yaml             # Google Cloud Platform
│   └── remote.yaml          # Remote Slurm clusters
├── database/                # Evolution database settings
│   ├── island_small.yaml    # Small-scale evolution (2 islands)
│   ├── island_medium.yaml   # Medium-scale evolution (4 islands)
│   └── island_large.yaml    # Large-scale evolution (8+ islands)
├── evolution/               # Evolution parameters
│   ├── small_budget.yaml    # Few generations, quick runs
│   ├── medium_budget.yaml   # Moderate computational budget
│   └── large_budget.yaml    # Extensive evolution runs
├── task/                    # Problem definitions
│   ├── circle_packing.yaml
│   ├── agent_design.yaml
│   ├── bbo_search.yaml
│   ├── cifar10.yaml
│   ├── cuda_optim.yaml
│   ├── mad_moe.yaml
│   └── novelty_generator.yaml
└── variant/                 # Pre-configured combinations
    ├── circle_packing_example.yaml
    ├── agent_design_example.yaml
    ├── mad_moe_example.yaml
    └── default.yaml
```
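Each subdirectory in this tree is a Hydra config group, and a selection such as `database=island_large` refers to the file with that name inside the group's directory. The hypothetical helper below makes the mapping explicit, assuming the `configs/` root shown above.

```python
from pathlib import PurePosixPath


def group_option_path(group, option, root="configs"):
    """Return the file that a `group=option` selection refers to."""
    return PurePosixPath(root) / group / f"{option}.yaml"


print(group_option_path("database", "island_large"))
# → configs/database/island_large.yaml
```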

## Creating Custom Configurations

### Method 1: Custom Variant File

Create a new variant file combining existing components:

```yaml
# configs/variant/my_custom_variant.yaml
defaults:
  - override /database@_global_: island_small
  - override /evolution@_global_: small_budget
  - override /task@_global_: my_task
  - override /cluster@_global_: local
  - _self_

# Override specific parameters
evo_config:
  num_generations: 25
  max_parallel_jobs: 2

db_config:
  archive_size: 30
  migration_interval: 5

variant_suffix: "_custom"
```

Launch with:
```bash
shinka_launch variant=my_custom_variant
```

### Method 2: Command Line Overrides

Override parameters directly on the command line:

```bash
shinka_launch \
    task=circle_packing \
    database=island_large \
    evolution=medium_budget \
    cluster=local \
    evo_config.num_generations=50 \
    evo_config.max_parallel_jobs=4 \
    db_config.num_islands=6 \
    variant_suffix="_custom_run"
```
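The command mixes two kinds of arguments: group selections such as `task=circle_packing`, which pick a file from a config group, and dotted value overrides such as `evo_config.num_generations=50`, which change a single field. The sketch below shows only the second kind, applied to a plain dictionary. `apply_overrides` is a hypothetical, heavily simplified stand-in for Hydra's override grammar.

```python
def apply_overrides(cfg, overrides):
    """Apply `a.b.c=value` strings to a nested dict in place and return it."""
    for item in overrides:
        dotted_key, raw = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for name in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(name, {})
        # Only integers are converted here; everything else stays a string.
        node[leaf] = int(raw) if raw.lstrip("-").isdigit() else raw
    return cfg


cfg = {"evo_config": {"num_generations": 20}, "db_config": {"num_islands": 2}}
apply_overrides(cfg, ["evo_config.num_generations=50", "db_config.num_islands=6"])
print(cfg)
# → {'evo_config': {'num_generations': 50}, 'db_config': {'num_islands': 6}}
```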

### Method 3: Custom Task Configuration

Create a new task configuration:

```yaml
# configs/task/my_optimization_task.yaml
evaluate_function:
  _target_: examples.my_optimization.evaluate.main
  program_path: ???
  results_dir: ???

distributed_job_config:
  _target_: shinka.launch.LocalJobConfig
  eval_program_path: "shinka/utils/eval_hydra.py"

evo_config:
  task_sys_msg: |
    You are an expert optimization researcher working on [specific problem].

    Key insights to consider:
    1. [Domain-specific insight 1]
    2. [Domain-specific insight 2]
    3. [Domain-specific insight 3]

    Focus on [specific optimization goals].
  language: "python"
  init_program_path: "examples/my_optimization/initial.py"
  job_type: "local"

exp_name: "shinka_my_optimization"
```

## Advanced Configuration Patterns

### Multi-Model Evolution

Use multiple LLMs with different strengths:

```yaml
evo_config:
  llm_models:
    - "azure-gpt-4.1"      # Strong reasoning
    - "claude-3-sonnet"    # Good at code
    - "azure-gpt-4o-mini"  # Fast iterations

  # Optional: Dynamic model selection
  llm_dynamic_selection:
    strategy: "performance_based"
    window_size: 10
```
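The guide does not spell out how a `performance_based` strategy with a `window_size` behaves, so the sketch below is only one plausible interpretation: track each model's most recent scores and prefer the model with the best windowed mean. The class `WindowedModelSelector` and the score values are hypothetical and do not reflect Shinka's implementation.

```python
from collections import defaultdict, deque


class WindowedModelSelector:
    """Pick the model with the best mean score over its last `window_size` results."""

    def __init__(self, models, window_size=10):
        self.models = list(models)
        # Each model keeps a bounded history; old scores fall out automatically.
        self.history = defaultdict(lambda: deque(maxlen=window_size))

    def record(self, model, score):
        self.history[model].append(score)

    def choose(self):
        # Try every model once before comparing averages.
        for model in self.models:
            if not self.history[model]:
                return model
        return max(self.models, key=lambda m: sum(self.history[m]) / len(self.history[m]))


selector = WindowedModelSelector(["azure-gpt-4.1", "azure-gpt-4o-mini"], window_size=10)
selector.record("azure-gpt-4.1", 0.62)
selector.record("azure-gpt-4o-mini", 0.48)
print(selector.choose())  # → azure-gpt-4.1
```

This version is purely greedy. A practical selector would likely keep some randomness so that a model with a few poor early results still gets retried.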

## Configuration Examples

### Quick Prototyping Setup
```yaml
# Fast iteration for development
defaults:
  - override /database@_global_: island_small
  - override /evolution@_global_: small_budget
  - override /cluster@_global_: local

evo_config:
  num_generations: 5
  max_parallel_jobs: 1

db_config:
  num_islands: 1
  archive_size: 10

variant_suffix: "_prototype"
```

### Production Research Setup
```yaml
# Large-scale research experiment
defaults:
  - override /database@_global_: island_large
  - override /evolution@_global_: large_budget
  - override /cluster@_global_: remote

evo_config:
  num_generations: 100
  max_parallel_jobs: 8

db_config:
  num_islands: 8
  archive_size: 50
  migration_interval: 5

variant_suffix: "_production"
```

### Multi-Task Comparison
```yaml
# Configuration for comparing across tasks
defaults:
  - override /database@_global_: island_medium
  - override /evolution@_global_: medium_budget
  - override /cluster@_global_: local

# Standardized parameters for fair comparison
evo_config:
  num_generations: 30
  max_parallel_jobs: 2
  llm_models: ["azure-gpt-4.1"]

db_config:
  num_islands: 4
  archive_size: 25

variant_suffix: "_comparison"
```

## Configuration Best Practices

### 1. Start Small, Scale Up
- Begin with `island_small` and `small_budget` configurations
- Increase complexity as you understand the problem better

### 2. Use Meaningful Variant Suffixes
- Include key parameters in the suffix: `_gen50_islands4_gpt4`
- This helps identify experiments in results directories
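A suffix in that style can be generated rather than typed by hand, which keeps names consistent across runs. The helper below is a hypothetical sketch, and the choice of fields is an assumption based on the example suffix above.

```python
def build_variant_suffix(num_generations, num_islands, model_tag):
    """Compose a suffix such as `_gen50_islands4_gpt4` from key parameters."""
    return f"_gen{num_generations}_islands{num_islands}_{model_tag}"


print(build_variant_suffix(50, 4, "gpt4"))  # → _gen50_islands4_gpt4
```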

### 3. Document Custom Configurations
- Add comments explaining parameter choices
- Include expected runtime and resource usage

### 4. Version Control Configurations
- Keep variant files in version control
- Tag configurations used for important results

### 5. Monitor Resource Usage
- Start with conservative resource allocations
- Monitor actual usage and adjust accordingly

For more examples and detailed parameter explanations, see the configuration files in the `configs/` directory and the [Getting Started Guide](getting_started.md).