
🧠 OpenEnv Hackathon: Judging & Expectations Guide

🚨 TL;DR (What You Actually Need to Do)

Build an environment where an LLM can train and measurably improve at something meaningful, then:

  • Show actual training
  • Provide evidence (metrics, reward curves, comparisons)
  • Tell a clear, compelling story

A messy but ambitious project with real training evidence beats a polished but shallow one.


βš–οΈ Judging Criteria (Core Evaluation)

1. 🌟 Environment Innovation: 40%

  • Is your environment novel, creative, or challenging?
  • Does it meaningfully test agent behavior?
  • Avoid overused ideas (grid worlds, chess clones, etc.)

2. 🎀 Storytelling & Presentation: 30%

  • Clearly explain:
    • The problem
    • The environment
    • What the agent learned
  • Demo should be engaging and easy to follow

3. 📈 Showing Improvement in Rewards: 20%

  • Must prove learning happened
  • Evidence:
    • Reward curves
    • Before vs after behavior
    • Baseline comparisons
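One way to turn raw episode rewards into the evidence listed above is to smooth the reward series and compare mean reward before and after training. The sketch below is a minimal illustration, not an official scoring method: the function names `moving_average` and `improvement_summary` are hypothetical, and the numbers are placeholders rather than real results.

```python
from statistics import mean


def moving_average(values, window):
    """Smooth a reward series with a trailing window (shorter at the start)."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def improvement_summary(baseline_rewards, trained_rewards):
    """Compare mean episode reward of an untrained baseline and a trained agent."""
    baseline_mean = mean(baseline_rewards)
    trained_mean = mean(trained_rewards)
    return {
        "baseline_mean": baseline_mean,
        "trained_mean": trained_mean,
        "absolute_gain": trained_mean - baseline_mean,
    }


# Placeholder numbers for illustration only, not real results.
baseline = [0.0, 1.0, 0.0, 1.0]
trained = [2.0, 3.0, 2.0, 3.0]
summary = improvement_summary(baseline, trained)
curve = moving_average(trained, window=2)
```

Reporting both the smoothed curve and the baseline comparison helps, because a rising curve alone does not show how far the agent moved from its starting behavior.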

4. βš™οΈ Reward & Training Pipeline β€” 10%

  • Reward logic should be coherent and hard to exploit
  • Training should improve agent behavior

📦 Minimum Submission Requirements

  • Use OpenEnv (latest release)
  • Provide a working training script (Unsloth or HuggingFace TRL)
  • Show training evidence (loss + reward plots)
  • Submit one of the following:
    • A mini-blog
    • A video under 2 minutes
    • Slides
  • Host on Hugging Face Spaces
  • Provide a README with problem, environment, results, links

Rules:

  • One submission per team
  • Submit environment URL
  • No changes after deadline

🧪 What Judges Look For

🔬 Real Training

  • Training must run against your environment
  • Show learning with plots, metrics, comparisons

🧠 Reward Design

  • Dense and informative rewards
  • Hard to game
  • Avoid simple binary rewards
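To make the contrast with a binary pass/fail signal concrete, here is a minimal sketch of a shaped reward. Everything in it is an assumption for illustration: the inputs (`completed_checks`, `repeated_actions`, and so on), the 0.8/0.2 weights, and the 0.1 penalty are hypothetical and would need tuning for a real environment.

```python
def dense_reward(completed_checks, total_checks, steps_taken, max_steps,
                 repeated_actions):
    """Hypothetical shaped reward in [0, 1] with partial credit."""
    if total_checks <= 0 or max_steps <= 0:
        raise ValueError("total_checks and max_steps must be positive")

    # Partial credit: fraction of task checks the agent has satisfied.
    progress = completed_checks / total_checks

    # Efficiency bonus: fewer steps is better, capped at max_steps.
    efficiency = 1.0 - min(steps_taken, max_steps) / max_steps

    # The bonus is scaled by progress, so doing nothing quickly earns nothing.
    raw = progress * (0.8 + 0.2 * efficiency)

    # Penalize spamming the same action, a common way to game a reward.
    raw -= 0.1 * repeated_actions

    # Clip so the reward stays in a fixed, comparable range.
    return max(0.0, min(1.0, raw))
```

Scaling the efficiency bonus by progress is one design choice among several. It closes the loophole where an agent collects reward by ending the episode immediately without solving anything.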

🚀 Ambitious Problems

  • Solve something LLMs struggle with
  • Prefer underexplored domains

📊 Clear Results

  • Label axes properly
  • Save plots as images
  • Show comparisons clearly
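The three points above can be sketched with matplotlib, assuming it is installed. The helper name `save_reward_plot`, the output file name, and the reward values are placeholders for illustration, not real results.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, so this also runs without a display
import matplotlib.pyplot as plt


def save_reward_plot(baseline, trained, path):
    """Plot baseline and trained rewards on shared, labeled axes and save a PNG."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(range(1, len(baseline) + 1), baseline, label="Baseline")
    ax.plot(range(1, len(trained) + 1), trained, label="Trained")
    ax.set_xlabel("Episode")
    ax.set_ylabel("Mean episode reward")
    ax.set_title("Reward per episode: baseline vs. trained")
    ax.legend()
    fig.tight_layout()
    fig.savefig(path, dpi=150)  # an image file can be embedded in the README
    plt.close(fig)
    return path


# Placeholder values for illustration only, not real results.
plot_path = save_reward_plot(
    [0.1, 0.2, 0.1, 0.2],
    [0.2, 0.4, 0.6, 0.7],
    os.path.join(tempfile.gettempdir(), "reward_curve.png"),
)
```

Putting both curves on the same axes makes the comparison visible at a glance, which is usually easier to read than two separate figures.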

📖 Tell a Story

Your README should answer:

  1. Problem
  2. Environment
  3. Results
  4. Why it matters

🧹 Clean Engineering

  • Use OpenEnv properly
  • Follow the Gym-style API (reset, step, state)
  • Maintain clean architecture
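As a rough picture of the reset/step/state shape, here is a toy environment in plain Python. It deliberately does not use OpenEnv base classes, whose exact interfaces should be taken from the OpenEnv release you build on. `CounterEnv`, `CounterState`, and the simplified three-value return from `step` are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class CounterState:
    """Hypothetical episode state, kept separate from the observation."""
    target: int = 3
    value: int = 0
    step_count: int = 0
    done: bool = False


class CounterEnv:
    """Toy environment: reach `target` by sending 'inc' actions."""

    def __init__(self, target=3, max_steps=10):
        self.target = target
        self.max_steps = max_steps
        self._state = CounterState(target=target)

    def reset(self):
        """Start a new episode and return the first observation."""
        self._state = CounterState(target=self.target)
        return self._observation()

    def step(self, action):
        """Apply one action; return (observation, reward, done)."""
        s = self._state
        if s.done:
            raise RuntimeError("Episode finished; call reset() first")
        s.step_count += 1
        reward = 0.0
        if action == "inc" and s.value < s.target:
            s.value += 1
            reward = 1.0 / s.target  # partial credit for each useful step
        s.done = s.value == s.target or s.step_count >= self.max_steps
        return self._observation(), reward, s.done

    @property
    def state(self):
        """Expose episode metadata for logging and debugging."""
        return self._state

    def _observation(self):
        return {"value": self._state.value, "target": self._state.target}
```

Keeping the observation (what the agent sees) apart from the state (what the environment tracks) makes it easier to log episodes without leaking extra information to the agent.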

🧭 Problem Selection Guidelines

  • Reuse your Round 1 idea only if it fits the themes
  • Build the environment and reward model early
  • Check that the project aligns with the judging criteria

🏁 Final Advice

  • Be ambitious
  • Show real learning
  • Communicate clearly

Judges want projects that push the frontier of LLM training.