# 🧠 OpenEnv Hackathon: Judging & Expectations Guide
## 🚨 TL;DR (What You Actually Need to Do)
Build an environment where an LLM can **train and measurably improve at something meaningful**, then:
- Show **actual training**
- Provide **evidence (metrics, reward curves, comparisons)**
- Tell a **clear, compelling story**
A messy but ambitious project with real training evidence beats a polished but shallow one.
---
# ⚖️ Judging Criteria (Core Evaluation)
## 1. 🌟 Environment Innovation (40%)
- Is your environment **novel, creative, or challenging**?
- Does it **meaningfully test agent behavior**?
- Avoid overused ideas (grid worlds, chess clones, etc.)
## 2. 🎤 Storytelling & Presentation (30%)
- Clearly explain:
  - The problem
  - The environment
  - What the agent learned
- Demo should be engaging and easy to follow
## 3. 📈 Showing Improvement in Rewards (20%)
- Must prove learning happened
- Evidence:
  - Reward curves
  - Before vs after behavior
  - Baseline comparisons
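
One way to back up a before-vs-after claim is to smooth the per-episode rewards and report the mean gain over a baseline. The sketch below is only an illustration: `moving_average`, `improvement_summary`, and the reward numbers are hypothetical and not part of OpenEnv or any training library.

```python
from statistics import mean


def moving_average(values, window):
    """Smooth a reward series with a trailing moving average."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        # Average over at most `window` most recent values, including index i.
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def improvement_summary(baseline_rewards, trained_rewards):
    """Compare mean episode reward before and after training."""
    baseline = mean(baseline_rewards)
    trained = mean(trained_rewards)
    return {
        "baseline_mean": baseline,
        "trained_mean": trained,
        "absolute_gain": trained - baseline,
    }


# Hypothetical numbers, for illustration only.
baseline_rewards = [0.1, 0.2, 0.1, 0.2]
trained_rewards = [0.6, 0.7, 0.8, 0.7]
summary = improvement_summary(baseline_rewards, trained_rewards)
```

The smoothed series is what you would typically plot as the reward curve, while the summary dictionary gives a single baseline comparison to quote in the README.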
## 4. ⚙️ Reward & Training Pipeline (10%)
- Reward logic should be coherent and hard to exploit
- Training should improve agent behavior
---
# 📦 Minimum Submission Requirements
- Use **OpenEnv (latest release)**
- Provide a **working training script** (Unsloth or HuggingFace TRL)
- Show **training evidence** (loss + reward plots)
- Submit one of:
  - A mini-blog
  - A video under 2 minutes
  - Slides
- Host on **Hugging Face Spaces**
- Provide a **README with problem, environment, results, links**
### Rules:
- One submission per team
- Submit environment URL
- No changes after deadline
---
# 🧪 What Judges Look For
## 🔬 Real Training
- Training must run against your environment
- Show learning with plots, metrics, comparisons
## 🧠 Reward Design
- Dense and informative rewards
- Hard to game
- Avoid simple binary rewards
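
As a rough illustration of the difference, the sketch below contrasts a binary reward with a denser one that gives partial credit for progress and penalizes invalid actions. The function names, the sub-goal framing, and the weights (0.5, 0.1) are assumptions made for this example, not values prescribed by the hackathon.

```python
def binary_reward(solved: bool) -> float:
    """Sparse signal: the agent learns nothing until it fully succeeds."""
    return 1.0 if solved else 0.0


def dense_reward(
    subgoals_done: int,
    subgoals_total: int,
    invalid_actions: int,
    solved: bool,
) -> float:
    """Shaped signal: partial credit for progress, penalty for invalid actions."""
    if subgoals_total <= 0:
        raise ValueError("subgoals_total must be positive")
    progress = subgoals_done / subgoals_total
    reward = 0.5 * progress            # partial credit for completed sub-goals
    reward -= 0.1 * invalid_actions    # discourage spamming malformed actions
    if solved:
        reward += 0.5                  # bonus only for actually finishing
    # Clamp so a single episode cannot dominate the training signal.
    return max(-1.0, min(1.0, reward))
```

Capping the progress term at half of the total reward is one way to make the signal harder to game: an agent cannot reach the maximum by collecting sub-goal credit alone without solving the task.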
## 🚀 Ambitious Problems
- Solve something LLMs struggle with
- Prefer underexplored domains
## 📊 Clear Results
- Label axes properly
- Save plots as images
- Show comparisons clearly
## 📖 Tell a Story
Your README should answer:
1. Problem
2. Environment
3. Results
4. Why it matters
## 🧹 Clean Engineering
- Use OpenEnv properly
- Follow the Gym-style API (`reset`, `step`, `state`)
- Maintain clean architecture
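
For orientation, here is a minimal plain-Python sketch of the `reset` / `step` / `state` shape. It deliberately does not use OpenEnv's own base classes or types: `CounterEnv` and `StepResult` are hypothetical names, and a real submission should follow the interfaces in the OpenEnv release it targets.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str
    reward: float
    done: bool


class CounterEnv:
    """Toy environment: reach `target` by sending 'inc' or 'dec' actions."""

    def __init__(self, target: int = 3, max_steps: int = 10):
        self.target = target
        self.max_steps = max_steps
        self.value = 0
        self.steps = 0

    def reset(self) -> str:
        """Start a new episode and return the first observation."""
        self.value = 0
        self.steps = 0
        return self._observe()

    def step(self, action: str) -> StepResult:
        """Apply one action and return observation, reward, and done flag."""
        before = abs(self.target - self.value)
        if action == "inc":
            self.value += 1
        elif action == "dec":
            self.value -= 1
        else:
            raise ValueError(f"unknown action: {action!r}")
        self.steps += 1
        after = abs(self.target - self.value)
        reward = float(before - after)  # +1 when closer to target, -1 when farther
        done = self.value == self.target or self.steps >= self.max_steps
        return StepResult(self._observe(), reward, done)

    def state(self) -> dict:
        """Expose episode metadata without advancing the episode."""
        return {"value": self.value, "steps": self.steps, "target": self.target}

    def _observe(self) -> str:
        return f"value={self.value} target={self.target}"
```

Keeping `state` read-only and all mutation inside `reset` and `step` makes episodes easy to replay and debug. The toy task is only there to show the interface; it is exactly the kind of trivial environment the judging criteria advise against submitting.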
---
# 🧭 Problem Selection Guidelines
- Reuse your Round 1 idea only if it fits the hackathon themes
- Build environment + reward model early
- Ensure alignment with judging criteria
---
# 🏁 Final Advice
- Be ambitious
- Show real learning
- Communicate clearly
Judges want projects that push the frontier of LLM training.