# Submission Checklist – AuditRepairEnv++

Deadline: [Your hackathon date]
Status: Pre-submission validation
## Pre-Submission Technical Validation
### Phase 1: Local Validation ✅

Before pushing to GitHub, verify locally:

```bash
# 1. Test inference script
export HF_TOKEN="hf_your_test_token"
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export ENV_BASE_URL="http://localhost:7860"

# Start server in one terminal
python server.py

# In another terminal, test inference
python inference.py
```
Check:

- ✅ No import errors
- ✅ `[START]` printed
- ✅ `[STEP]` printed per step
- ✅ `[END]` printed at end
- ✅ Rewards formatted to 2 decimals
- ✅ Correct step count
### Phase 2: Docker Validation ✅

```bash
# Build Docker image
docker build -t audit-repair-env:latest .

# Run container
docker run -p 7860:7860 \
  -e HF_TOKEN="hf_your_token" \
  -e API_BASE_URL="https://router.huggingface.co/v1" \
  -e MODEL_NAME="Qwen/Qwen2.5-72B-Instruct" \
  audit-repair-env:latest

# Test in new terminal
curl -X POST http://localhost:7860/reset \
  -d '{"task_id":"easy"}' \
  -H "Content-Type: application/json"
```
Check:

- ✅ Docker builds without errors
- ✅ Container starts
- ✅ `/reset` endpoint responds
- ✅ Logs visible in container output
### Phase 3: File Structure ✅

```
project-root/
├── inference.py          ← MUST be at root (not a subfolder)
├── requirements.txt      ← All dependencies listed
├── README.md             ← Clear setup + usage
├── demo.py               ← Gradio interface
├── Dockerfile            ← Present & valid
├── server.py             ← Environment server
├── tasks.py              ← Task definitions
├── HF_SPACES_GUIDE.md    ← Deployment guide
├── PITCH.md              ← Project pitch
└── [other supporting files]
```
Check:

- ✅ `inference.py` is at the project root (not `src/` or `app/`)
- ✅ No `.py` files in subfolders are named `inference.py`
- ✅ All files committed to git
- ✅ `.gitignore` excludes secrets/tokens
### Phase 4: inference.py Validation ✅

Environment variables:

- ✅ Reads `HF_TOKEN` via `os.getenv("HF_TOKEN")`
- ✅ Validates `HF_TOKEN` and raises an error if missing
- ✅ Reads `API_BASE_URL` with default `"https://router.huggingface.co/v1"`
- ✅ Reads `MODEL_NAME` with default `"Qwen/Qwen2.5-72B-Instruct"`
- ✅ Raises `ValueError` if `API_KEY`/`HF_TOKEN` is empty
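A minimal sketch of that configuration block (the `load_config` function name is illustrative, not a required interface):

```python
import os

DEFAULT_BASE_URL = "https://router.huggingface.co/v1"
DEFAULT_MODEL = "Qwen/Qwen2.5-72B-Instruct"

def load_config() -> tuple[str, str, str]:
    """Read credentials and endpoints from the environment, failing fast on a missing token."""
    api_key = os.getenv("HF_TOKEN", "")
    if not api_key:
        raise ValueError("HF_TOKEN is not set; export it before running inference.py")
    base_url = os.getenv("API_BASE_URL", DEFAULT_BASE_URL)
    model = os.getenv("MODEL_NAME", DEFAULT_MODEL)
    return api_key, base_url, model
```

Failing fast here means a missing token surfaces as a clear `ValueError` at startup instead of an opaque 401 mid-run.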
OpenAI client:

- ✅ Uses `from openai import OpenAI`
- ✅ Creates the client: `OpenAI(base_url=API_BASE_URL, api_key=API_KEY)`
- ✅ No raw `urllib` calls for the LLM
- ✅ No alternate HTTP clients (`requests`, `httpx`, etc.)
Output format:

- ✅ Prints `[START]` at the beginning
- ✅ Prints `[START]\nTask: <task>`
- ✅ Prints `[STEP]` after each action
- ✅ Prints `[STEP]\nAction: <action>\nReward: <value>`
- ✅ Rewards formatted to 2 decimals: `{reward:.2f}`
- ✅ Booleans as lowercase `true`/`false` (not `True`/`False`)
- ✅ Prints `[END]` after `env.close()` or on exception
- ✅ Prints `[END]\nFinal Score: <score>`
- ✅ Step count matches actual steps executed
Example valid output:

```
[START]
Task: easy
[STEP]
Action: FIX_ENTRY 1
Reward: 0.10
[STEP]
Action: FIX_ENTRY 3
Reward: 0.15
[STEP]
Action: NO_OP
Reward: 0.00
[END]
Final Score: 0.85
```
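The formatting rules are easiest to enforce by routing every print through two tiny helpers; a sketch (helper names are illustrative):

```python
def fmt_reward(reward: float) -> str:
    """Render rewards with exactly two decimal places."""
    return f"{reward:.2f}"

def fmt_bool(flag: bool) -> str:
    """Render booleans as lowercase true/false, JSON-style."""
    return "true" if flag else "false"

def print_step(action: str, reward: float) -> None:
    """Emit one [STEP] block in the required format."""
    print("[STEP]")
    print(f"Action: {action}")
    print(f"Reward: {fmt_reward(reward)}")
```

With these, `fmt_reward(0.1)` yields `"0.10"` and `fmt_bool(True)` yields `"true"`, matching the checklist.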
### Phase 5: requirements.txt ✅

```bash
pip install -r requirements.txt
```
Check:

- ✅ No syntax errors
- ✅ Contains `openai>=1.30.0` (OpenAI client)
- ✅ Contains `fastapi>=0.111.0` (server)
- ✅ Contains `pydantic>=2.7.0` (models)
- ✅ Contains `uvicorn[standard]>=0.29.0` (serving)
- ✅ Contains `gradio>=4.0.0` (demo)
- ✅ No unnecessary packages (keep it lean)
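Put together, a minimal requirements.txt satisfying the checks above might look like this (version floors taken from the checklist; add only what your code actually imports):

```
openai>=1.30.0
fastapi>=0.111.0
pydantic>=2.7.0
uvicorn[standard]>=0.29.0
gradio>=4.0.0
```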
### Phase 6: README.md ✅

Required sections:

- ✅ Title: "AuditRepairEnv++"
- ✅ Problem description (what problem does it solve?)
- ✅ Solution overview (how does it work?)
- ✅ Task explanation (easy/medium/hard)
- ✅ Setup instructions (local, Docker)
- ✅ How to run `inference.py`
- ✅ Baseline results / example output
- ✅ HF Spaces deployment steps
- ✅ Troubleshooting section
- ✅ License (MIT)

Writing checklist:

- ✅ Clear and concise
- ✅ Code examples work
- ✅ Commands are tested
- ✅ No broken links
### Phase 7: demo.py Validation ✅

```bash
export HF_TOKEN="hf_your_token"
python demo.py
```

Check:

- ✅ Gradio interface loads
- ✅ Accessible at `http://localhost:7860`
- ✅ Task dropdown selects easy/medium/hard
- ✅ "Run Inference" button works
- ✅ Output displays in the textbox
- ✅ Dark/minimal aesthetic visible
- ✅ No JavaScript errors in the browser console
### Phase 8: Dockerfile ✅

Valid Dockerfile structure:

```dockerfile
FROM python:3.10-slim                  # ✅ base image specified
WORKDIR /app                           # ✅ working directory set
COPY . .                               # ✅ copy code
RUN pip install -r requirements.txt    # ✅ install deps
EXPOSE 7860                            # ✅ expose Gradio port
CMD ["python", "demo.py"]              # ✅ entry point
```
Check:

- ✅ Base image specified (e.g., `python:3.10-slim`)
- ✅ Working directory set
- ✅ Dependencies installed with `pip install`
- ✅ Port 7860 exposed
- ✅ Entry `CMD` specified
- ✅ No hardcoded tokens/secrets
- ✅ `.dockerignore` excludes unnecessary files
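A sample .dockerignore covering the usual offenders (a suggestion, not a required file list; adjust to your repo):

```
.git
.env
__pycache__/
*.pyc
*.key
secrets/
.venv/
```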
## GitHub Repository

### Phase 1: Repository Setup ✅

```bash
git init
git add .
git commit -m "Initial commit"
git remote add origin https://github.com/YOUR_USERNAME/audit-repair-env.git
git push -u origin main
```
Check:

- ✅ Repository is PUBLIC
- ✅ All code is committed
- ✅ `.gitignore` includes `.env`, `*.key`, `secrets/`
- ✅ No API keys in git history
- ✅ README visible on repo homepage
- ✅ Dockerfile present
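A matching .gitignore sketch for the secret-hygiene checks above (illustrative; extend as needed):

```
.env
*.key
secrets/
__pycache__/
*.pyc
.venv/
```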
### Phase 2: Repository Contents ✅

- ✅ inference.py
- ✅ server.py
- ✅ tasks.py
- ✅ demo.py
- ✅ requirements.txt
- ✅ Dockerfile
- ✅ README.md
- ✅ HF_SPACES_GUIDE.md
- ✅ PITCH.md
- ✅ .gitignore
- ✅ LICENSE (MIT)

Check:

- ✅ 10+ commits (show development history)
- ✅ No personal info in commits
- ✅ Meaningful commit messages
## Hugging Face Spaces Deployment

### Phase 1: Spaces Creation ✅

Fill in:

- Owner: your HF username
- Space name: `audit-repair-env`
- License: MIT
- SDK: Docker ← IMPORTANT

Click "Create Space".

Check:

- ✅ Space is created
- ✅ Space is PUBLIC
- ✅ URL format: `https://huggingface.co/spaces/your-username/audit-repair-env`
### Phase 2: GitHub Integration ✅

In Space Settings:

- Scroll to "Linked Repository"
- Click "Link a repository"
- Select `your-username/audit-repair-env`
- Choose "Sync" mode (auto-rebuild on push)

Check:

- ✅ GitHub repo linked
- ✅ Sync enabled
- ✅ Branch: `main`
### Phase 3: Environment Secrets ✅

In Space Settings → Repository secrets:

```
HF_TOKEN = hf_actual_valid_token_here
API_BASE_URL = https://router.huggingface.co/v1
MODEL_NAME = Qwen/Qwen2.5-72B-Instruct
```

Check:

- ✅ HF_TOKEN is valid and has API permissions
- ✅ Secrets are NOT visible in logs
- ✅ Each secret is a separate entry
### Phase 4: Build & Deploy ✅

- Go to the Space
- Click the "Logs" tab
- Wait 5-10 minutes for the build
- Status changes from "Building" → "Running"

Check:

- ✅ Build succeeds (no errors in logs)
- ✅ Status is "Running"
- ✅ No warning signs in the logs:
  - ❌ `ImportError`
  - ❌ `ModuleNotFoundError`
  - ❌ `HF_TOKEN not set`
  - ❌ `Connection refused`
### Phase 5: Test the Space ✅

- Click the "App" link in the Space
- You should see the Gradio interface
- Try it:
  - Select the "easy" task
  - Click "Run Inference"
  - Wait for results

Check:

- ✅ Gradio interface loads
- ✅ No 502/504 errors
- ✅ Inference completes (5-30 s depending on the model)
- ✅ Output displays correctly
- ✅ Dark aesthetic visible
### Phase 6: Share Link ✅

Your Space's public URL:

```
https://huggingface.co/spaces/your-username/audit-repair-env
```

Check:

- ✅ URL is accessible
- ✅ Anyone can view (no login required)
- ✅ App runs without errors
## Submission Content

### README Content Checklist

✅ **Title & Description**

```markdown
# AuditRepairEnv++

Budget-constrained RL for financial ledger repair
```

✅ **Problem Statement**

- Why does this matter?
- What real-world problem does it solve?

✅ **Solution Overview**

- What is AuditRepairEnv++?
- How does it work?

✅ **Technical Details**

- Observation space (JSON format)
- Action space (FIX_ENTRY, ADJUST_ENTRY, etc.)
- Reward function (how scoring works)

✅ **Tasks**

- Easy (5-8 entries)
- Medium (15-20 entries)
- Hard (30+ entries, hidden dependencies)

✅ **Setup Instructions**

```bash
pip install -r requirements.txt
export HF_TOKEN="hf_..."
python inference.py
```

✅ **Results / Baseline**

| Task | Score |
|---|---|
| easy | 0.90 |
| medium | 0.70 |
| hard | 0.55 |

✅ **Deployment**

- Local: `python inference.py`
- Docker: `docker build . && docker run ...`
- HF Spaces: [link to Space]

✅ **License**: MIT
### Pitch Content Checklist

- ✅ 30-second pitch (problem + solution + impact)
- ✅ 2-minute pitch (structured narrative)
- ✅ Technical pitch (for engineers/judges)
- ✅ Key metrics (success rate, efficiency, etc.)
- ✅ Real-world application (why it matters)
- ✅ Comparison (vs. other benchmarks/solutions)
- ✅ Demo script (how to show it off)
## Final Quality Checks

### Code Quality

- ✅ No syntax errors
- ✅ Follows PEP 8 (mostly)
- ✅ Comments explain non-obvious logic
- ✅ Error handling (try/except around network calls)
- ✅ No hardcoded secrets/tokens
- ✅ All imports are used

### Documentation Quality

- ✅ Clear and concise
- ✅ Code examples are tested
- ✅ Instructions are step-by-step
- ✅ Troubleshooting section included
- ✅ No typos or grammar errors
- ✅ No broken links

### User Experience

- ✅ Gradio interface is intuitive
- ✅ Dark theme is applied
- ✅ Output is readable
- ✅ Error messages are helpful
- ✅ Demo runs quickly (<30 s)

### Submission Completeness

- ✅ All required files present
- ✅ GitHub repo is public
- ✅ HF Spaces is running
- ✅ README is comprehensive
- ✅ Pitch is compelling
- ✅ No sensitive data exposed
## Submission Checklist (Final)

Before you submit to the hackathon:

### Day Before Deadline

- Code: all local tests pass
- GitHub: all code pushed and repo is public
- HF Spaces: build is complete and the Space is running
- README: updated with all required sections
- Pitch: prepared and rehearsed
- Demo: works end-to-end without errors

### Day Of Deadline

**Verify Links**

- GitHub URL works: `https://github.com/your-username/audit-repair-env`
- HF Spaces URL works: `https://huggingface.co/spaces/your-username/audit-repair-env`
- Both are public/accessible

**Test One More Time**

- Inference script runs: `python inference.py`
- Docker builds: `docker build .`
- Demo loads in the browser
- Output format is correct

**Prepare Presentation**

- Pitch slides ready
- Demo script prepared (which tasks to show)
- Metrics/results visible
- Story arc is clear

**Submit**

- GitHub URL submitted
- HF Spaces URL submitted
- README linked
- Team members credited
- All deadlines met
## Red Flags (🚩 Don't Do These)

### ❌ File Structure

- `src/inference.py` → must be at root!
- `app/inference.py` → must be at root!
- Multiple `inference.py` files → keep only one, at the root

### ❌ Missing Validation

- HF_TOKEN not validated
- Missing default values
- Using `openai` without listing it in requirements.txt

### ❌ Output Format

- Missing `[START]`, `[STEP]`, or `[END]`
- Rewards not formatted to 2 decimals
- Booleans as `True`/`False` instead of `true`/`false`
- Step count doesn't match

### ❌ Deployment

- HF Spaces build fails (errors in the Logs tab)
- Space is private
- HF_TOKEN hardcoded in the Dockerfile
- Port is not 7860

### ❌ Documentation

- No README
- Unclear pitch
- No setup instructions
- Broken links
## Success Criteria

### ✅ Technical

- `inference.py` at root validates and runs
- Output format is exactly correct
- HF_TOKEN validation works
- Docker builds successfully

### ✅ Documentation

- README explains the problem & solution
- Setup instructions are clear
- Pitch is compelling

### ✅ Deployment

- GitHub repo is public
- HF Spaces is running and accessible
- Demo works end-to-end

### ✅ Quality

- Code has no obvious bugs
- Output is readable
- Instructions work (ideally tested by someone else)
## Resources

- README.md – environment documentation
- PITCH.md – how to pitch the project
- HF_SPACES_GUIDE.md – detailed deployment guide
- inference.py – submission script
- GitHub – where to host code
- Hugging Face Spaces – where to deploy

## Contact / Support

- Questions: check HF_SPACES_GUIDE.md for troubleshooting
- Issues: file bug reports on GitHub
- Feedback: help improve the environment!
Last updated: April 2025
Status: Ready for submission ✅

📋 Print this checklist and check off items as you go!