# Submission Checklist: AuditRepairEnv++

**Deadline**: [Your hackathon date]

**Status**: Pre-submission validation

---

## Pre-Submission Technical Validation

### Phase 1: Local Validation ✅

Before pushing to GitHub, verify locally:

```bash
# 1. Test inference script
export HF_TOKEN="hf_your_test_token"
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export ENV_BASE_URL="http://localhost:7860"
# Start server in one terminal
python server.py
# In another terminal (with the same exports), test inference
python inference.py
```

**Check**:
- ✅ No import errors
- ✅ `[START]` printed
- ✅ `[STEP]` printed per step
- ✅ `[END]` printed at end
- ✅ Rewards formatted to 2 decimals
- ✅ Correct step count

### Phase 2: Docker Validation ✅

```bash
# Build Docker image
docker build -t audit-repair-env:latest .
# Run container
docker run -p 7860:7860 \
  -e HF_TOKEN="hf_your_token" \
  -e API_BASE_URL="https://router.huggingface.co/v1" \
  -e MODEL_NAME="Qwen/Qwen2.5-72B-Instruct" \
  audit-repair-env:latest
# Test in new terminal
curl -X POST http://localhost:7860/reset \
  -d '{"task_id":"easy"}' \
  -H "Content-Type: application/json"
```

**Check**:
- ✅ Docker builds without errors
- ✅ Container starts
- ✅ `/reset` endpoint responds
- ✅ Logs visible in container output

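
The `curl` smoke test above can also be scripted. The following is a minimal sketch using only the Python standard library; it assumes the `/reset` endpoint accepts the JSON body shown in the `curl` command. The helper names `build_reset_request` and `post_reset` are hypothetical and not part of the project. It is meant for checking the environment server only, not for LLM calls.

```python
import json
import urllib.request


def build_reset_request(base_url: str, task_id: str) -> urllib.request.Request:
    """Build the POST /reset request (hypothetical helper, mirrors the curl call)."""
    payload = json.dumps({"task_id": task_id}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/reset",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_reset(base_url: str, task_id: str, timeout: float = 10.0) -> tuple[int, str]:
    """Send the request and return (HTTP status, response body as text)."""
    request = build_reset_request(base_url, task_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

With the container running, `post_reset("http://localhost:7860", "easy")` should return a 200 status if the endpoint responds as the checklist expects.
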
### Phase 3: File Structure ✅

```
project-root/
├── inference.py          ← MUST be at root (not subfolder)
├── requirements.txt      ← All dependencies listed
├── README.md             ← Clear setup + usage
├── demo.py               ← Gradio interface
├── Dockerfile            ← Present & valid
├── server.py             ← Environment server
├── tasks.py              ← Task definitions
├── HF_SPACES_GUIDE.md    ← Deployment guide
├── PITCH.md              ← Project pitch
└── [other supporting files]
```

**Check**:
- ✅ `inference.py` is at project root (not `src/` or `app/`)
- ✅ No other file named `inference.py` exists in any subfolder
- ✅ All files committed to git
- ✅ `.gitignore` excludes secrets/tokens

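
The first two checks above can be automated. Here is a small sketch, assuming the project layout shown in the tree; the function names are hypothetical. Note that it walks every subfolder, so a virtual environment or vendored directory containing an `inference.py` would also be reported.

```python
from pathlib import Path


def find_inference_scripts(root: Path) -> list[Path]:
    """Return every inference.py under root, as paths relative to root."""
    return sorted(
        p.relative_to(root) for p in root.rglob("inference.py") if p.is_file()
    )


def check_layout(root: Path) -> list[str]:
    """Return a list of layout problems; an empty list means the checks pass."""
    problems = []
    found = find_inference_scripts(root)
    if Path("inference.py") not in found:
        problems.append("inference.py is missing from the project root")
    for path in found:
        if path != Path("inference.py"):
            problems.append(f"extra copy found: {path.as_posix()}")
    return problems
```

Calling `check_layout(Path("."))` from the project root returns an empty list when the layout matches the checklist.
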
### Phase 4: inference.py Validation ✅

```python
# Checklist for inference.py
```

**Environment variables**:
- ✅ Reads `HF_TOKEN` via `os.getenv("HF_TOKEN")`
- ✅ **Validates** `HF_TOKEN` and raises an error if it is missing
- ✅ Reads `API_BASE_URL` with default `"https://router.huggingface.co/v1"`
- ✅ Reads `MODEL_NAME` with default `"Qwen/Qwen2.5-72B-Instruct"`
- ✅ Raises `ValueError` if `API_KEY`/`HF_TOKEN` is empty

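
One way to satisfy the environment-variable checks above is sketched below. It is an illustration rather than the project's actual code: the function name `load_config` and the returned dictionary keys are assumptions, while the variable names and defaults come from the checklist.

```python
import os
from typing import Mapping

DEFAULT_API_BASE_URL = "https://router.huggingface.co/v1"
DEFAULT_MODEL_NAME = "Qwen/Qwen2.5-72B-Instruct"


def load_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read settings from the environment (hypothetical helper).

    Raises ValueError when HF_TOKEN is missing or blank; falls back to the
    documented defaults for API_BASE_URL and MODEL_NAME.
    """
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise ValueError("HF_TOKEN is not set; export it before running inference.py")
    return {
        "api_key": token,
        "api_base_url": env.get("API_BASE_URL") or DEFAULT_API_BASE_URL,
        "model_name": env.get("MODEL_NAME") or DEFAULT_MODEL_NAME,
    }
```

Passing the environment as a parameter keeps the validation testable without touching the real process environment.
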
**OpenAI client**:
- ✅ Uses `from openai import OpenAI`
- ✅ Creates client: `OpenAI(base_url=API_BASE_URL, api_key=API_KEY)`
- ✅ No raw `urllib` calls for LLM
- ✅ No alternate SDKs or HTTP clients (`requests`, `httpx`, etc.) for LLM calls

**Output format**:
- ✅ Prints `[START]` at beginning
- ✅ Prints `[START]\nTask: <task>`
- ✅ Prints `[STEP]` after each action
- ✅ Prints `[STEP]\nAction: <action>\nReward: <value>`
- ✅ Rewards formatted to 2 decimals: `{reward:.2f}`
- ✅ Booleans as lowercase: `true` / `false` (not `True` / `False`)
- ✅ Prints `[END]` after `env.close()` or on exception
- ✅ Prints `[END]\nFinal Score: <score>`
- ✅ Step count matches actual steps executed

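
The formatting rules above can be captured in a few small helpers. This is a sketch under the assumption that each marker and its fields are printed as separate lines, as the list describes; the function names are hypothetical, and formatting the final score to two decimals is an assumption based on the reward rule.

```python
def format_bool(value: bool) -> str:
    """Render a boolean in lowercase, as the checklist requires."""
    return "true" if value else "false"


def format_start(task: str) -> str:
    """Opening block, printed once before the first step."""
    return f"[START]\nTask: {task}"


def format_step(action: str, reward: float) -> str:
    """One block per executed action, reward fixed to two decimals."""
    return f"[STEP]\nAction: {action}\nReward: {reward:.2f}"


def format_end(score: float) -> str:
    """Closing block; the two-decimal score format is an assumption."""
    return f"[END]\nFinal Score: {score:.2f}"
```

Printing `format_end(...)` from a `finally` clause is one way to make sure `[END]` appears even when an exception interrupts the run.
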
**Example valid output**:

```
[START]
Task: easy
[STEP]
Action: FIX_ENTRY 1
Reward: 0.10
[STEP]
Action: FIX_ENTRY 3
Reward: 0.15
[STEP]
Action: NO_OP
Reward: 0.00
[END]
Final Score: 0.85
```

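
Captured output can be checked against this format with a short validator. The sketch below assumes the exact line layout of the example above (marker line, then `Task:`, `Action:`, `Reward:`, and `Final Score:` lines); `validate_log` is a hypothetical helper, and the official grader may apply different or stricter rules.

```python
import re

REWARD_RE = re.compile(r"^Reward: -?\d+\.\d{2}$")
SCORE_RE = re.compile(r"^Final Score: -?\d+\.\d{2}$")


def validate_log(text: str) -> list[str]:
    """Return a list of format errors; an empty list means the log looks valid."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    errors = []
    if len(lines) < 4 or lines[0] != "[START]" or not lines[1].startswith("Task: "):
        errors.append("log must open with [START] followed by 'Task: <task>'")
    if len(lines) < 4 or lines[-2] != "[END]" or not SCORE_RE.match(lines[-1]):
        errors.append("log must close with [END] followed by 'Final Score: <x.xx>'")
    body = lines[2:-2]  # everything between the opening and closing blocks
    if len(body) % 3 != 0:
        errors.append("each [STEP] needs exactly one Action and one Reward line")
    for i in range(0, len(body) - len(body) % 3, 3):
        marker, action, reward = body[i:i + 3]
        if (
            marker != "[STEP]"
            or not action.startswith("Action: ")
            or not REWARD_RE.match(reward)
        ):
            errors.append(f"malformed step {i // 3 + 1}")
    return errors
```

For example, running `python inference.py > run.log` and then passing the file contents to `validate_log` gives a quick check before submission.
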
### Phase 5: requirements.txt ✅

```bash
pip install -r requirements.txt
```

**Check**:
- ✅ No syntax errors
- ✅ Contains: `openai>=1.30.0` (for OpenAI client)
- ✅ Contains: `fastapi>=0.111.0` (for server)
- ✅ Contains: `pydantic>=2.7.0` (for models)
- ✅ Contains: `uvicorn[standard]>=0.29.0` (for serving)
- ✅ Contains: `gradio>=4.0.0` (for demo)
- ✅ No unnecessary packages (keep lean)

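
A quick way to confirm that the packages listed above are declared is sketched below. It only checks package names, not version specifiers, and `missing_requirements` is a hypothetical helper written for this checklist.

```python
import re

REQUIRED = ("openai", "fastapi", "pydantic", "uvicorn", "gradio")


def missing_requirements(requirements_text: str, required=REQUIRED) -> list[str]:
    """Return the required package names that requirements.txt does not declare."""
    declared = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The package name is the leading run of name characters,
        # before any extras ("[standard]") or version specifier (">=...").
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            declared.add(match.group(0).lower().replace("_", "-"))
    return [name for name in required if name not in declared]
```

Reading the file with `Path("requirements.txt").read_text()` and passing the result to `missing_requirements` returns an empty list when all five packages are present.
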
### Phase 6: README.md ✅

**Required sections**:
- ✅ Title: "AuditRepairEnv++"
- ✅ Problem description (what problem does it solve?)
- ✅ Solution overview (how does it work?)
- ✅ Task explanation (easy/medium/hard)
- ✅ Setup instructions (local, Docker)
- ✅ How to run `inference.py`
- ✅ Baseline results / example output
- ✅ HF Spaces deployment steps
- ✅ Troubleshooting section
- ✅ License (MIT)

**Writing checklist**:
- ✅ Clear and concise
- ✅ Code examples work
- ✅ Commands are tested
- ✅ No broken links

### Phase 7: demo.py Validation ✅

```bash
export HF_TOKEN="hf_your_token"
python demo.py
```

**Check**:
- ✅ Gradio interface loads
- ✅ Accessible at `http://localhost:7860`
- ✅ Task dropdown selects (easy/medium/hard)
- ✅ "Run Inference" button works
- ✅ Output displays in textbox
- ✅ Dark/minimal aesthetic visible
- ✅ No JavaScript errors in browser console

### Phase 8: Dockerfile ✅

**Valid Dockerfile structure** (comments go on their own lines, because Docker does not treat a trailing `# ...` after an instruction as a comment):

```dockerfile
# Specified base image
FROM python:3.10-slim
# Set working directory
WORKDIR /app
# Copy code
COPY . .
# Install deps
RUN pip install -r requirements.txt
# Expose Gradio port
EXPOSE 7860
# Entry point
CMD ["python", "demo.py"]
```

**Check**:
- ✅ Base image specified (e.g., `python:3.10-slim`)
- ✅ Working directory set
- ✅ Dependencies installed with `pip install`
- ✅ Port exposed (7860)
- ✅ Entry CMD specified
- ✅ No hardcoded tokens/secrets
- ✅ `.dockerignore` excludes unnecessary files

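
Some of the Dockerfile checks above lend themselves to a rough automated lint. The sketch below is only a heuristic: `lint_dockerfile` is a hypothetical helper, and the token pattern (`hf_` followed by at least 20 alphanumeric characters) is an assumption about what a real Hugging Face token looks like, chosen so that placeholders such as `hf_your_token` are not flagged.

```python
import re

# Assumed shape of a real token; placeholders with underscores do not match.
TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{20,}")


def lint_dockerfile(text: str, port: int = 7860) -> list[str]:
    """Return a list of problems found in the Dockerfile text."""
    instructions = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    keywords = [line.split(None, 1)[0].upper() for line in instructions]
    problems = []
    for required in ("FROM", "WORKDIR", "CMD"):
        if required not in keywords:
            problems.append(f"missing {required} instruction")
    if not any(line.upper().split() == ["EXPOSE", str(port)] for line in instructions):
        problems.append(f"port {port} is not exposed")
    if TOKEN_RE.search(text):
        problems.append("possible hardcoded Hugging Face token")
    return problems
```

An empty result does not prove the image builds; it only covers the structural items in the list.
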
---

## GitHub Repository

### Phase 1: Repository Setup ✅

```bash
git init
git add .
git commit -m "Initial commit"
# Make sure the local branch is named main before pushing
git branch -M main
git remote add origin https://github.com/YOUR_USERNAME/audit-repair-env.git
git push -u origin main
```

**Check**:
- ✅ Repository is **PUBLIC**
- ✅ All code is committed
- ✅ `.gitignore` includes `.env`, `*.key`, `secrets/`
- ✅ No API keys in git history
- ✅ README visible on repo homepage
- ✅ Dockerfile present

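
The `.gitignore` item above can be checked with a few lines of Python. This sketch uses a deliberately naive exact-match comparison, so an equivalent but differently written pattern (for example `/.env`) would be reported as missing; `missing_ignore_patterns` is a hypothetical helper.

```python
REQUIRED_PATTERNS = (".env", "*.key", "secrets/")


def missing_ignore_patterns(gitignore_text: str, required=REQUIRED_PATTERNS) -> list[str]:
    """Return the required patterns that do not appear verbatim in .gitignore."""
    present = {
        line.strip()
        for line in gitignore_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return [pattern for pattern in required if pattern not in present]
```

It does not inspect git history; a committed key still has to be found and removed separately.
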
### Phase 2: Repository Contents ✅

```
✅ inference.py
✅ server.py
✅ tasks.py
✅ demo.py
✅ requirements.txt
✅ Dockerfile
✅ README.md
✅ HF_SPACES_GUIDE.md
✅ PITCH.md
✅ .gitignore
✅ LICENSE (MIT)
```

**Check**:
- ✅ 10+ commits (show development history)
- ✅ No personal info in commits
- ✅ Meaningful commit messages

---

## Hugging Face Spaces Deployment

### Phase 1: Spaces Creation ✅

1. Go to [huggingface.co/spaces/create](https://huggingface.co/spaces/create)
2. Fill:
   - **Owner**: Your HF username
   - **Space name**: `audit-repair-env`
   - **License**: MIT
   - **SDK**: Docker ← **IMPORTANT**
3. Click **"Create Space"**

**Check**:
- ✅ Space is created
- ✅ Space is PUBLIC
- ✅ URL format: `https://huggingface.co/spaces/your-username/audit-repair-env`

### Phase 2: GitHub Integration ✅

In **Space Settings**:

1. Scroll to **"Linked Repository"**
2. Click **"Link a repository"**
3. Select: `your-username/audit-repair-env`
4. Choose **"Sync"** mode (auto-rebuild on push)

**Check**:
- ✅ GitHub repo linked
- ✅ Sync enabled
- ✅ Branch: `main`

### Phase 3: Environment Secrets ✅

In **Space Settings → Repository secrets**:

```
HF_TOKEN = hf_actual_valid_token_here
API_BASE_URL = https://router.huggingface.co/v1
MODEL_NAME = Qwen/Qwen2.5-72B-Instruct
```

**Check**:
- ✅ HF_TOKEN is valid and has API permissions
- ✅ Secrets are NOT visible in logs
- ✅ Each secret on a separate line

### Phase 4: Build & Deploy ✅

1. Go to Space
2. Click **"Logs"** tab
3. Wait 5-10 minutes for build
4. Status changes from **"Building"** → **"Running"**

**Check**:
- ✅ Build succeeds (no errors in logs)
- ✅ Status is **"Running"**
- ✅ No warning signs:
  - ❌ `ImportError`
  - ❌ `ModuleNotFoundError`
  - ❌ `HF_TOKEN not set`
  - ❌ `Connection refused`

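
If the build log is copied out of the **Logs** tab into a text file, the warning signs listed above can be searched for automatically. A minimal sketch follows; the helper name `find_warning_signs` is hypothetical, and the strings are taken directly from the list.

```python
WARNING_SIGNS = (
    "ImportError",
    "ModuleNotFoundError",
    "HF_TOKEN not set",
    "Connection refused",
)


def find_warning_signs(log_text: str) -> dict[str, list[int]]:
    """Map each warning string to the 1-based log line numbers where it appears."""
    hits: dict[str, list[int]] = {}
    for number, line in enumerate(log_text.splitlines(), start=1):
        for sign in WARNING_SIGNS:
            if sign in line:
                hits.setdefault(sign, []).append(number)
    return hits
```

An empty dictionary means none of the four strings were found; it does not guarantee the build succeeded.
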
### Phase 5: Test Spaces ✅

1. Click **"App"** link in Space
2. You should see the Gradio interface
3. Try:
   - Select "easy" task
   - Click "Run Inference"
   - Wait for results

**Check**:
- ✅ Gradio interface loads
- ✅ No 502/504 errors
- ✅ Inference completes (5-30 sec depending on model)
- ✅ Output displays correctly
- ✅ Dark aesthetic visible

### Phase 6: Share Link ✅

Your Space public URL:

```
https://huggingface.co/spaces/your-username/audit-repair-env
```

**Check**:
- ✅ URL is accessible
- ✅ Anyone can view (no login required)
- ✅ App runs without errors

---

## Submission Content

### README Content Checklist

✅ **Title & Description**

```markdown
# AuditRepairEnv++
Budget-constrained RL for financial ledger repair
```

✅ **Problem Statement**
- Why does this matter?
- What real-world problem does it solve?

✅ **Solution Overview**
- What is AuditRepairEnv++?
- How does it work?

✅ **Technical Details**
- Observation space (JSON format)
- Action space (FIX_ENTRY, ADJUST_ENTRY, etc.)
- Reward function (how scoring works)

✅ **Tasks**
- Easy (5-8 entries)
- Medium (15-20 entries)
- Hard (30+ entries, hidden dependencies)

✅ **Setup Instructions**

```bash
pip install -r requirements.txt
export HF_TOKEN="hf_..."
python inference.py
```

✅ **Results / Baseline**

| Task | Score |
|------|-------|
| easy | 0.90 |
| medium | 0.70 |
| hard | 0.55 |

✅ **Deployment**
- Local: `python inference.py`
- Docker: `docker build . && docker run ...`
- HF Spaces: [link to Space]

✅ **License**

MIT License

### Pitch Content Checklist

- ✅ **30-second pitch** (problem + solution + impact)
- ✅ **2-minute pitch** (structured narrative)
- ✅ **Technical pitch** (for engineers/judges)
- ✅ **Key metrics** (success rate, efficiency, etc.)
- ✅ **Real-world application** (why it matters)
- ✅ **Comparison** (vs. other benchmarks/solutions)
- ✅ **Demo script** (how to show it off)

---

## Final Quality Checks

### Code Quality

- ✅ No syntax errors
- ✅ Follows PEP 8 where practical
- ✅ Comments explain non-obvious logic
- ✅ Error handling (try/except for network calls)
- ✅ No hardcoded secrets/tokens
- ✅ All imports are used

### Documentation Quality

- ✅ Clear and concise
- ✅ Code examples are tested
- ✅ Instructions are step-by-step
- ✅ Troubleshooting section included
- ✅ No typos or grammar errors
- ✅ Links are not broken

### User Experience

- ✅ Gradio interface is intuitive
- ✅ Dark theme is applied
- ✅ Output is readable
- ✅ Error messages are helpful
- ✅ Demo runs quickly (<30 sec)

### Submission Completeness

- ✅ All required files present
- ✅ GitHub repo is public
- ✅ HF Spaces is running
- ✅ README is comprehensive
- ✅ Pitch is compelling
- ✅ No sensitive data exposed

---

## Submission Checklist (Final)

Before you submit to the hackathon:

### Day Before Deadline

- [ ] **Code**: All local tests pass
- [ ] **GitHub**: All code pushed and repo is public
- [ ] **HF Spaces**: Build is complete and Space is running
- [ ] **README**: Updated with all required sections
- [ ] **PITCH**: Prepared and tested
- [ ] **Demo**: Works end-to-end without errors

### Day Of Deadline

- [ ] **Verify Links**
  - [ ] GitHub URL works: https://github.com/your-username/audit-repair-env
  - [ ] HF Spaces URL works: https://huggingface.co/spaces/your-username/audit-repair-env
  - [ ] Both are public/accessible
- [ ] **Test One More Time**
  - [ ] Inference script runs: `python inference.py`
  - [ ] Docker builds: `docker build .`
  - [ ] Demo loads in browser
  - [ ] Output format is correct
- [ ] **Prepare Presentation**
  - [ ] Pitch slides ready
  - [ ] Demo script prepared (which tasks to show)
  - [ ] Metrics/results visible
  - [ ] Story arc is clear
- [ ] **Submit**
  - [ ] GitHub URL submitted
  - [ ] HF Spaces URL submitted
  - [ ] README linked
  - [ ] Team members credited
  - [ ] All deadlines met

---

## Red Flags (🚩 Don't Do These)

❌ **File Structure**
- `src/inference.py` → Must be at root!
- `app/inference.py` → Must be at root!
- Multiple `inference.py` files → Keep only one at root

❌ **Missing Validation**
- HF_TOKEN not validated
- Missing default values
- Importing `openai` without listing it in `requirements.txt`

❌ **Output Format**
- Missing `[START]`, `[STEP]`, or `[END]`
- Rewards not to 2 decimals
- Booleans as `True`/`False` instead of `true`/`false`
- Step count doesn't match

❌ **Deployment**
- HF Spaces build fails (errors in the Logs tab)
- Space is private
- HF_TOKEN is hardcoded in Dockerfile
- Port is not 7860

❌ **Documentation**
- No README
- Pitch is unclear
- No setup instructions
- Broken links

---

## Success Criteria

✅ **Technical**
- [ ] `inference.py` at root validates and runs
- [ ] Output format is exactly correct
- [ ] HF_TOKEN validation works
- [ ] Docker builds successfully

✅ **Documentation**
- [ ] README explains problem & solution
- [ ] Setup instructions are clear
- [ ] Pitch is compelling

✅ **Deployment**
- [ ] GitHub repo is public
- [ ] HF Spaces is running and accessible
- [ ] Demo works end-to-end

✅ **Quality**
- [ ] Code has no obvious bugs
- [ ] Output is readable
- [ ] Instructions work (ideally tested by someone else)

---

## Resources

- [README.md](./README.md): Environment documentation
- [PITCH.md](./PITCH.md): How to pitch the project
- [HF_SPACES_GUIDE.md](./HF_SPACES_GUIDE.md): Detailed deployment guide
- [inference.py](./inference.py): Submission script
- [GitHub](https://github.com): Where to host code
- [Hugging Face Spaces](https://huggingface.co/spaces): Where to deploy

---

## Contact / Support

- **Questions**: Check HF_SPACES_GUIDE.md for troubleshooting
- **Issues**: File bug reports on GitHub
- **Feedback**: Help improve the environment!

---

**Last updated**: April 2025

**Status**: Ready for submission ✅

---

**Print this checklist and check off as you go!**