# Submission Checklist: AuditRepairEnv++
**Deadline**: [Your hackathon date]
**Status**: Pre-submission validation

---
## Pre-Submission Technical Validation
### Phase 1: Local Validation ✅
Before pushing to GitHub, verify locally:
```bash
# 1. Test inference script
export HF_TOKEN="hf_your_test_token"
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export ENV_BASE_URL="http://localhost:7860"
# Start server in one terminal
python server.py
# In another terminal, test inference
python inference.py
```
**Check**:
- ✅ No import errors
- ✅ `[START]` printed
- ✅ `[STEP]` printed per step
- ✅ `[END]` printed at end
- ✅ Rewards formatted to 2 decimals
- ✅ Correct step count
### Phase 2: Docker Validation ✅
```bash
# Build Docker image
docker build -t audit-repair-env:latest .
# Run container
docker run -p 7860:7860 \
-e HF_TOKEN="hf_your_token" \
-e API_BASE_URL="https://router.huggingface.co/v1" \
-e MODEL_NAME="Qwen/Qwen2.5-72B-Instruct" \
audit-repair-env:latest
# Test in new terminal
curl -X POST http://localhost:7860/reset \
-d '{"task_id":"easy"}' \
-H "Content-Type: application/json"
```
**Check**:
- ✅ Docker builds without errors
- ✅ Container starts
- ✅ `/reset` endpoint responds
- ✅ Logs visible in container output
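
The `curl` call above can also be scripted. The following is a minimal sketch using only the standard library to talk to the environment server (not the LLM); the helper names `build_reset_request` and `reset` are hypothetical, and it assumes `/reset` answers with a JSON body, which this checklist does not confirm.

```python
import json
import urllib.request


def build_reset_request(base_url, task_id):
    """Build the same POST /reset request that the curl command sends."""
    body = json.dumps({"task_id": task_id}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/reset",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reset(base_url="http://localhost:7860", task_id="easy", timeout=10):
    """Send the request; assumes the server replies with JSON (unverified)."""
    request = build_reset_request(base_url, task_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


# Building the request needs no running server, so it is safe to inspect.
request = build_reset_request("http://localhost:7860", "easy")
print(request.get_method(), request.full_url)
```

Call `reset()` only once the container is running; a refused connection raises `urllib.error.URLError`.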
### Phase 3: File Structure ✅
```
project-root/
├── inference.py          ← MUST be at root (not subfolder)
├── requirements.txt      ← All dependencies listed
├── README.md             ← Clear setup + usage
├── demo.py               ← Gradio interface
├── Dockerfile            ← Present & valid
├── server.py             ← Environment server
├── tasks.py              ← Task definitions
├── HF_SPACES_GUIDE.md    ← Deployment guide
├── PITCH.md              ← Project pitch
└── [other supporting files]
```
**Check**:
- ✅ `inference.py` is at project root (not `src/` or `app/`)
- ✅ No `.py` files in subfolders are named `inference.py`
- ✅ All files committed to git
- ✅ `.gitignore` excludes secrets/tokens
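
The first two checks above lend themselves to a quick script. This is a minimal sketch, not part of the project; the function name `check_inference_location` and the demo directory layout are made up for illustration.

```python
import tempfile
from pathlib import Path


def check_inference_location(root):
    """Return a list of problems with where inference.py lives (empty = OK)."""
    root = Path(root)
    problems = []
    if not (root / "inference.py").is_file():
        problems.append("inference.py is missing from the project root")
    # Any other inference.py below the root counts as a duplicate.
    for path in sorted(root.rglob("inference.py")):
        relative = path.relative_to(root)
        if len(relative.parts) > 1 and ".git" not in relative.parts:
            problems.append(f"extra copy found: {relative.as_posix()}")
    return problems


# Demo on a throwaway directory tree (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "inference.py").write_text("print('ok')\n")
    (project / "src").mkdir()
    (project / "src" / "inference.py").write_text("print('duplicate')\n")
    print(check_inference_location(project))
    # prints ['extra copy found: src/inference.py']
```

Run it against the real checkout with `check_inference_location(".")`; an empty list means both checks pass.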
### Phase 4: inference.py Validation ✅
```python
# Checklist for inference.py
```
**Environment variables**:
- ✅ Reads `HF_TOKEN` from `os.getenv("HF_TOKEN")`
- ✅ **Validates** `HF_TOKEN` and raises an error if it is missing
- ✅ Reads `API_BASE_URL` with default `"https://router.huggingface.co/v1"`
- ✅ Reads `MODEL_NAME` with default `"Qwen/Qwen2.5-72B-Instruct"`
- ✅ Raises `ValueError` if `API_KEY`/`HF_TOKEN` is empty
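
The environment-variable rules above could be implemented along these lines. This is a sketch under stated assumptions, not the project's actual `inference.py`: the function name `load_config` and the returned dictionary keys are hypothetical, while the variable names and defaults come from this checklist.

```python
import os

DEFAULT_API_BASE_URL = "https://router.huggingface.co/v1"
DEFAULT_MODEL_NAME = "Qwen/Qwen2.5-72B-Instruct"


def load_config(env=None):
    """Read settings from the environment, failing fast on a missing token."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if not token:
        raise ValueError("HF_TOKEN is not set; export it before running inference.py")
    return {
        "api_key": token,
        # Fall back to the documented defaults when the variables are unset.
        "base_url": env.get("API_BASE_URL") or DEFAULT_API_BASE_URL,
        "model": env.get("MODEL_NAME") or DEFAULT_MODEL_NAME,
    }


# Demo with an explicit mapping so it does not depend on the real environment.
print(load_config({"HF_TOKEN": "hf_your_test_token"})["model"])
# prints Qwen/Qwen2.5-72B-Instruct
```

Passing a mapping instead of reading `os.environ` directly keeps the validation easy to test.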
**OpenAI client**:
- ✅ Uses `from openai import OpenAI`
- ✅ Creates client: `OpenAI(base_url=API_BASE_URL, api_key=API_KEY)`
- ✅ No raw `urllib` calls for LLM
- ✅ No alternate SDKs or raw HTTP clients for LLM calls (not `requests`, `httpx`, etc.)
**Output format**:
- ✅ Prints `[START]` at beginning
- ✅ Prints `[START]\nTask: <task>`
- ✅ Prints `[STEP]` after each action
- ✅ Prints `[STEP]\nAction: <action>\nReward: <value>`
- ✅ Rewards formatted to 2 decimals: `{reward:.2f}`
- ✅ Booleans as lowercase: `true` / `false` (not `True` / `False`)
- ✅ Prints `[END]` after `env.close()` or on exception
- ✅ Prints `[END]\nFinal Score: <score>`
- ✅ Step count matches actual steps executed
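
One way to keep the output format consistent is to route every log line through small formatting helpers. The sketch below is illustrative only: the helper names are hypothetical, and formatting the final score to 2 decimals is an assumption (the list above only requires it for rewards).

```python
def format_start(task):
    """Opening block: [START] followed by the task name."""
    return f"[START]\nTask: {task}"


def format_step(action, reward):
    """One block per executed action, reward fixed to 2 decimals."""
    return f"[STEP]\nAction: {action}\nReward: {reward:.2f}"


def format_bool(value):
    """Lowercase booleans: true/false rather than Python's True/False."""
    return "true" if value else "false"


def format_end(score):
    """Closing block; 2-decimal score is an assumption, see the note above."""
    return f"[END]\nFinal Score: {score:.2f}"


print(format_start("easy"))
print(format_step("FIX_ENTRY 1", 0.1))
print(format_end(0.85))
```

Calling `format_end` from a `finally:` block is one way to make sure `[END]` is still printed when an exception interrupts the episode.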
**Example valid output**:
```
[START]
Task: easy
[STEP]
Action: FIX_ENTRY 1
Reward: 0.10
[STEP]
Action: FIX_ENTRY 3
Reward: 0.15
[STEP]
Action: NO_OP
Reward: 0.00
[END]
Final Score: 0.85
```
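
Captured output can be checked against this format automatically. The following is a rough sketch of such a validator, assuming exactly the layout shown in the example above; the name `validate_log` is hypothetical and the real hackathon grader may apply different rules.

```python
import re
from typing import List, Optional

REWARD_RE = re.compile(r"^Reward: -?\d+\.\d{2}$")

SAMPLE = """[START]
Task: easy
[STEP]
Action: FIX_ENTRY 1
Reward: 0.10
[STEP]
Action: FIX_ENTRY 3
Reward: 0.15
[STEP]
Action: NO_OP
Reward: 0.00
[END]
Final Score: 0.85
"""


def validate_log(text: str, expected_steps: Optional[int] = None) -> List[str]:
    """Return a list of format problems; an empty list means the log looks valid."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    problems = []
    if lines[:1] != ["[START]"]:
        problems.append("log must begin with [START]")
    elif len(lines) < 2 or not lines[1].startswith("Task: "):
        problems.append("[START] must be followed by 'Task: <task>'")
    if lines[-2:-1] != ["[END]"] or not lines[-1].startswith("Final Score: "):
        problems.append("log must finish with [END] and 'Final Score: <score>'")
    steps = 0
    for i, line in enumerate(lines):
        if line != "[STEP]":
            continue
        steps += 1
        body = lines[i + 1:i + 3]  # the Action and Reward lines of this step
        if len(body) < 2 or not body[0].startswith("Action: "):
            problems.append(f"step {steps}: missing 'Action: <action>'")
        elif not REWARD_RE.match(body[1]):
            problems.append(f"step {steps}: reward is not formatted to 2 decimals")
    if expected_steps is not None and steps != expected_steps:
        problems.append(f"expected {expected_steps} steps, found {steps}")
    return problems


print(validate_log(SAMPLE, expected_steps=3))  # prints []
```

To check a real run, redirect the script's output to a file (for example `python inference.py > run.log`) and pass the file contents to `validate_log`.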
### Phase 5: requirements.txt ✅
```bash
pip install -r requirements.txt
```
**Check**:
- ✅ No syntax errors
- ✅ Contains: `openai>=1.30.0` (for OpenAI client)
- ✅ Contains: `fastapi>=0.111.0` (for server)
- ✅ Contains: `pydantic>=2.7.0` (for models)
- ✅ Contains: `uvicorn[standard]>=0.29.0` (for serving)
- ✅ Contains: `gradio>=4.0.0` (for demo)
- ✅ No unnecessary packages (keep lean)
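
A small script can confirm that the packages listed above are present. This is a sketch with a hypothetical helper name, `missing_requirements`; it checks package names only and does not verify the minimum versions.

```python
import re

REQUIRED = ("openai", "fastapi", "pydantic", "uvicorn", "gradio")


def missing_requirements(text, required=REQUIRED):
    """Return the required package names that requirements.txt does not list."""
    found = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The package name is everything before extras or version specifiers.
        name = re.split(r"[\[<>=!~; ]", line, maxsplit=1)[0]
        found.add(name.lower().replace("_", "-"))
    return [pkg for pkg in required if pkg not in found]


example = """openai>=1.30.0
fastapi>=0.111.0
pydantic>=2.7.0
uvicorn[standard]>=0.29.0
"""
print(missing_requirements(example))  # prints ['gradio']
```

For the real file, pass `open("requirements.txt").read()` as the argument.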
### Phase 6: README.md ✅
**Required sections**:
- ✅ Title: "AuditRepairEnv++"
- ✅ Problem description (what problem does it solve?)
- ✅ Solution overview (how does it work?)
- ✅ Task explanation (easy/medium/hard)
- ✅ Setup instructions (local, Docker)
- ✅ How to run `inference.py`
- ✅ Baseline results / example output
- ✅ HF Spaces deployment steps
- ✅ Troubleshooting section
- ✅ License (MIT)
**Writing checklist**:
- ✅ Clear and concise
- ✅ Code examples work
- ✅ Commands are tested
- ✅ No broken links
### Phase 7: demo.py Validation ✅
```bash
export HF_TOKEN="hf_your_token"
python demo.py
```
**Check**:
- ✅ Gradio interface loads
- ✅ Accessible at `http://localhost:7860`
- ✅ Task dropdown selects (easy/medium/hard)
- ✅ "Run Inference" button works
- ✅ Output displays in textbox
- ✅ Dark/minimal aesthetic visible
- ✅ No JavaScript errors in browser console
### Phase 8: Dockerfile ✅
**Valid Dockerfile structure**:
```dockerfile
# Base image
FROM python:3.10-slim
# Working directory
WORKDIR /app
# Copy code
COPY . .
# Install dependencies
RUN pip install -r requirements.txt
# Expose the Gradio port
EXPOSE 7860
# Entry point
CMD ["python", "demo.py"]
```
**Check**:
- ✅ Base image specified (e.g., `python:3.10-slim`)
- ✅ Working directory set
- ✅ Dependencies installed with `pip install`
- ✅ Port exposed (7860)
- ✅ Entry CMD specified
- ✅ No hardcoded tokens/secrets
- ✅ `.dockerignore` excludes unnecessary files
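
The "no hardcoded tokens" check can be approximated with a pattern scan. This is a heuristic sketch: the `hf_` prefix follows the placeholders used in this checklist, but the minimum length of 20 characters and the name `find_hardcoded_tokens` are assumptions, so a clean result is not proof that no secret is present.

```python
import re

# Assumed shape of a real token: "hf_" followed by 20+ alphanumeric characters.
TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{20,}")


def find_hardcoded_tokens(text):
    """Return the 1-based line numbers that contain a token-like string."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if TOKEN_RE.search(line)
    ]


dockerfile = "FROM python:3.10-slim\nENV HF_TOKEN=hf_" + "a" * 30 + "\n"
print(find_hardcoded_tokens(dockerfile))  # prints [2]
```

The same function can be pointed at any text file, for example the contents of `Dockerfile` or `demo.py`.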
---
## GitHub Repository
### Phase 1: Repository Setup ✅
```bash
git init
git add .
git commit -m "Initial commit"
# Name the branch "main" in case git defaulted to "master"
git branch -M main
git remote add origin https://github.com/YOUR_USERNAME/audit-repair-env.git
git push -u origin main
```
**Check**:
- ✅ Repository is **PUBLIC**
- ✅ All code is committed
- ✅ `.gitignore` includes `.env`, `*.key`, `secrets/`
- ✅ No API keys in git history
- ✅ README visible on repo homepage
- ✅ Dockerfile present
### Phase 2: Repository Contents ✅
```
✅ inference.py
✅ server.py
✅ tasks.py
✅ demo.py
✅ requirements.txt
✅ Dockerfile
✅ README.md
✅ HF_SPACES_GUIDE.md
✅ PITCH.md
✅ .gitignore
✅ LICENSE (MIT)
```
**Check**:
- ✅ 10+ commits (show development history)
- ✅ No personal info in commits
- ✅ Meaningful commit messages
---
## Hugging Face Spaces Deployment
### Phase 1: Spaces Creation ✅
1. Go to [huggingface.co/spaces/create](https://huggingface.co/spaces/create)
2. Fill:
   - **Owner**: Your HF username
   - **Space name**: `audit-repair-env`
   - **License**: MIT
   - **SDK**: Docker ← **IMPORTANT**
3. Click **"Create Space"**

**Check**:
- ✅ Space is created
- ✅ Space is PUBLIC
- ✅ URL format: `https://huggingface.co/spaces/your-username/audit-repair-env`
### Phase 2: GitHub Integration ✅
In **Space Settings**:
1. Scroll to **"Linked Repository"**
2. Click **"Link a repository"**
3. Select: `your-username/audit-repair-env`
4. Choose **"Sync"** mode (auto-rebuild on push)
**Check**:
- ✅ GitHub repo linked
- ✅ Sync enabled
- ✅ Branch: `main`
### Phase 3: Environment Secrets ✅
In **Space Settings → Repository secrets**:
```
HF_TOKEN = hf_actual_valid_token_here
API_BASE_URL = https://router.huggingface.co/v1
MODEL_NAME = Qwen/Qwen2.5-72B-Instruct
```
**Check**:
- ✅ HF_TOKEN is valid and has API permissions
- ✅ Secrets are NOT visible in logs
- ✅ Each secret on separate line
### Phase 4: Build & Deploy ✅
1. Go to Space
2. Click **"Logs"** tab
3. Wait 5-10 minutes for build
4. Status changes from **"Building"** → **"Running"**

**Check**:
- ✅ Build succeeds (no errors in logs)
- ✅ Status is **"Running"**
- ✅ No warning signs:
  - ❌ `ImportError`
  - ❌ `ModuleNotFoundError`
  - ❌ `HF_TOKEN not set`
  - ❌ `Connection refused`
### Phase 5: Test Spaces ✅
1. Click **"App"** link in Space
2. You should see Gradio interface
3. Try:
   - Select "easy" task
   - Click "Run Inference"
   - Wait for results

**Check**:
- ✅ Gradio interface loads
- ✅ No 502/504 errors
- ✅ Inference completes (5-30 sec depending on model)
- ✅ Output displays correctly
- ✅ Dark aesthetic visible
### Phase 6: Share Link ✅
Your Space public URL:
```
https://huggingface.co/spaces/your-username/audit-repair-env
```
**Check**:
- ✅ URL is accessible
- ✅ Anyone can view (no login required)
- ✅ App runs without errors
---
## Submission Content
### README Content Checklist
✅ **Title & Description**
```markdown
# AuditRepairEnv++
Budget-constrained RL for financial ledger repair
```
✅ **Problem Statement**
- Why does this matter?
- What real-world problem does it solve?

✅ **Solution Overview**
- What is AuditRepairEnv++?
- How does it work?

✅ **Technical Details**
- Observation space (JSON format)
- Action space (FIX_ENTRY, ADJUST_ENTRY, etc.)
- Reward function (how scoring works)

✅ **Tasks**
- Easy (5-8 entries)
- Medium (15-20 entries)
- Hard (30+ entries, hidden dependencies)

✅ **Setup Instructions**
```bash
pip install -r requirements.txt
export HF_TOKEN="hf_..."
python inference.py
```
✅ **Results / Baseline**

| Task | Score |
|------|-------|
| easy | 0.90 |
| medium | 0.70 |
| hard | 0.55 |

✅ **Deployment**
- Local: `python inference.py`
- Docker: `docker build . && docker run ...`
- HF Spaces: [link to Space]

✅ **License**
MIT License
### Pitch Content Checklist
- ✅ **30-second pitch** (problem + solution + impact)
- ✅ **2-minute pitch** (structured narrative)
- ✅ **Technical pitch** (for engineers/judges)
- ✅ **Key metrics** (success rate, efficiency, etc.)
- ✅ **Real-world application** (why it matters)
- ✅ **Comparison** (vs. other benchmarks/solutions)
- ✅ **Demo script** (how to show it off)

---
## Final Quality Checks
### Code Quality
- ✅ No syntax errors
- ✅ Follows PEP 8 where practical
- ✅ Comments explain non-obvious logic
- ✅ Error handling (try/except for network calls)
- ✅ No hardcoded secrets/tokens
- ✅ All imports are used
### Documentation Quality
- ✅ Clear and concise
- ✅ Code examples are tested
- ✅ Instructions are step-by-step
- ✅ Troubleshooting section included
- ✅ No typos or grammar errors
- ✅ Links are not broken
### User Experience
- ✅ Gradio interface is intuitive
- ✅ Dark theme is applied
- ✅ Output is readable
- ✅ Error messages are helpful
- ✅ Demo runs quickly (<30 sec)
### Submission Completeness
- ✅ All required files present
- ✅ GitHub repo is public
- ✅ HF Spaces is running
- ✅ README is comprehensive
- ✅ Pitch is compelling
- ✅ No sensitive data exposed
---
## Submission Checklist (Final)
Before you submit to the hackathon:
### Day Before Deadline
- [ ] **Code**: All local tests pass
- [ ] **GitHub**: All code pushed and repo is public
- [ ] **HF Spaces**: Build is complete and Space is running
- [ ] **README**: Updated with all required sections
- [ ] **PITCH**: Prepared and tested
- [ ] **Demo**: Works end-to-end without errors
### Day Of Deadline
- [ ] **Verify Links**
  - [ ] GitHub URL works: https://github.com/your-username/audit-repair-env
  - [ ] HF Spaces URL works: https://huggingface.co/spaces/your-username/audit-repair-env
  - [ ] Both are public/accessible
- [ ] **Test One More Time**
  - [ ] Inference script runs: `python inference.py`
  - [ ] Docker builds: `docker build .`
  - [ ] Demo loads in browser
  - [ ] Output format is correct
- [ ] **Prepare Presentation**
  - [ ] Pitch slides ready
  - [ ] Demo script prepared (which tasks to show)
  - [ ] Metrics/results visible
  - [ ] Story arc is clear
- [ ] **Submit**
  - [ ] GitHub URL submitted
  - [ ] HF Spaces URL submitted
  - [ ] README linked
  - [ ] Team members credited
  - [ ] All deadlines met
---
## Red Flags (🚩 Don't Do These)
❌ **File Structure**
- `src/inference.py`: must be at root!
- `app/inference.py`: must be at root!
- Multiple `inference.py` files: keep only one, at root

❌ **Missing Validation**
- HF_TOKEN not validated
- Missing default values for `API_BASE_URL` / `MODEL_NAME`
- Using `openai` without listing it in requirements.txt

❌ **Output Format**
- Missing `[START]`, `[STEP]`, or `[END]`
- Rewards not formatted to 2 decimals
- Booleans as `True`/`False` instead of `true`/`false`
- Step count doesn't match the steps actually executed

❌ **Deployment**
- HF Spaces build fails (errors in the Logs tab)
- Space is private
- HF_TOKEN is hardcoded in Dockerfile
- Port is not 7860

❌ **Documentation**
- No README
- Pitch is unclear
- No setup instructions
- Broken links
---
## Success Criteria
✅ **Technical**
- [ ] `inference.py` at root validates and runs
- [ ] Output format is exactly correct
- [ ] HF_TOKEN validation works
- [ ] Docker builds successfully

✅ **Documentation**
- [ ] README explains problem & solution
- [ ] Setup instructions are clear
- [ ] Pitch is compelling

✅ **Deployment**
- [ ] GitHub repo is public
- [ ] HF Spaces is running and accessible
- [ ] Demo works end-to-end

✅ **Quality**
- [ ] Code has no obvious bugs
- [ ] Output is readable
- [ ] Instructions work (tested by someone else ideally)
---
## Resources
- [README.md](./README.md): Environment documentation
- [PITCH.md](./PITCH.md): How to pitch the project
- [HF_SPACES_GUIDE.md](./HF_SPACES_GUIDE.md): Detailed deployment guide
- [inference.py](./inference.py): Submission script
- [GitHub](https://github.com) β€” Where to host code
- [Hugging Face Spaces](https://huggingface.co/spaces) β€” Where to deploy
---
## Contact / Support
- **Questions**: Check HF_SPACES_GUIDE.md for troubleshooting
- **Issues**: File bug reports on GitHub
- **Feedback**: Help improve the environment!
---
**Last updated**: April 2025
**Status**: Ready for submission ✅

---

**📋 Print this checklist and check off as you go!**