add task.md
TASK.md
ADDED
@@ -0,0 +1,71 @@
# PaperProf - Task List
Build Small Hackathon - June 5–15, 2026

## ✅ Done
- Project structure created (app.py, core/, model/, ui/)
- core/parser.py: PDF text extraction with PyMuPDF
- core/chunker.py: paragraph-based text chunking with min/max word caps
- core/questioner.py: LLM-based question generation with professor prompt
- core/evaluator.py: LLM-based answer evaluation with tutor prompt
- model/llm.py: MiniCPM4-8B singleton wrapper via HuggingFace Transformers
- app.py: full Gradio UI with PDF upload, question generation, answer evaluation
- requirements.txt: all dependencies listed
- README.md: project description and file structure
- .gitignore: venv, models, cache excluded
- ZeroGPU enabled on HuggingFace Space (free RTX Pro 6000 Blackwell)
- Space live at: https://huggingface.co/spaces/build-small-hackathon/PaperProf
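
The paragraph-based chunking with min/max word caps listed for core/chunker.py could look roughly like the sketch below. This is an illustration only, not the actual contents of core/chunker.py: the function name `chunk_paragraphs` and the default caps are assumptions made for the example.

```python
import re


def chunk_paragraphs(text: str, min_words: int = 40, max_words: int = 200) -> list[str]:
    """Hypothetical sketch: split on blank lines, merge short paragraphs, cap long ones."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.split() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[list[str]] = []
    buffer: list[str] = []
    for words in paragraphs:
        buffer.extend(words)
        # Emit full-size chunks while the buffer exceeds the max cap.
        while len(buffer) > max_words:
            chunks.append(buffer[:max_words])
            buffer = buffer[max_words:]
        # The buffer now fits the max cap; flush it once it reaches the min cap.
        if len(buffer) >= min_words:
            chunks.append(buffer)
            buffer = []
    if buffer:
        # Trailing remainder below the min cap: merge into the last chunk if it fits.
        if chunks and len(chunks[-1]) + len(buffer) <= max_words:
            chunks[-1] = chunks[-1] + buffer
        else:
            chunks.append(buffer)
    return [" ".join(chunk) for chunk in chunks]
```

In this sketch the max cap is strict, while the min cap is best effort: a short leftover at the end of the document may still become its own chunk.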

## 🔲 To Do

### 🧪 Testing & Bug Fixes
- [ ] End-to-end test: upload real PDF → load → generate question → answer → get feedback
- [ ] Handle edge cases: empty PDF, scanned PDF (no text), very short PDF
- [ ] Handle model loading errors gracefully in the UI
- [ ] Test with multiple PDF types (slides, textbook chapters, lecture notes)
- [ ] Fix any ZeroGPU cold start issues (model takes time to load)
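
One possible way to approach the PDF edge cases above is to classify the extracted text before any question is generated. The sketch below assumes the parser already returns one string per page; the name `classify_extraction`, the status labels, and the 50-word threshold are hypothetical and not taken from the project code.

```python
def classify_extraction(page_texts: list[str], min_words: int = 50) -> str:
    """Hypothetical helper: label a parsed PDF as 'empty', 'no_text', 'too_short', or 'ok'."""
    if not page_texts:
        return "empty"  # the PDF contains no pages at all
    word_count = sum(len(text.split()) for text in page_texts)
    if word_count == 0:
        return "no_text"  # pages exist but carry no text layer, likely a scan
    if word_count < min_words:
        return "too_short"  # not enough material to build a quiz from
    return "ok"


# Assumed user-facing messages, shown in the UI instead of a stack trace.
USER_MESSAGES = {
    "empty": "This PDF has no pages.",
    "no_text": "No selectable text found. The PDF may be scanned.",
    "too_short": "The document is too short to generate questions.",
}
```

The UI could look up `USER_MESSAGES` for any status other than `"ok"` and stop before calling the model.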

### 🎨 UI/Design (Badge: Off-Brand)
- [ ] Replace default Gradio theme with fully custom CSS using gr.Server
- [ ] Add PaperProf logo and branding
- [ ] Add progress indicator when model is generating
- [ ] Add session score tracker (X/Y correct answers)
- [ ] Add difficulty selector (Easy / Normal / Hard mode)
- [ ] Add language selector (French / English)
- [ ] Show source chunk used for the question (collapsible)
- [ ] Add end-of-session summary screen with score and weak areas
- [ ] Make the UI mobile-friendly

### 🧠 ML/Model Improvements
- [ ] Add quiz modes: Quick (5 questions) / Full session / Brutal mode
- [ ] Implement adaptive difficulty: revisit failed concepts, skip mastered ones
- [ ] Add chunk relevance scoring to pick the most important chunks first
- [ ] Support multiple PDF uploads in one session
- [ ] Add support for plain text (.txt) and markdown (.md) files
- [ ] Cache parsed chunks in session state to avoid re-parsing on reload
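
The adaptive difficulty item above (revisit failed concepts, skip mastered ones) could be prototyped with a small scheduler over chunk indices. Everything in this sketch is an assumption for illustration: the class names, the rule that two consecutive correct answers count as mastery, and the priority order (failed first, then unseen, then partially learned).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChunkStats:
    attempts: int = 0
    streak: int = 0  # consecutive correct answers on this chunk


@dataclass
class AdaptiveScheduler:
    """Hypothetical scheduler: chunks are identified by their index."""
    num_chunks: int
    mastery_streak: int = 2  # assumed mastery threshold
    stats: dict = field(default_factory=dict)

    def record(self, chunk_id: int, correct: bool) -> None:
        s = self.stats.setdefault(chunk_id, ChunkStats())
        s.attempts += 1
        s.streak = s.streak + 1 if correct else 0

    def next_chunk(self) -> Optional[int]:
        candidates = []
        for cid in range(self.num_chunks):
            s = self.stats.get(cid, ChunkStats())
            if s.streak >= self.mastery_streak:
                continue  # mastered: skip
            failed = s.attempts > 0 and s.streak == 0
            # Lower priority value is asked sooner.
            priority = 0 if failed else (1 if s.attempts == 0 else 2)
            candidates.append((priority, s.attempts, cid))
        # None means every chunk is mastered and the session can end.
        return min(candidates)[2] if candidates else None
```

A session loop would call `next_chunk()`, generate a question from that chunk, then pass the evaluator's verdict to `record()`.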

### 🏅 Bonus Quests
- [ ] Off the Grid: verify zero external API calls (all local/ZeroGPU)
- [ ] 🎯 Well-Tuned: fine-tune MiniCPM on educational Q&A data using Modal credits ($250)
  - [ ] Find/create educational Q&A dataset on HuggingFace
  - [ ] Write fine-tuning script with LoRA/QLoRA
  - [ ] Run fine-tuning on Modal GPU
  - [ ] Publish fine-tuned model on HuggingFace under build-small-hackathon org
  - [ ] Update model/llm.py to use fine-tuned model
- [ ] 🎨 Off-Brand: fully custom Gradio frontend (see UI section above)
- [ ] 🦙 Llama Champion: switch inference to llama.cpp runtime
  - [ ] Download GGUF version of MiniCPM4-8B
  - [ ] Replace transformers pipeline with llama-cpp-python
  - [ ] Test performance vs transformers
- [ ] Sharing is Caring: export and share agent trace on HuggingFace Hub
- [ ] Field Notes: write blog post about what we built and learned
  - [ ] Document architecture decisions
  - [ ] Include benchmark results (speed, quality)
  - [ ] Publish on HuggingFace blog or personal blog

### 📦 Submission Checklist (deadline: June 15, 2026)
- [ ] App running and stable on HuggingFace Space
- [ ] Demo video (~2 minutes) showing full flow: upload → question → answer → feedback
- [ ] Social media post (LinkedIn + Twitter) with Space link and demo
- [ ] Blog post / Field Notes published
- [ ] Submission form filled on HuggingFace
- [ ] All badge requirements verified