Ryadg committed on
Commit 76e80aa · 1 Parent(s): bae929c

add task.md

  1. TASK.md +71 -0
TASK.md ADDED
@@ -0,0 +1,71 @@
+ # PaperProf: Task List
+ Build Small Hackathon, June 5–15, 2026
+
+ ## ✅ Done
+ - Project structure created (app.py, core/, model/, ui/)
+ - core/parser.py: PDF text extraction with PyMuPDF
+ - core/chunker.py: paragraph-based text chunking with min/max word caps
+ - core/questioner.py: LLM-based question generation with professor prompt
+ - core/evaluator.py: LLM-based answer evaluation with tutor prompt
+ - model/llm.py: MiniCPM4-8B singleton wrapper via HuggingFace Transformers
+ - app.py: full Gradio UI with PDF upload, question generation, answer evaluation
+ - requirements.txt: all dependencies listed
+ - README.md: project description and file structure
+ - .gitignore: venv, models, cache excluded
+ - ZeroGPU enabled on HuggingFace Space (free RTX Pro 6000 Blackwell)
+ - Space live at: https://huggingface.co/spaces/build-small-hackathon/PaperProf
+
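The paragraph-based chunking with min/max word caps listed for core/chunker.py could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name `chunk_paragraphs` and the default caps of 40 and 200 words are assumptions made for illustration.

```python
import re
from typing import List


def chunk_paragraphs(text: str, min_words: int = 40, max_words: int = 200) -> List[str]:
    """Split text on blank lines, merge short paragraphs, split long ones.

    Hypothetical helper: the name and the default caps are illustrative.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: List[str] = []
    buffer: List[str] = []  # words still waiting to reach min_words

    for paragraph in paragraphs:
        buffer.extend(paragraph.split())
        # Emit full-size chunks while the buffer exceeds the upper cap.
        while len(buffer) > max_words:
            chunks.append(" ".join(buffer[:max_words]))
            buffer = buffer[max_words:]
        # Emit the buffer once it is long enough to stand on its own.
        if len(buffer) >= min_words:
            chunks.append(" ".join(buffer))
            buffer = []

    # A trailing remainder shorter than min_words joins the last chunk when
    # that stays within max_words; otherwise it is kept as its own chunk.
    if buffer:
        if chunks and len(chunks[-1].split()) + len(buffer) <= max_words:
            chunks[-1] = chunks[-1] + " " + " ".join(buffer)
        else:
            chunks.append(" ".join(buffer))
    return chunks
```

Merging short paragraphs forward keeps headings and one-line captions attached to the text that follows them, which tends to give the question generator more context per chunk.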
+ ## 🔲 To Do
+
+ ### 🧪 Testing & Bug Fixes
+ - [ ] End-to-end test: upload real PDF → load → generate question → answer → get feedback
+ - [ ] Handle edge cases: empty PDF, scanned PDF (no text), very short PDF
+ - [ ] Handle model loading errors gracefully in the UI
+ - [ ] Test with multiple PDF types (slides, textbook chapters, lecture notes)
+ - [ ] Fix any ZeroGPU cold start issues (model takes time to load)
+
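The PDF edge cases in the list above (empty, scanned with no text layer, very short) could be caught with one check on the extracted text before any question is generated. The sketch below assumes the parser returns one string per page; the function name `classify_extraction`, the status labels, and the 50-word threshold are all hypothetical.

```python
from typing import List


def classify_extraction(pages: List[str], min_words: int = 50) -> str:
    """Return "empty", "no_text", "too_short", or "ok" for extracted page texts.

    Hypothetical helper: assumes the parser yields one string per PDF page.
    """
    if not pages:
        return "empty"  # the file has no pages at all
    word_count = sum(len(page.split()) for page in pages)
    if word_count == 0:
        return "no_text"  # pages exist but carry no text layer, e.g. a scan
    if word_count < min_words:
        return "too_short"  # not enough material to build questions from
    return "ok"
```

The UI could map each non-"ok" status to a specific message, for example suggesting OCR for a "no_text" result, instead of failing later inside question generation.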
+ ### 🎨 UI/Design (Badge: Off-Brand)
+ - [ ] Replace default Gradio theme with fully custom CSS using gr.Server
+ - [ ] Add PaperProf logo and branding
+ - [ ] Add progress indicator when model is generating
+ - [ ] Add session score tracker (X/Y correct answers)
+ - [ ] Add difficulty selector (Easy / Normal / Hard mode)
+ - [ ] Add language selector (French / English)
+ - [ ] Show source chunk used for the question (collapsible)
+ - [ ] Add end-of-session summary screen with score and weak areas
+ - [ ] Make the UI mobile-friendly
+
+ ### 🧠 ML/Model Improvements
+ - [ ] Add quiz modes: Quick (5 questions) / Full session / Brutal mode
+ - [ ] Implement adaptive difficulty: revisit failed concepts, skip mastered ones
+ - [ ] Add chunk relevance scoring to pick the most important chunks first
+ - [ ] Support multiple PDF uploads in one session
+ - [ ] Add support for plain text (.txt) and markdown (.md) files
+ - [ ] Cache parsed chunks in session state to avoid re-parsing on reload
+
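One possible shape for the adaptive difficulty item above (revisit failed concepts, skip mastered ones) is a small scheduler keyed by chunk index. This is only a sketch under assumed rules: the class name `AdaptiveScheduler` is hypothetical, and treating a chunk as mastered after two consecutive correct answers is an illustrative choice, not something the task list specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AdaptiveScheduler:
    """Pick the next chunk to quiz, favouring chunks answered wrongly.

    Hypothetical helper: a chunk counts as mastered after `mastery_streak`
    consecutive correct answers (an assumed rule).
    """
    num_chunks: int
    mastery_streak: int = 2
    streaks: Dict[int, int] = field(default_factory=dict)   # consecutive correct answers
    failures: Dict[int, int] = field(default_factory=dict)  # total wrong answers

    def record(self, chunk_id: int, correct: bool) -> None:
        """Store the outcome of one answered question."""
        if correct:
            self.streaks[chunk_id] = self.streaks.get(chunk_id, 0) + 1
        else:
            self.streaks[chunk_id] = 0
            self.failures[chunk_id] = self.failures.get(chunk_id, 0) + 1

    def next_chunk(self) -> Optional[int]:
        """Return the chunk to quiz next, or None when all are mastered."""
        # Skip chunks that have reached the mastery streak.
        candidates = [
            i for i in range(self.num_chunks)
            if self.streaks.get(i, 0) < self.mastery_streak
        ]
        if not candidates:
            return None
        # Most failures first; ties go to the lowest chunk index.
        return max(candidates, key=lambda i: (self.failures.get(i, 0), -i))
```

Because the state is two plain dictionaries, it could live in the same per-session state as the cached chunks mentioned in the list.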
+ ### 🏅 Bonus Quests
+ - [ ] 🔌 Off the Grid: verify zero external API calls (all local/ZeroGPU)
+ - [ ] 🎯 Well-Tuned: fine-tune MiniCPM on educational Q&A data using Modal credits ($250)
+   - [ ] Find/create educational Q&A dataset on HuggingFace
+   - [ ] Write fine-tuning script with LoRA/QLoRA
+   - [ ] Run fine-tuning on Modal GPU
+   - [ ] Publish fine-tuned model on HuggingFace under build-small-hackathon org
+   - [ ] Update model/llm.py to use fine-tuned model
+ - [ ] 🎨 Off-Brand: fully custom Gradio frontend (see UI section above)
+ - [ ] 🦙 Llama Champion: switch inference to llama.cpp runtime
+   - [ ] Download GGUF version of MiniCPM4-8B
+   - [ ] Replace transformers pipeline with llama-cpp-python
+   - [ ] Test performance vs transformers
+ - [ ] 📡 Sharing is Caring: export and share agent trace on HuggingFace Hub
+ - [ ] 📓 Field Notes: write blog post about what we built and learned
+   - [ ] Document architecture decisions
+   - [ ] Include benchmark results (speed, quality)
+   - [ ] Publish on HuggingFace blog or personal blog
+
+ ### 📦 Submission Checklist (deadline: June 15, 2026)
+ - [ ] App running and stable on HuggingFace Space
+ - [ ] Demo video (~2 minutes) showing full flow: upload → question → answer → feedback
+ - [ ] Social media post (LinkedIn + Twitter) with Space link and demo
+ - [ ] Blog post / Field Notes published
+ - [ ] Submission form filled on HuggingFace
+ - [ ] All badge requirements verified