# PaperProf: Task List
Build Small Hackathon, June 5–15, 2026

## ✅ Done
- Project structure created (app.py, core/, model/, ui/)
- core/parser.py: PDF text extraction with PyMuPDF
- core/chunker.py: paragraph-based text chunking with min/max word caps
- core/questioner.py: LLM-based question generation with professor prompt
- core/evaluator.py: LLM-based answer evaluation with tutor prompt
- model/llm.py: MiniCPM4-8B singleton wrapper via HuggingFace Transformers (bfloat16, chat template)
- app.py: full Gradio UI with PDF upload, question generation, answer evaluation
- requirements.txt: all dependencies listed
- README.md: project description and file structure
- .gitignore: venv, models, cache excluded
- ZeroGPU enabled on HuggingFace Space (free RTX Pro 6000 Blackwell)
- Space live at: https://huggingface.co/spaces/build-small-hackathon/PaperProf
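
The paragraph-based chunking with min/max word caps listed for core/chunker.py could look roughly like the sketch below. This is an illustration under assumptions, not the project's actual code: the function name `chunk_paragraphs` and the default caps are hypothetical.

```python
import re


def chunk_paragraphs(text, min_words=40, max_words=200):
    """Split text on blank lines, merge short paragraphs, split long ones.

    Hypothetical helper: the name and default caps are assumptions.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks = []
    buffer = []  # words carried over from paragraphs too short to stand alone
    for para in paragraphs:
        buffer.extend(para.split())
        # Emit full-size chunks while the buffer exceeds the max cap.
        while len(buffer) > max_words:
            chunks.append(" ".join(buffer[:max_words]))
            buffer = buffer[max_words:]
        # Emit the buffer once it reaches the min cap.
        if len(buffer) >= min_words:
            chunks.append(" ".join(buffer))
            buffer = []
    # Leftover words: merge into the last chunk if it fits, else keep them.
    if buffer:
        if chunks and len(chunks[-1].split()) + len(buffer) <= max_words:
            chunks[-1] += " " + " ".join(buffer)
        else:
            chunks.append(" ".join(buffer))
    return chunks
```

Merging short paragraphs forward keeps headings and one-line paragraphs attached to the text that follows them, which tends to give the question generator more context per chunk.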

## 🔲 To Do

### 🧪 Testing & Bug Fixes
- [ ] End-to-end test: upload real PDF → load → generate question → answer → get feedback
- [x] Handle edge cases: empty PDF, scanned PDF (no text), very short PDF
- [x] Handle model loading errors gracefully in the UI
- [ ] Test with multiple PDF types (slides, textbook chapters, lecture notes)
- [ ] Fix any ZeroGPU cold start issues (model takes time to load)
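
The edge cases in the list above (empty PDF, scanned PDF with no text, very short PDF) can be checked on the extracted text before any model call. A minimal sketch, assuming the parser returns one string per page; the function name `check_extracted_text` and the `min_words` threshold are hypothetical, not taken from the project.

```python
def check_extracted_text(page_texts, min_words=50):
    """Return (ok, message) for per-page text extracted from a PDF.

    Hypothetical helper: the name and threshold are assumptions.
    """
    if not page_texts:
        return False, "The PDF has no pages."
    words = sum(len(t.split()) for t in page_texts)
    if words == 0:
        # Pages exist but carry no text layer, typical of scanned documents.
        return False, "No extractable text found; the PDF may be scanned images."
    if words < min_words:
        return False, f"Only {words} words found; at least {min_words} are needed."
    return True, f"Extracted {words} words from {len(page_texts)} page(s)."
```

Returning a message alongside the flag lets the UI show the reason directly instead of raising an exception.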

### 🎨 UI/Design (Badge: Off-Brand)
- [ ] Replace default Gradio theme with fully custom CSS using gr.Server
- [ ] Add PaperProf logo and branding
- [ ] Add progress indicator when model is generating
- [x] Add session score tracker (X/Y correct answers)
- [ ] Add difficulty selector (Easy / Normal / Hard mode)
- [ ] Add language selector (French / English)
- [ ] Show source chunk used for the question (collapsible)
- [ ] Add end-of-session summary screen with score and weak areas
- [ ] Make the UI mobile-friendly

### 🧠 ML/Model Improvements
- [ ] Add quiz modes: Quick (5 questions) / Full session / Brutal mode
- [ ] Implement adaptive difficulty: revisit failed concepts, skip mastered ones
- [ ] Add chunk relevance scoring to pick the most important chunks first
- [ ] Support multiple PDF uploads in one session
- [ ] Add support for plain text (.txt) and markdown (.md) files
- [ ] Cache parsed chunks in session state to avoid re-parsing on reload
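
One possible shape for the adaptive difficulty item above (revisit failed concepts, skip mastered ones) is a small scheduler over chunk indices. This is only a sketch under assumptions: the class name `AdaptiveScheduler`, the `mastery_threshold` parameter, and the scoring rule are all hypothetical.

```python
from collections import defaultdict


class AdaptiveScheduler:
    """Pick the next chunk: most-failed chunks first, mastered chunks skipped.

    Hypothetical design: names and the scoring rule are assumptions.
    """

    def __init__(self, num_chunks, mastery_threshold=2):
        self.num_chunks = num_chunks
        self.mastery_threshold = mastery_threshold
        self.correct = defaultdict(int)  # chunk index -> correct answers
        self.failed = defaultdict(int)   # chunk index -> wrong answers

    def record(self, chunk_idx, is_correct):
        """Store the outcome of one answered question."""
        if is_correct:
            self.correct[chunk_idx] += 1
        else:
            self.failed[chunk_idx] += 1

    def next_chunk(self):
        """Return the index of the next chunk to quiz, or None if all are mastered."""
        candidates = [i for i in range(self.num_chunks)
                      if self.correct[i] < self.mastery_threshold]
        if not candidates:
            return None
        # Highest (failures - successes) first; ties go to the lowest index.
        return max(candidates,
                   key=lambda i: (self.failed[i] - self.correct[i], -i))
```

A usage pattern would be to call `record` after each evaluated answer and `next_chunk` before generating the next question.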

### 🏅 Bonus Quests
- [ ] 🔌 Off the Grid: verify zero external API calls (all local/ZeroGPU)
- [ ] 🎯 Well-Tuned: fine-tune MiniCPM on educational Q&A data using Modal credits ($250)
  - [ ] Find/create educational Q&A dataset on HuggingFace
  - [ ] Write fine-tuning script with LoRA/QLoRA
  - [ ] Run fine-tuning on Modal GPU
  - [ ] Publish fine-tuned model on HuggingFace under build-small-hackathon org
  - [ ] Update model/llm.py to use fine-tuned model
- [ ] 🎨 Off-Brand: fully custom Gradio frontend (see UI section above)
- [ ] 🦙 Llama Champion: switch inference to llama.cpp runtime
  - [ ] Download GGUF version of MiniCPM4-8B
  - [ ] Replace transformers pipeline with llama-cpp-python
  - [ ] Test performance vs transformers
- [ ] 📡 Sharing is Caring: export and share agent trace on HuggingFace Hub
- [x] 📓 Field Notes: write blog post about what we built and learned (BLOG.md)
  - [ ] Document architecture decisions
  - [ ] Include benchmark results (speed, quality)
  - [ ] Publish on HuggingFace blog or personal blog

### 📦 Submission Checklist (deadline: June 15, 2026)
- [ ] App running and stable on HuggingFace Space
- [ ] Demo video (~2 minutes) showing full flow: upload → question → answer → feedback
- [ ] Social media post (LinkedIn + Twitter) with Space link and demo
- [x] Blog post / Field Notes published (BLOG.md)
- [ ] Submission form filled on HuggingFace
- [ ] All badge requirements verified