---
title: MentorFlow
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.0
app_file: app.py
pinned: false
license: mit
hardware: gpu-t4
---
# MentorFlow - Teacher-Student RL System
A meta-curriculum reinforcement learning system where an AI Teacher Agent learns to select optimal educational tasks to train an AI Student Agent.
## Features
- **Three Training Strategies**: Compare Random, Progressive, and Teacher-guided curriculum
- **LM Student (DistilBERT)**: Real neural network learning with memory decay
- **GPU Support**: Fast training with CUDA acceleration
- **Interactive Comparison**: Visualize learning curves and performance metrics
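
The two baseline strategies can be illustrated with a minimal sketch. It assumes a task is a `(topic, difficulty)` pair drawn from a 15 × 7 grid and that the Progressive strategy raises difficulty linearly over the run; the function names and the linear schedule are hypothetical and may differ from what `compare_strategies.py` actually does.

```python
import random

NUM_TOPICS = 15        # grid size taken from the Technical Details section
NUM_DIFFICULTIES = 7


def random_task(rng):
    """Random baseline: sample uniformly over every (topic, difficulty) pair."""
    return rng.randrange(NUM_TOPICS), rng.randrange(NUM_DIFFICULTIES)


def progressive_task(step, total_steps, rng):
    """Progressive baseline (assumed schedule): difficulty grows linearly with progress."""
    difficulty = min(NUM_DIFFICULTIES - 1, step * NUM_DIFFICULTIES // total_steps)
    return rng.randrange(NUM_TOPICS), difficulty


# A fixed seed makes the sampled curriculum repeatable across runs.
rng = random.Random(42)
progressive_schedule = [progressive_task(step, 50, rng) for step in range(50)]
random_schedule = [random_task(random.Random(42)) for _ in range(3)]
```

The Teacher-guided strategy replaces these fixed rules with a learned selection policy, so it is not covered by this sketch.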
## Usage
1. **Set Parameters**:
   - Iterations: Number of training iterations (50-500)
   - Seed: Random seed for reproducibility
   - Device: Choose GPU (cuda) or CPU
2. **Run Comparison**:
   - Click "Run Comparison" to start training
   - Monitor progress in the output text
   - View generated comparison plots
3. **Analyze Results**:
   - Learning curves show how each strategy improves
   - Difficult question performance shows final accuracy
   - Curriculum diversity shows topic coverage
## ⚡ Performance
- **With GPU**: ~5-10 minutes for 500 iterations
- **With CPU**: ~15-30 minutes for 500 iterations
## Project Structure
```
MentorFlow/
├── app.py                      # Gradio web interface
├── teacher_agent_dev/          # Teacher agent system
│   ├── compare_strategies.py   # Main comparison script
│   ├── teacher_agent.py        # UCB bandit teacher
│   └── ...
├── student_agent_dev/          # LM Student system
│   ├── student_agent.py        # DistilBERT student
│   └── ...
└── requirements_hf.txt         # Dependencies
```
## Technical Details
- **Teacher Agent**: UCB (Upper Confidence Bound) multi-armed bandit
- **Student Agent**: DistilBERT with online learning
- **Memory Decay**: Ebbinghaus forgetting curve
- **Task Generator**: Procedural generation with 15 topics × 7 difficulties
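
As a rough illustration of the teacher and memory-decay components, the sketch below implements the textbook UCB1 selection rule and the commonly cited exponential form of the Ebbinghaus curve, R = exp(-t / S). The function names, the `exploration` constant, and the `strength` parameter are assumptions made for this example; the code in `teacher_agent.py` and `student_agent.py` may use different formulas or parameters.

```python
import math


def ucb1_select(counts, mean_rewards, exploration=2.0):
    """Pick the arm with the highest UCB1 score; untried arms are chosen first."""
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    total_pulls = sum(counts)
    scores = [
        mean + math.sqrt(exploration * math.log(total_pulls) / pulls)
        for mean, pulls in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def update_arm(counts, mean_rewards, arm, reward):
    """Record one observed reward using an incremental running mean."""
    counts[arm] += 1
    mean_rewards[arm] += (reward - mean_rewards[arm]) / counts[arm]


def retention(elapsed, strength):
    """Ebbinghaus-style retention: fraction remembered after `elapsed` time units."""
    return math.exp(-elapsed / strength)
```

In this framing each arm would correspond to one task type the teacher can assign, and the reward would be some measure of student improvement. Rarely tried arms get a large exploration bonus, which is why UCB tends to keep topic coverage broad while still favouring tasks that have paid off.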
## More Information
See the main repository for detailed documentation and development guides.