namish10 committed on
Commit
f216389
·
verified ·
1 Parent(s): f7c17fd

Upload README.md with huggingface_hub

  1. README.md +106 -49
README.md CHANGED
@@ -4,14 +4,22 @@ tags:
4
  - reinforcement-learning
5
  - education
6
  - doubt-prediction
 
 
 
 
7
  - q-learning
 
 
 
 
 
 
8
  ---
9
 
10
- # ContextFlow RL Doubt Predictor
11
 
12
- A reinforcement learning model that predicts when learners will get confused **before** it happens, using hand gesture recognition and privacy-first face blurring.
13
-
14
- ## Model Details
15
 
16
  | Property | Value |
17
  |----------|-------|
@@ -20,62 +28,103 @@ A reinforcement learning model that predicts when learners will get confused **b
20
  | **Action Dimension** | 10 doubt predictions |
21
  | **Policy Version** | 50 |
22
  | **Training Samples** | 200 |
23
- | **Framework** | PyTorch |
 
 
 
 
 
24
 
25
  ## Architecture
26
 
27
  ```
28
- Q-Network: 64 → 128 → 128 → 10
29
- ├── State Encoder (64 features)
30
- ├── Hidden Layer 1 (128 units, ReLU)
31
- ├── Hidden Layer 2 (128 units, ReLU)
32
- └── Output Layer (10 actions)
33
  ```
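The layer sizes in the removed architecture block (64 → 128 → 128 → 10 with ReLU on the hidden layers) can be illustrated with a dependency-free forward pass. This is only a sketch: the weights are random placeholders rather than the values stored in `checkpoint.pkl`, and `init_layers` and `forward` are hypothetical names, not functions from `train_rl.py`.

```python
import random

LAYER_SIZES = [64, 128, 128, 10]  # state -> hidden 1 -> hidden 2 -> actions


def init_layers(sizes, seed=0):
    """Create random (weights, biases) per layer as stand-ins for trained weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (2.0 / n_in) ** 0.5  # He-style scale, an assumption
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, state):
    """Return one Q-value per action; ReLU follows every layer except the last."""
    x = list(state)
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


layers = init_layers(LAYER_SIZES)
q_values = forward(layers, [0.1] * 64)
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

In practice the same shapes would be expressed with a PyTorch module, since the earlier model card listed PyTorch as the framework.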
34
 
35
- ## Features (64-dimensional state vector)
36
-
37
- The state vector encodes:
38
- 1. **Topic Embedding** (32 dims) - TF-IDF representation of learning topic
39
- 2. **Progress** (1 dim) - Session progress percentage
40
- 3. **Confusion Signals** (16 dims) - Behavioral indicators:
41
- - Mouse hesitation patterns
42
- - Scroll reversals
43
- - Time on page
44
- - Eye tracking (if available)
45
- 4. **Gesture Signals** (14 dims) - Hand gesture frequencies
46
- 5. **Time Spent** (1 dim) - Total session time
47
-
48
- ## Reward Function
49
-
50
- The model optimizes for:
51
- - **Correct doubt prediction**: +1.0
52
- - **Helpful explanation provided**: +0.5
53
- - **User engagement maintained**: +0.3
54
- - **False positive**: -0.5
55
- - **Missed confusion**: -1.0
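The reward values listed above can be written as a small lookup. The sketch below assumes that rewards for several events in one step are simply summed, which the README does not state; the event names and `step_reward` are hypothetical.

```python
# Reward per event, taken from the list above. Event keys are illustrative names.
REWARDS = {
    "correct_prediction": 1.0,
    "helpful_explanation": 0.5,
    "engagement_maintained": 0.3,
    "false_positive": -0.5,
    "missed_confusion": -1.0,
}


def step_reward(events):
    """Sum the rewards of all events observed in one step (additivity is assumed)."""
    return sum(REWARDS[event] for event in events)


good_step = step_reward(["correct_prediction", "helpful_explanation"])
bad_step = step_reward(["false_positive"])
```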
56
-
57
- ## Usage
58
 
59
  ```python
60
- import pickle
61
- import numpy as np
62
  from huggingface_hub import hf_hub_download
 
63
 
64
- # Load model
65
- path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')
 
 
66
  with open(path, 'rb') as f:
67
      checkpoint = pickle.load(f)
68
 
69
- # Extract Q-network
70
- q_weights = checkpoint.q_network_weights
71
-
72
- # Create state vector (64 features)
73
- state = np.random.randn(64)
74
-
75
- # Predict doubt actions
76
- # (Requires instantiating QNetwork class from train_rl.py)
77
  ```
78
 
79
  ## Citation
80
 
81
  ```bibtex
@@ -83,12 +132,20 @@ state = np.random.randn(64)
83
  title={ContextFlow RL Doubt Predictor},
84
  author={ContextFlow Team},
85
  year={2026},
86
- url={https://github.com/contextflow}
87
  }
88
  ```
89
 
90
  ## Limitations
91
 
92
- - Trained on 200 synthetic samples (limited real-world data)
93
- - Hand gesture recognition requires MediaPipe
94
- - Privacy-first: face auto-blurred during gesture capture
 
 
 
 
 
 
 
 
 
4
  - reinforcement-learning
5
  - education
6
  - doubt-prediction
7
+ - adaptive-learning
8
+ - multi-agent-systems
9
+ - gesture-recognition
10
+ - computer-vision
11
  - q-learning
12
+ - grpo
13
+ - edtech
14
+ - mediapipe
15
+ - privacy
16
+ datasets:
17
+ - synthetic-learning-interactions
18
  ---
19
 
20
+ # ContextFlow: Predictive Doubt Detection in Adaptive Learning Systems
21
 
22
+ **A Research Implementation of RL-Powered Educational Technology**
 
 
23
 
24
  | Property | Value |
25
  |----------|-------|
 
28
  | **Action Dimension** | 10 doubt predictions |
29
  | **Policy Version** | 50 |
30
  | **Training Samples** | 200 |
31
+ | **Final Loss** | 0.2465 |
32
+ | **Avg Reward** | 0.75 |
33
+
34
+ ## Overview
35
+
36
+ ContextFlow predicts student confusion **before** it occurs using reinforcement learning and behavioral signal analysis. When a learner's actions suggest they might be struggling (mouse hesitation, scroll reversals, help-seeking gestures), the system proactively offers assistance.
37
 
38
  ## Architecture
39
 
40
  ```
41
+ ┌─────────────────────────────────────────────────┐
42
+ │              9 Specialized Agents               │
43
+ ├─────────────────────────────────────────────────┤
44
+ │ • StudyOrchestrator   • DoubtPredictorAgent     │
45
+ │ • BehavioralAgent     • HandGestureAgent        │
46
+ │ • RecallAgent         • KnowledgeGraphAgent     │
47
+ │ • PeerLearningAgent   • LLMOrchestrator         │
48
+ │ • GestureActionMapper • PromptAgent             │
49
+ └─────────────────────────────────────────────────┘
50
  ```
51
 
52
+ ## Quick Start
53
 
54
  ```python
55
+ # Load the model
 
56
  from huggingface_hub import hf_hub_download
57
+ import pickle
58
 
59
+ path = hf_hub_download(
60
+     repo_id='namish10/contextflow-rl',
61
+     filename='checkpoint.pkl'
62
+ )
63
  with open(path, 'rb') as f:
64
      checkpoint = pickle.load(f)
65
 
66
+ print(f"Policy version: {checkpoint.policy_version}")
67
+ print(f"Training samples: {checkpoint.training_stats['total_samples']}")
 
 
 
 
 
 
68
  ```
69
 
70
+ ## State Vector (64 dimensions)
71
+
72
+ | Component | Dims | Description |
73
+ |-----------|------|-------------|
74
+ | Topic Embedding | 32 | TF-IDF of learning topic |
75
+ | Progress | 1 | Session progress (0.0-1.0) |
76
+ | Confusion Signals | 16 | Behavioral indicators |
77
+ | Gesture Signals | 14 | Hand gesture frequencies |
78
+ | Time Spent | 1 | Normalized session time |
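The table above fixes the size of each component, so a state vector can be assembled by concatenation with a length check per component. This sketch assumes the components are concatenated in the order the table lists them; the actual ordering in `feature_extractor.py` is not shown here, and `STATE_LAYOUT` and `build_state` are hypothetical names.

```python
# (component name, number of dimensions), in the order given by the table.
STATE_LAYOUT = [
    ("topic_embedding", 32),
    ("progress", 1),
    ("confusion_signals", 16),
    ("gesture_signals", 14),
    ("time_spent", 1),
]


def build_state(components):
    """Concatenate the components into one 64-dimensional list, validating sizes."""
    state = []
    for name, size in STATE_LAYOUT:
        values = list(components[name])
        if len(values) != size:
            raise ValueError(f"{name} needs {size} values, got {len(values)}")
        state.extend(values)
    return state


state = build_state({
    "topic_embedding": [0.0] * 32,
    "progress": [0.4],            # 40% through the session
    "confusion_signals": [0.0] * 16,
    "gesture_signals": [0.0] * 14,
    "time_spent": [0.9],          # normalized session time
})
```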
79
+
80
+ ## Actions (10 doubt predictions)
81
+
82
+ 1. `what_is_backpropagation`
83
+ 2. `why_gradient_descent`
84
+ 3. `how_overfitting_works`
85
+ 4. `explain_regularization`
86
+ 5. `what_loss_function`
87
+ 6. `how_optimization_works`
88
+ 7. `explain_learning_rate`
89
+ 8. `what_regularization`
90
+ 9. `how_batch_norm_works`
91
+ 10. `explain_softmax`
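Given ten Q-values, the predicted doubt is the action with the highest value. The sketch below assumes that output index `i` of the Q-network corresponds to item `i + 1` of the list above, which the README does not confirm; `predict_doubt` and `top_k_doubts` are hypothetical helpers.

```python
ACTIONS = [
    "what_is_backpropagation",
    "why_gradient_descent",
    "how_overfitting_works",
    "explain_regularization",
    "what_loss_function",
    "how_optimization_works",
    "explain_learning_rate",
    "what_regularization",
    "how_batch_norm_works",
    "explain_softmax",
]


def predict_doubt(q_values):
    """Return the action name with the highest Q-value (greedy choice)."""
    if len(q_values) != len(ACTIONS):
        raise ValueError(f"expected {len(ACTIONS)} Q-values, got {len(q_values)}")
    best = max(range(len(q_values)), key=q_values.__getitem__)
    return ACTIONS[best]


def top_k_doubts(q_values, k=3):
    """Return the k highest-valued action names, best first."""
    ranked = sorted(range(len(q_values)), key=q_values.__getitem__, reverse=True)
    return [ACTIONS[i] for i in ranked[:k]]


example_q = [0.1, 0.7, 0.2, 0.0, 0.3, 0.05, 0.6, 0.0, 0.1, 0.9]
predicted = predict_doubt(example_q)
```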
92
+
93
+ ## Training Results
94
+
95
+ | Epoch | Loss | Epsilon | Avg Reward |
96
+ |-------|------|---------|------------|
97
+ | 1 | 1.2456 | 1.000 | 0.20 |
98
+ | 2 | 0.8923 | 0.995 | 0.35 |
99
+ | 3 | 0.6541 | 0.990 | 0.48 |
100
+ | 4 | 0.4127 | 0.985 | 0.62 |
101
+ | 5 | 0.2465 | 0.980 | 0.75 |
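The Epsilon column is consistent with an epsilon-greedy policy whose exploration rate shrinks slightly each epoch. A multiplicative decay of 0.995 reproduces the five listed values after rounding to three decimals, although a linear step of 0.005 would match equally well, so the decay rule below is an assumption. The `floor` parameter and both function names are also hypothetical.

```python
import random


def epsilon_schedule(start=1.0, decay=0.995, floor=0.01, epochs=5):
    """Return the epsilon used in each epoch under multiplicative decay."""
    eps, schedule = start, []
    for _ in range(epochs):
        schedule.append(eps)
        eps = max(floor, eps * decay)
    return schedule


def select_action(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


schedule = epsilon_schedule()
rounded = [round(eps, 3) for eps in schedule]
```

With epsilon still at 0.98 after five epochs, nearly every action during training would be chosen at random under this schedule.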
102
+
103
+ ## Key Features
104
+
105
+ - **Predictive Detection**: RL-based confusion prediction before it happens
106
+ - **Multi-Agent Orchestration**: 9 specialized agents working together
107
+ - **Gesture Recognition**: Privacy-first hand gesture detection with MediaPipe
108
+ - **Face Blurring**: Real-time face blur for classroom deployment
109
+ - **Browser AI Launch**: Direct AI chat interface from predicted doubts
110
+ - **Spaced Repetition**: SM-2 based review scheduling
111
+ - **Knowledge Graphs**: Concept mapping and learning paths
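For the SM-2 based review scheduling mentioned above, a common formulation of the SM-2 update is sketched below. It is not taken from the ContextFlow code: the `Card` fields and the `review` function are hypothetical, and the project may use a different variant.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0     # consecutive successful reviews
    interval_days: int = 0   # days until the next review
    ease: float = 2.5        # SM-2 easiness factor


def review(card, quality):
    """Return the updated card after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality >= 3:
        if card.repetitions == 0:
            interval = 1
        elif card.repetitions == 1:
            interval = 6
        else:
            interval = round(card.interval_days * card.ease)
        repetitions = card.repetitions + 1
    else:
        # A failed recall restarts the repetition sequence.
        repetitions, interval = 0, 1
    ease = card.ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return Card(repetitions, interval, max(1.3, ease))


first = review(Card(), 5)
second = review(first, 5)
lapsed = review(second, 2)
```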
112
+
113
+ ## Files
114
+
115
+ | File | Description |
116
+ |------|-------------|
117
+ | `checkpoint.pkl` | Trained Q-network weights |
118
+ | `train_rl.py` | Training script with GRPO |
119
+ | `feature_extractor.py` | 64-dim state extraction |
120
+ | `inference_example.py` | Usage examples |
121
+ | `demo.ipynb` | Interactive notebook |
122
+ | `RESEARCH_PAPER.md` | Full research paper |
123
+ | `evaluation_results.json` | Training metrics |
124
+ | `requirements.txt` | Dependencies |
125
+ | `app/` | Backend agents (Flask API) |
126
+ | `frontend/` | React frontend |
127
+
128
  ## Citation
129
 
130
  ```bibtex
 
132
  title={ContextFlow RL Doubt Predictor},
133
  author={ContextFlow Team},
134
  year={2026},
135
+ url={https://huggingface.co/namish10/contextflow-rl}
136
  }
137
  ```
138
 
139
  ## Limitations
140
 
141
+ - Trained on only 200 synthetic samples; real learner data is still needed
142
+ - Gesture recognition depends on MediaPipe
143
+ - Faces are auto-blurred during gesture capture for privacy compliance
144
+
145
+ ## Future Work
146
+
147
+ 1. Real learning session data collection
148
+ 2. Fine-tuning on actual student behaviors
149
+ 3. Online learning for continuous improvement
150
+ 4. Multi-modal confusion detection (audio, biometrics)
151
+ 5. Federated learning for privacy-preserving updates