@@ -4,14 +4,22 @@ tags:
   - reinforcement-learning
   - education
   - doubt-prediction
   - q-learning
 ---
-# ContextFlow RL Doubt Predictor
-A reinforcement learning model that predicts when learners will get confused **before** it happens, using hand gesture recognition and privacy-first face blurring.
-## Model Details
 | Property | Value |
 |----------|-------|
@@ -20,62 +28,103 @@ A reinforcement learning model that predicts when learners will get confused **b
 | **Action Dimension** | 10 doubt predictions |
 | **Policy Version** | 50 |
 | **Training Samples** | 200 |
-| **Framework** | PyTorch |
 ## Architecture
 ```
-Q-Network: 64 → 128 → 128 → 10
-├── State Encoder (64 features)
-├── Hidden Layer 1 (128 units, ReLU)
-├── Hidden Layer 2 (128 units, ReLU)
-└── Output Layer (10 actions)
 ```
-## Features (64-dimensional state vector)
-The state vector encodes:
-1. **Topic Embedding** (32 dims) - TF-IDF representation of learning topic
-2. **Progress** (1 dim) - Session progress percentage
-3. **Confusion Signals** (16 dims) - Behavioral indicators:
-   - Mouse hesitation patterns
-   - Scroll reversals
-   - Time on page
-   - Eye tracking (if available)
-4. **Gesture Signals** (14 dims) - Hand gesture frequencies
-5. **Time Spent** (1 dim) - Total session time
-## Reward Function
-The model optimizes for:
-- **Correct doubt prediction**: +1.0
-- **Helpful explanation provided**: +0.5
-- **User engagement maintained**: +0.3
-- **False positive**: -0.5
-- **Missed confusion**: -1.0
-## Usage
 ```python
-import pickle
-import numpy as np
 from huggingface_hub import hf_hub_download
-# Load model
-path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')
 with open(path, 'rb') as f:
     checkpoint = pickle.load(f)
-# Extract Q-network
-q_weights = checkpoint.q_network_weights
-# Create state vector (64 features)
-state = np.random.randn(64)
-# Predict doubt actions
-# (Requires instantiating QNetwork class from train_rl.py)
 ```
 ## Citation
 ```bibtex
@@ -83,12 +132,20 @@ state = np.random.randn(64)
   title={ContextFlow RL Doubt Predictor},
   author={ContextFlow Team},
   year={2026},
-  url={https://github.com/contextflow}
 }
 ```
 ## Limitations
-- Trained on 200 synthetic samples (limited real-world data)
-- Hand gesture recognition requires MediaPipe
-- Privacy-first: face auto-blurred during gesture capture

   - reinforcement-learning
   - education
   - doubt-prediction
+  - adaptive-learning
+  - multi-agent-systems
+  - gesture-recognition
+  - computer-vision
   - q-learning
+  - grpo
+  - edtech
+  - mediapipe
+  - privacy
+datasets:
+  - synthetic-learning-interactions
 ---
+# ContextFlow: Predictive Doubt Detection in Adaptive Learning Systems
+**A Research Implementation of RL-Powered Educational Technology**
 | Property | Value |
 |----------|-------|
 | **Action Dimension** | 10 doubt predictions |
 | **Policy Version** | 50 |
 | **Training Samples** | 200 |
+| **Final Loss** | 0.2465 |
+| **Avg Reward** | 0.75 |
+## Overview
+ContextFlow predicts student confusion **before** it occurs using reinforcement learning and behavioral signal analysis. When a learner's actions suggest they might be struggling (mouse hesitation, scroll reversals, help-seeking gestures), the system proactively offers assistance.
 ## Architecture
 ```
+┌─────────────────────────────────────────────────┐
+│              9 Specialized Agents               │
+├─────────────────────────────────────────────────┤
+│ • StudyOrchestrator   • DoubtPredictorAgent     │
+│ • BehavioralAgent     • HandGestureAgent        │
+│ • RecallAgent         • KnowledgeGraphAgent     │
+│ • PeerLearningAgent   • LLMOrchestrator         │
+│ • GestureActionMapper • PromptAgent             │
+└─────────────────────────────────────────────────┘
 ```
+## Quick Start
 ```python
+# Download and unpickle the checkpoint.
+# pickle can execute arbitrary code on load, so only open files you trust.
 from huggingface_hub import hf_hub_download
+import pickle
+path = hf_hub_download(
+    repo_id='namish10/contextflow-rl',
+    filename='checkpoint.pkl'
+)
 with open(path, 'rb') as f:
     checkpoint = pickle.load(f)
+print(f"Policy version: {checkpoint.policy_version}")
+print(f"Training samples: {checkpoint.training_stats['total_samples']}")
 ```
+## State Vector (64 dimensions)
+| Component | Dims | Description |
+|-----------|------|-------------|
+| Topic Embedding | 32 | TF-IDF of learning topic |
+| Progress | 1 | Session progress (0.0-1.0) |
+| Confusion Signals | 16 | Behavioral indicators |
+| Gesture Signals | 14 | Hand gesture frequencies |
+| Time Spent | 1 | Normalized session time |
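The table above can be read as a fixed layout for the 64-dimensional state. The following is a minimal sketch of assembling such a vector, assuming the components are concatenated in table order; the names `STATE_LAYOUT` and `build_state` are hypothetical and are not taken from `feature_extractor.py`.

```python
# Component sizes come from the state-vector table.
# The concatenation order is an assumption (table order).
STATE_LAYOUT = [
    ("topic_embedding", 32),
    ("progress", 1),
    ("confusion_signals", 16),
    ("gesture_signals", 14),
    ("time_spent", 1),
]
STATE_DIM = sum(size for _, size in STATE_LAYOUT)


def build_state(components):
    """Concatenate named components into one flat list of floats.

    `components` maps each name in STATE_LAYOUT to a sequence of the
    matching length. Hypothetical helper, not part of the repository.
    """
    state = []
    for name, size in STATE_LAYOUT:
        values = [float(v) for v in components[name]]
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        state.extend(values)
    return state


# Placeholder inputs only; real values would come from feature extraction.
example = {
    "topic_embedding": [0.0] * 32,
    "progress": [0.4],
    "confusion_signals": [0.0] * 16,
    "gesture_signals": [0.0] * 14,
    "time_spent": [0.25],
}
state = build_state(example)
print(len(state))
```

Validating each component's length up front makes a mismatched feature extractor fail loudly rather than silently shifting every later feature by one position.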
+## Actions (10 doubt predictions)
+1. `what_is_backpropagation`
+2. `why_gradient_descent`
+3. `how_overfitting_works`
+4. `explain_regularization`
+5. `what_loss_function`
+6. `how_optimization_works`
+7. `explain_learning_rate`
+8. `what_regularization`
+9. `how_batch_norm_works`
+10. `explain_softmax`
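As an illustration of how a 10-way Q-value output might be turned into one of the labels above, here is a minimal greedy-selection sketch. It assumes that output index `i` corresponds to item `i + 1` in the list, which the model card does not state; `predict_doubt` and the example scores are hypothetical.

```python
# Action labels copied from the list above, in the same order.
ACTIONS = [
    "what_is_backpropagation",
    "why_gradient_descent",
    "how_overfitting_works",
    "explain_regularization",
    "what_loss_function",
    "how_optimization_works",
    "explain_learning_rate",
    "what_regularization",
    "how_batch_norm_works",
    "explain_softmax",
]


def predict_doubt(q_values):
    """Return (action_name, q_value) for the highest-scoring action.

    Assumes q_values[i] scores ACTIONS[i]; this mapping is an assumption.
    """
    if len(q_values) != len(ACTIONS):
        raise ValueError(f"expected {len(ACTIONS)} Q-values, got {len(q_values)}")
    best = max(range(len(ACTIONS)), key=lambda i: q_values[i])
    return ACTIONS[best], q_values[best]


# Made-up scores for illustration; real ones would come from the Q-network.
q_values = [0.1, 0.3, 0.05, 0.2, 0.0, 0.15, 0.9, 0.1, 0.4, 0.25]
name, score = predict_doubt(q_values)
print(name, score)
```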
+## Training Results
+| Epoch | Loss | Epsilon | Avg Reward |
+|-------|------|---------|------------|
+| 1 | 1.2456 | 1.000 | 0.20 |
+| 2 | 0.8923 | 0.995 | 0.35 |
+| 3 | 0.6541 | 0.990 | 0.48 |
+| 4 | 0.4127 | 0.985 | 0.62 |
+| 5 | 0.2465 | 0.980 | 0.75 |
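The Epsilon column is consistent, to three decimals, with a multiplicative decay of 0.995 per epoch starting from 1.0 (a linear step of 0.005 would also fit these five rows). The sketch below shows that schedule together with epsilon-greedy action selection; the decay rule and both function names are assumptions for illustration, not the contents of `train_rl.py`.

```python
import random


def epsilon_schedule(epochs, start=1.0, decay=0.995):
    """Exploration rate per epoch under multiplicative decay (assumed rule)."""
    return [start * decay ** k for k in range(epochs)]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


schedule = epsilon_schedule(5)
formatted = [f"{eps:.3f}" for eps in schedule]
print(formatted)  # ['1.000', '0.995', '0.990', '0.985', '0.980']
```

With epsilon still at 0.980 after five epochs, almost every action taken during training would have been random under this reading, which is worth keeping in mind when interpreting the reported average reward.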
+## Key Features
+- **Predictive Detection**: RL-based prediction of learner confusion before it occurs
+- **Multi-Agent Orchestration**: 9 specialized agents working together
+- **Gesture Recognition**: Privacy-first hand gesture detection with MediaPipe
+- **Face Blurring**: Real-time face blur for classroom deployment
+- **Browser AI Launch**: Direct AI chat interface from predicted doubts
+- **Spaced Repetition**: SM-2 based review scheduling
+- **Knowledge Graphs**: Concept mapping and learning paths
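For readers unfamiliar with SM-2, the spaced-repetition bullet above refers to a scheduling rule that can be sketched in a few lines. This is one common reading of the published algorithm, not the repository's implementation: the function name `sm2_update` is hypothetical, and variants differ on whether the ease factor changes after a failed recall (here it does not).

```python
def sm2_update(quality, repetitions, interval_days, ease):
    """One SM-2 review step.

    quality: recall grade from 0 (blackout) to 5 (perfect).
    Returns the new (repetitions, interval_days, ease).
    """
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the sequence, leave the ease factor as is.
        return 0, 1, ease
    if repetitions == 0:
        interval_days = 1
    elif repetitions == 1:
        interval_days = 6
    else:
        # Later intervals grow by the ease factor held before this review.
        interval_days = round(interval_days * ease)
    # Standard SM-2 ease adjustment, floored at 1.3.
    penalty = (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease = max(1.3, ease + 0.1 - penalty)
    return repetitions + 1, interval_days, ease


# Three successful reviews starting from the default ease of 2.5.
state = (0, 0, 2.5)
for grade in (5, 4, 5):
    state = sm2_update(grade, *state)
    print(state)
```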
+## Files
+| File | Description |
+|------|-------------|
+| `checkpoint.pkl` | Trained Q-network weights |
+| `train_rl.py` | Training script with GRPO |
+| `feature_extractor.py` | 64-dim state extraction |
+| `inference_example.py` | Usage examples |
+| `demo.ipynb` | Interactive notebook |
+| `RESEARCH_PAPER.md` | Full research paper |
+| `evaluation_results.json` | Training metrics |
+| `requirements.txt` | Dependencies |
+| `app/` | Backend agents (Flask API) |
+| `frontend/` | React frontend |
 ## Citation
 ```bibtex
   title={ContextFlow RL Doubt Predictor},
   author={ContextFlow Team},
   year={2026},
+  url={https://huggingface.co/namish10/contextflow-rl}
 }
 ```
 ## Limitations
+- Trained on only 200 synthetic samples; real learner data is still needed
+- Gesture recognition requires MediaPipe
+- Faces are auto-blurred during gesture capture for privacy
+## Future Work
+1. Real learning session data collection
+2. Fine-tuning on actual student behaviors
+3. Online learning for continuous improvement
+4. Multi-modal confusion detection (audio, biometrics)
+5. Federated learning for privacy-preserving updates