# ContextFlow Architecture: Complete System Overview

## Table of Contents
1. [System Vision](#1-system-vision)
2. [High-Level Architecture](#2-high-level-architecture)
3. [Frontend Layer](#3-frontend-layer)
4. [Backend Layer](#4-backend-layer)
5. [Agent Network](#5-agent-network)
6. [Reinforcement Learning Pipeline](#6-reinforcement-learning-pipeline)
7. [Data Flow](#7-data-flow)
8. [API Design](#8-api-design)
9. [Multi-Modal Detection](#9-multi-modal-detection)
10. [Privacy & Security](#10-privacy--security)
11. [Deployment Architecture](#11-deployment-architecture)

---

## 1. System Vision

**ContextFlow** is an AI-powered learning intelligence engine that predicts learner confusion before it occurs, enabling proactive intervention in educational settings.

### Core Problem Solved
- Traditional learning systems are **reactive** - they respond after confusion occurs
- ContextFlow is **proactive** - it predicts confusion and intervenes before disengagement

### Key Innovations
1. **Predictive AI** - RL-based doubt prediction
2. **Gesture Control** - Hands-free learning assistance
3. **Multi-Agent Orchestration** - 9 specialized agents working in concert
4. **Privacy-First** - Face blur for classroom deployment

---

## 2. High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────┐
│                          USERS                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐               │
│  │  Students   │  │  Teachers   │  │  Researchers │               │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘               │
└─────────┼─────────────────┼─────────────────┼─────────────────────────┘
          │                 │                 │
          ▼                 ▼                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                     PRESENTATION LAYER                                │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    React Frontend (Vite)                       │   │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐       │   │
│  │  │  Learn  │ │ LLMFlow │ │Gestures │ │ Predict │ ...     │   │
│  │  │   Tab    │ │  Tab    │ │   Tab   │ │   Tab   │         │   │
│  │  └─────────┘ └─────────┘ └─────────┘ └─────────┘         │   │
│  │                                                              │   │
│  │  ┌─────────────────────────────────────────────────────┐    │   │
│  │  │         MediaPipe Camera Feed (Gesture + Face)       │    │   │
│  │  │    ┌──────────┐              ┌──────────┐          │    │   │
│  │  │    │ Hand     │              │ Face     │          │    │   │
│  │  │    │ Detection │              │ Blur     │          │    │   │
│  │  │    └──────────┘              └──────────┘          │    │   │
│  │  └─────────────────────────────────────────────────────┘    │   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
                                    │
                                    │ REST API (JSON)
                                    │ WebSocket (Optional)
                                    ▼
┌─────────────────────────────────────────────────────────────────────┐
│                       BACKEND LAYER (Flask)                          │
│                                                                      │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │                    API Gateway (Flask Blueprints)               │   │
│  │   /api/session/*  /api/predict/*  /api/gesture/*  /api/*     │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                    │                                  │
│                                    ▼                                  │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │              STUDY ORCHESTRATOR (Central Coordinator)           │   │
│  │   ┌────────────────────────────────────────────────────┐      │   │
│  │   │                   Agent Registry                    │      │   │
│  │   │  DoubtPredictor │ Behavioral │ Gesture │ Recall  │      │   │
│  │   │  KnowledgeGraph │ PeerLearn │ LLMOrch │ Prompt │      │   │
│  │   └────────────────────────────────────────────────────┘      │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                    │                                  │
│    ┌───────────────┬─────────────┼─────────────┬───────────────┐  │
│    ▼               ▼             ▼             ▼               ▼      │
│  ┌─────┐       ┌─────┐      ┌─────┐      ┌─────┐      ┌─────┐   │
│  │ Q-  │       │Behavioral│   │Gesture│    │Recall│     │LLM  │   │
│  │Network│     │Agent   │    │Agent │     │Agent │     │Orch │   │
│  └─────┘       └─────┘      └─────┘      └─────┘      └─────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         DATA LAYER                                    │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐ │
│  │  Checkpoint │  │  Session   │  │  Knowledge │  │   Real     │ │
│  │  (RL Model) │  │  State     │  │  Graph     │  │   Data     │ │
│  │   .pkl      │  │  JSON      │  │  NetworkX  │  │  Collection│ │
│  └────────────┘  └────────────┘  └────────────┘  └────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 3. Frontend Layer

### 3.1 Technology Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Framework | React 18 | UI Components |
| Build Tool | Vite | Fast development |
| Styling | Tailwind CSS | Responsive design |
| Icons | Lucide React | Consistent icons |
| Camera | MediaPipe | Hand/Face detection |

### 3.2 Application Structure

```
frontend/src/
├── App.jsx              # Main application (9 tabs)
├── main.jsx             # Entry point
├── index.css            # Global styles
├── BrowserLLMLauncher.js  # AI chat launcher
└── MediaPipeProcessor.js # Camera + gesture processing
```

### 3.3 Tab Interface

| Tab | Purpose |
|-----|---------|
| **Learn** | Dashboard with predictions, reviews, gamification |
| **LLM Flow** | Browser-based AI launcher (no API keys) |
| **Gestures** | Train custom hand gestures |
| **Predict** | RL doubt prediction visualization |
| **Behavior** | Behavioral signal tracking |
| **Peer** | Social learning insights |
| **Stats** | Learning statistics |
| **Gamify** | Fish/XP rewards system |
| **Settings** | AI provider configuration |

### 3.4 BrowserLLMLauncher.js

Opens AI chats directly in the browser, with no API keys required:

```javascript
// Opens chat.openai.com with pre-filled context.
// `model` is accepted but not encoded in the URL in this excerpt.
function openAIChat(context, model = 'gpt-4') {
  const url = `https://chat.openai.com/?q=${encodeURIComponent(context)}`;
  window.open(url, '_blank');
}
```

### 3.5 MediaPipeProcessor.js

Handles real-time camera processing:

```
┌─────────────────┐
│   Camera Feed   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐    ┌─────────────────┐
│  Hand Landmark  │    │   Face Mesh     │
│   Detection     │    │   Detection     │
│  (21 points)   │    │  (468 points)  │
└────────┬────────┘    └────────┬────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐    ┌─────────────────┐
│ Gesture         │    │ Face Blur       │
│ Recognition     │───▶│ (Privacy)       │
└────────┬────────┘    └─────────────────┘
         │
         ▼
┌─────────────────┐
│  Backend API    │
│  /api/gesture/  │
└─────────────────┘
```

---

## 4. Backend Layer

### 4.1 Technology Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Framework | Flask | REST API |
| Async | asyncio | Non-blocking I/O |
| ML | PyTorch | RL model |
| Data | NumPy | Feature extraction |
| Graphs | NetworkX | Knowledge graphs |
| Storage | JSON/SQLite | Session persistence |

### 4.2 Flask Application Structure

```
backend/
├── run.py                    # Application entry point
├── app/
│   ├── __init__.py          # Flask app factory
│   ├── config.py            # Configuration
│   ├── api/
│   │   ├── __init__.py
│   │   └── main.py          # All API routes (889 lines)
│   └── agents/
│       ├── __init__.py
│       ├── study_orchestrator.py    # Central coordinator
│       ├── doubt_predictor.py       # RL prediction
│       ├── behavioral_agent.py      # Signal processing
│       ├── hand_gesture_agent.py    # MediaPipe integration
│       ├── recall_agent.py          # Spaced repetition
│       ├── knowledge_graph_agent.py # Concept mapping
│       ├── peer_learning_agent.py    # Social learning
│       ├── llm_orchestrator_agent.py # Multi-AI
│       ├── gesture_action_agent.py  # Gesture→Action
│       └── prompt_agent.py          # Prompt templates
```

### 4.3 Flask App Factory

```python
from flask import Flask


def create_app():
    app = Flask(__name__)
    
    # Load config
    app.config.from_object('app.config.Config')
    
    # Register blueprints
    from app.api.main import api
    app.register_blueprint(api, url_prefix='/api')
    
    # Initialize agents
    init_agents()
    
    return app
```

---

## 5. Agent Network

### 5.1 Agent Overview

```
┌─────────────────────────────────────────────────────────────┐
│                  STUDY ORCHESTRATOR                          │
│              (Central Coordinator)                             │
│                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│  │   Doubt     │  │ Behavioral  │  │   Hand     │        │
│  │  Predictor  │◀─│   Agent     │─▶│  Gesture   │        │
│  │   Agent    │  │             │  │   Agent    │        │
│  └──────┬──────┘  └─────────────┘  └──────┬──────┘        │
│         │                                  │                │
│         ▼                                  ▼                │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│  │ Knowledge  │  │   Recall    │  │    LLM      │        │
│  │   Graph    │◀─│   Agent     │─▶│ Orchestrator│        │
│  │   Agent    │  │             │  │             │        │
│  └─────────────┘  └─────────────┘  └─────────────┘        │
│                                                             │
│  ┌─────────────┐  ┌─────────────┐                           │
│  │    Peer    │  │   Gesture  │                           │
│  │  Learning  │  │   Action    │                           │
│  │   Agent    │  │   Mapper    │                           │
│  └─────────────┘  └─────────────┘                           │
└─────────────────────────────────────────────────────────────┘
```

### 5.2 StudyOrchestrator (Central Coordinator)

The orchestrator manages the learning lifecycle:

```python
class StudyOrchestrator:
    def __init__(self, user_id: str):
        self.user_id = user_id
        
        # Initialize all agents
        self.doubt_predictor = DoubtPredictorAgent(user_id)
        self.behavioral_agent = BehavioralAgent(user_id)
        self.gesture_agent = HandGestureAgent(user_id)
        self.recall_agent = RecallAgent(user_id)
        self.knowledge_graph = KnowledgeGraphAgent(user_id)
        self.peer_agent = PeerLearningAgent(user_id)
        
        # State management
        self.state = OrchestratorState()
```

**Session Lifecycle:**
1. **PRE_LEARNING** - Load predictions, check recalls, get peer insights
2. **ACTIVE_LEARNING** - Monitor signals, update predictions, capture doubts
3. **REVIEW** - Trigger spaced repetition, update knowledge graph
4. **POST_LEARNING** - Sync data, update gamification, generate summary
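
The four phases above can be modeled as a small state machine. The following is a minimal sketch, not the orchestrator's actual implementation: `SessionPhase`, `NEXT_PHASE`, and `SessionLifecycle` are hypothetical names, and the strictly linear ordering is an assumption.

```python
from enum import Enum


class SessionPhase(Enum):
    PRE_LEARNING = "pre_learning"
    ACTIVE_LEARNING = "active_learning"
    REVIEW = "review"
    POST_LEARNING = "post_learning"


# Assumed linear ordering: each phase may only advance to the next one.
NEXT_PHASE = {
    SessionPhase.PRE_LEARNING: SessionPhase.ACTIVE_LEARNING,
    SessionPhase.ACTIVE_LEARNING: SessionPhase.REVIEW,
    SessionPhase.REVIEW: SessionPhase.POST_LEARNING,
}


class SessionLifecycle:
    """Hypothetical helper that tracks the current phase of one session."""

    def __init__(self):
        self.phase = SessionPhase.PRE_LEARNING

    def advance(self) -> SessionPhase:
        # POST_LEARNING has no successor, so advancing past it is an error.
        if self.phase not in NEXT_PHASE:
            raise ValueError("session already finished")
        self.phase = NEXT_PHASE[self.phase]
        return self.phase
```

Keeping the transitions in one table makes it easy to reject out-of-order calls, for example a review triggered before any active learning has happened.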

### 5.3 DoubtPredictorAgent (RL Core)

Predicts confusion before it happens:

```python
class DoubtPredictorAgent:
    def __init__(self, user_id: str, config: dict = None):
        self.user_id = user_id
        self.model = self._load_checkpoint()
        self.feature_extractor = FeatureExtractor()
    
    def predict_doubts(self, context: dict, top_k: int = 5):
        # 1. Extract 64-dim state vector
        state = self.feature_extractor.extract_state(context)
        
        # 2. Get Q-values from RL model
        q_values = self.model.predict(state)
        
        # 3. Return top-k predictions
        return self._format_predictions(q_values, top_k)
```

### 5.4 BehavioralAgent

Processes raw behavioral signals:

```python
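from dataclasses import dataclass
from typing import Tuple


@dataclass
class BehavioralSignal:
    mouse_hesitation: float      # Pause frequency
    scroll_reversals: int        # Back-and-forth scrolling
    time_on_page: float          # Seconds
    eye_tracking: Tuple[float, float]
    click_frequency: int
    tab_switches: int = 0        # Referenced by the weights below
    back_button: int = 0         # Referenced by the weights below

    def calculate_confusion_score(self) -> float:
        # Weighted sum of signals. Each value is assumed to be
        # normalized to [0, 1] before it reaches this method.
        weights = {
            'hesitation': 0.3,
            'reversals': 0.25,
            'time_on_page': 0.2,
            'tab_switches': 0.15,
            'back_button': 0.1
        }
        signals = {
            'hesitation': self.mouse_hesitation,
            'reversals': self.scroll_reversals,
            'time_on_page': self.time_on_page,
            'tab_switches': self.tab_switches,
            'back_button': self.back_button
        }
        return sum(weights[name] * signals[name] for name in weights)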
```

### 5.5 HandGestureAgent

MediaPipe integration for gesture recognition:

```
Camera Frame
    │
    ▼
┌─────────────────┐
│ MediaPipe Hands │
│  (21 landmarks) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Gesture Template│
│   Matching      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Confidence    │──▶ Recognized Gesture
│   Score (0-1)   │
└─────────────────┘
```

**Pre-built Gestures:**
| Gesture | Description |
|---------|-------------|
| pinch | Thumb + Index |
| swipe_up | 2-finger up |
| swipe_down | 2-finger down |
| swipe_right | 2-finger right |
| swipe_left | 2-finger left |
| point | Index extended |
| wave | Open palm wave |
| thumbs_up | 👍 confirmation |
| thumbs_down | 👎 rejection |
| fist | Closed hand |

### 5.6 RecallAgent

SM-2 based spaced repetition:

```python
from dataclasses import dataclass


@dataclass
class RecallCard:
    front: str                 # Question
    back: str                  # Answer
    interval: int = 0          # Days until review
    ease_factor: float = 2.5   # Difficulty (default 2.5)
    repetitions: int = 0       # Successful reviews


def schedule_review(card: RecallCard, quality: int):
    if quality >= 3:  # Correct
        if card.repetitions == 0:
            card.interval = 1
        elif card.repetitions == 1:
            card.interval = 6
        else:
            # Round so the interval stays a whole number of days
            card.interval = round(card.interval * card.ease_factor)
        card.repetitions += 1
    else:  # Incorrect
        card.repetitions = 0
        card.interval = 1

    # Update ease factor (never below 1.3)
    card.ease_factor += (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    card.ease_factor = max(1.3, card.ease_factor)
```
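
To make the scheduling rules concrete, here is a self-contained worked example that replays four reviews. It is a sketch: `Card` and `review` are illustrative names, and rounding intervals to whole days is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Card:
    interval: int = 0          # Days until the next review
    ease_factor: float = 2.5   # Default ease from the section above
    repetitions: int = 0       # Consecutive successful reviews


def review(card: Card, quality: int) -> Card:
    """Apply one review with quality in 0..5, mirroring the rules above."""
    if quality >= 3:
        if card.repetitions == 0:
            card.interval = 1
        elif card.repetitions == 1:
            card.interval = 6
        else:
            # Assumption: intervals are rounded to whole days
            card.interval = round(card.interval * card.ease_factor)
        card.repetitions += 1
    else:
        card.repetitions = 0
        card.interval = 1
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    card.ease_factor = max(1.3, card.ease_factor + delta)
    return card


card = Card()
history = []
for quality in (5, 5, 5, 2):
    review(card, quality)
    history.append((card.interval, round(card.ease_factor, 2)))
print(history)
```

Three perfect answers stretch the interval to 1, 6, and then 16 days while the ease factor climbs by 0.1 each time. The failed fourth review resets the interval to 1 day and lowers the ease factor by 0.32.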

### 5.7 KnowledgeGraphAgent

Concept mapping with NetworkX:

```python
from datetime import datetime

import networkx as nx


class KnowledgeGraphAgent:
    def __init__(self, user_id: str):
        self.graph = nx.MultiDiGraph()
    
    def add_doubt_to_graph(self, doubt: dict):
        # Create node
        self.graph.add_node(
            doubt['concept'],
            type='concept',
            topic=doubt['topic'],
            timestamp=datetime.now()
        )
        
        # Connect to prerequisites
        for prereq in doubt.get('prerequisites', []):
            self.graph.add_edge(prereq, doubt['concept'], type='prerequisite')
        
        # Connect to related concepts
        for related in doubt.get('related', []):
            self.graph.add_edge(doubt['concept'], related, type='related')
    
    def find_learning_path(self, from_topic: str, to_topic: str):
        try:
            return nx.shortest_path(self.graph, from_topic, to_topic)
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            # No route between the topics, or one of them is not in the graph
            return []
```
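
For readers without NetworkX at hand, the path lookup can be illustrated with a plain breadth-first search. This is a dependency-free stand-in, not the agent's code: the standalone `find_learning_path` function and the `(source, target)` edge format are assumptions made for the sketch.

```python
from collections import deque


def find_learning_path(edges, start, goal):
    """Breadth-first search over directed (source, target) edges.

    Returns the shortest list of concepts from start to goal, or [] if
    either concept is unknown or no path exists.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    if start not in graph or goal not in graph:
        return []
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return []
```

Because prerequisite edges point from the prerequisite to the dependent concept, a path from a known topic to a target topic reads as a study order.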

### 5.8 LLMOrchestrator

Multi-provider AI integration:

```python
import asyncio


class LLMOrchestrator:
    SUPPORTED_PROVIDERS = {
        'chatgpt': LLMProvider.CHATGPT,
        'gemini': LLMProvider.GEMINI,
        'claude': LLMProvider.CLAUDE,
        'deepseek': LLMProvider.DEEPSEEK,
        'ollama': LLMProvider.OLLAMA,
        'groq': LLMProvider.GROQ
    }
    
    async def query_parallel(self, request: LLMRequest):
        tasks = []
        for provider in request.providers:
            task = self._query_provider(provider, request)
            tasks.append(task)
        
        # Execute all queries concurrently
        responses = await asyncio.gather(*tasks, return_exceptions=True)
        return [r for r in responses if not isinstance(r, Exception)]
```
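
The fan-out pattern above can be tried in isolation. In this sketch `fake_provider` is a stand-in for a real provider call and is not part of ContextFlow. It shows how `asyncio.gather(..., return_exceptions=True)` lets one failing provider drop out without cancelling the others.

```python
import asyncio


async def fake_provider(name: str, fail: bool = False) -> dict:
    """Stand-in for a real provider call; not part of the ContextFlow code."""
    await asyncio.sleep(0)  # Yield control, as a network call would
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return {"provider": name, "text": f"answer from {name}"}


async def query_parallel(providers: dict) -> list:
    """Query every provider concurrently and keep only successful responses."""
    tasks = [fake_provider(name, fail) for name, fail in providers.items()]
    responses = await asyncio.gather(*tasks, return_exceptions=True)
    return [r for r in responses if not isinstance(r, Exception)]


results = asyncio.run(
    query_parallel({"chatgpt": False, "gemini": True, "claude": False})
)
print([r["provider"] for r in results])
```

Filtering exceptions silently keeps the response list clean, but a production caller would probably also log which providers failed.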

### 5.9 GestureActionMapper

Maps gestures to system actions:

```python
from enum import Enum


class GestureAction(Enum):
    QUERY_MULTI_LLM = "query_multi_llm"
    QUERY_CHATGPT = "query_chatgpt"
    QUERY_GEMINI = "query_gemini"
    TRIGGER_RL_LOOP = "trigger_rl_loop"
    CAPTURE_CONTENT = "capture_content"
    PAUSE_SESSION = "pause_session"
    RESUME_SESSION = "resume_session"

class GestureActionMapper:
    def __init__(self):
        self.action_rules = {
            GestureAction.QUERY_MULTI_LLM: {
                "trigger": {"finger_count": 2, "swipe": "right"}
            },
            GestureAction.PAUSE_SESSION: {
                "trigger": {"gesture": "open_palm"}
            },
            GestureAction.RESUME_SESSION: {
                "trigger": {"gesture": "thumbs_up"}
            }
        }
```

### 5.10 PeerLearningAgent

Social learning insights:

```python
class PeerLearningAgent:
    def get_peer_insights(self, topic: str):
        # Aggregate insights from "similar" students
        insights = []
        
        # Find students who learned this topic
        similar_students = self._find_similar_students(topic)
        
        for student in similar_students:
            # What confused them?
            insights.extend(student.difficult_concepts)
        
        # Return aggregated insights
        return self._aggregate_insights(insights)
```
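
One plausible shape for the aggregation step is a frequency ranking. The sketch below is an assumption about what `_aggregate_insights` might do, written as a standalone function. The output field names (`concept`, `peer_count`, `share`) are hypothetical.

```python
from collections import Counter


def aggregate_insights(difficult_concepts, top_n=3):
    """Rank concepts by how many peers reported them as difficult."""
    counts = Counter(difficult_concepts)
    total = sum(counts.values())
    return [
        {"concept": concept, "peer_count": n, "share": n / total}
        for concept, n in counts.most_common(top_n)
    ]
```

For example, `aggregate_insights(["overfitting", "softmax", "overfitting"])` ranks `overfitting` first with a `peer_count` of 2.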

---

## 6. Reinforcement Learning Pipeline

### 6.1 Problem Formulation

**State Space (64 dimensions):**
```
┌────────────────────────────────────────────────────────────────┐
│  Topic Embedding (32)  │ Progress │ Confusion (16) │ Gesture (14) │ Time │
│  TF-IDF of topic       │ 0.0-1.0 │ Behavioral    │ Hand        │ 0-1 │
│                        │          │ signals       │ signals     │      │
└────────────────────────────────────────────────────────────────┘
```

**Action Space (10 doubt types):**
1. `what_is_backpropagation`
2. `why_gradient_descent`
3. `how_overfitting_works`
4. `explain_regularization`
5. `what_loss_function`
6. `how_optimization_works`
7. `explain_learning_rate`
8. `what_regularization`
9. `how_batch_norm_works`
10. `explain_softmax`

**Reward Function:**
| Event | Reward |
|-------|--------|
| Correct prediction | +1.0 |
| Helpful explanation | +0.5 |
| Engagement maintained | +0.3 |
| False positive | -0.5 |
| Missed confusion | -1.0 |
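
The reward table translates directly into a lookup. In this sketch the event keys are hypothetical identifiers derived from the table rows, and summing the rewards of several events in one step is an assumption.

```python
REWARDS = {
    "correct_prediction": 1.0,
    "helpful_explanation": 0.5,
    "engagement_maintained": 0.3,
    "false_positive": -0.5,
    "missed_confusion": -1.0,
}


def step_reward(events):
    """Sum the rewards of all events observed in one step."""
    unknown = [e for e in events if e not in REWARDS]
    if unknown:
        raise KeyError(f"unknown events: {unknown}")
    return sum(REWARDS[e] for e in events)
```

Rejecting unknown event names early avoids silently training on a reward of zero when an event key is misspelled.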

### 6.2 Q-Network Architecture

```python
import torch.nn as nn
import torch.nn.functional as F


class QNetwork(nn.Module):
    def __init__(self, state_dim=64, action_dim=10, hidden_dim=128):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)  # 64 → 128
        self.fc2 = nn.Linear(hidden_dim, hidden_dim) # 128 → 128
        self.fc3 = nn.Linear(hidden_dim, action_dim) # 128 → 10
    
    def forward(self, x):
        x = F.relu(self.fc1(x))  # ReLU activation
        x = F.relu(self.fc2(x))
        return self.fc3(x)  # Q-values for each action
```

### 6.3 Training Algorithm (GRPO)

```python
import torch


class DoubtPredictionRL:
    def train(self, epochs=10, batch_size=32):
        for epoch in range(epochs):
            for batch in self.dataloader:
                # 1. Q-values of the actions actually taken
                #    (gather needs batch.actions with shape [B, 1])
                q_values = self.q_network(batch.states)
                q_taken = q_values.gather(1, batch.actions).squeeze(1)
                
                # 2. Compute one-step TD targets from the target network
                with torch.no_grad():
                    next_q = self.target_network(batch.next_states).max(1)[0]
                    targets = batch.rewards + self.gamma * next_q * (~batch.dones)
                
                # 3. Compute loss and update
                self.optimizer.zero_grad()  # Clear gradients from the previous batch
                loss = self.loss_fn(q_taken, targets)
                loss.backward()
                self.optimizer.step()
            
            # 4. Update target network
            self.update_target_network()
            
            # 5. Decay epsilon (exploration)
            self.epsilon *= self.epsilon_decay
```
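
The target computed in step 2 is a one-step temporal-difference backup: the observed reward plus the discounted best Q-value of the next state, with the bootstrap term dropped at the end of an episode. The pure-Python sketch below reproduces that arithmetic without PyTorch. `td_targets` is a hypothetical helper and the default `gamma` is an assumption.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """One-step targets: r + gamma * max_a' Q_target(s', a') for non-terminal steps."""
    targets = []
    for reward, next_q, done in zip(rewards, next_q_values, dones):
        # Terminal transitions have no future value to bootstrap from
        bootstrap = 0.0 if done else gamma * max(next_q)
        targets.append(reward + bootstrap)
    return targets


targets = td_targets(
    rewards=[1.0, -0.5],
    next_q_values=[[0.2, 0.8], [0.4, 0.1]],
    dones=[False, True],
    gamma=0.5,
)
print(targets)
```

With `gamma=0.5`, the first transition bootstraps from its best next-state value (0.8), while the second is terminal and keeps only its reward.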

### 6.4 Feature Extraction

```python
import numpy as np


class FeatureExtractor:
    STATE_DIM = 64
    
    def extract_state(self, context: dict) -> np.ndarray:
        # Topic embedding (32 dims)
        topic_emb = self._extract_topic_embedding(context['topic'])
        
        # Progress (1 dim)
        progress = np.array([context['progress']])
        
        # Confusion signals (16 dims)
        confusion = self._extract_confusion_signals(context['confusion_signals'])
        
        # Gesture signals (14 dims)
        gestures = self._extract_gesture_signals(context['gesture_signals'])
        
        # Time spent (1 dim): fraction of a 30-minute (1800 s) window, capped at 1.0
        time_spent = np.array([min(context['time_spent'] / 1800, 1.0)])
        
        # Concatenate
        return np.concatenate([topic_emb, progress, confusion, gestures, time_spent])
```
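
The concatenation order fixes where each feature group lives in the 64-dimensional vector. This sketch derives the index ranges from the group widths stated in Section 6.1. `STATE_LAYOUT` and `state_slices` are hypothetical names introduced for illustration.

```python
STATE_LAYOUT = [
    ("topic_embedding", 32),
    ("progress", 1),
    ("confusion_signals", 16),
    ("gesture_signals", 14),
    ("time_spent", 1),
]


def state_slices(layout=STATE_LAYOUT):
    """Map each feature group to its slice in the concatenated state vector."""
    slices, offset = {}, 0
    for name, width in layout:
        slices[name] = slice(offset, offset + width)
        offset += width
    return slices, offset


SLICES, STATE_DIM = state_slices()
print(STATE_DIM, SLICES["confusion_signals"])
```

Computing the offsets from a single table is a cheap way to catch drift between the documented layout and the extractor, since the widths must sum to the Q-network's `state_dim`.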

---

## 7. Data Flow

### 7.1 Learning Session Flow

```
┌─────────────────────────────────────────────────────────────────┐
│                        USER STARTS SESSION                         │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                     ORCHESTRATOR.START_SESSION()                   │
│  1. Create new LearningSession                                    │
│  2. Load RL model checkpoint                                      │
│  3. Build learning context                                       │
└─────────────────────────────────────────────────────────────────┘
                                │
                ┌───────────────┼───────────────┐
                ▼               ▼               ▼
        ┌───────────┐   ┌───────────┐   ┌───────────┐
        │   Doubt   │   │ Behavioral│   │   Peer    │
        │ Predictor │   │   Agent   │   │  Learning │
        │           │   │           │   │   Agent   │
        │  Predict  │   │  Analyze  │   │   Get     │
        │  doubts   │   │  signals  │   │  insights │
        └─────┬─────┘   └─────┬─────┘   └─────┬─────┘
              │               │               │
              └───────────────┼───────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                  RETURN INITIAL PREDICTIONS                        │
│  - Top 5 predicted doubts                                         │
│  - Pending reviews                                               │
│  - Peer insights                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### 7.2 Behavioral Signal Flow

```
┌─────────────────────────────────────────────────────────────────┐
│                      REAL-TIME SIGNALS                           │
│                                                                  │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐             │
│  │  Mouse  │  │ Scroll  │  │Gesture  │  │  Time   │             │
│  │Movement │  │ Pattern │  │Camera   │  │  On     │             │
│  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘             │
└───────┼───────────┼───────────┼───────────┼───────────────────────┘
        │           │           │           │
        └───────────┴─────┬─────┴───────────┘
                          ▼
              ┌───────────────────────┐
              │   BEHAVIORAL AGENT     │
              │                       │
              │ calculate_confusion_   │
              │ score(signals)        │
              │                       │
              │ Returns: 0.0 - 1.0   │
              └───────────┬───────────┘
                          │
                          ▼
              ┌───────────────────────┐
              │   DOUBT PREDICTOR     │
              │                       │
              │ If score > 0.5:       │
              │   Re-predict doubts    │
              │   Trigger intervention│
              │                       │
              └───────────────────────┘
```

### 7.3 Gesture-to-Action Flow

```
┌─────────────────────────────────────────────────────────────────┐
│                         CAMERA FRAME                               │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                      MEDIAPIPE PROCESSING                         │
│                                                                  │
│  ┌──────────────────────┐     ┌──────────────────────┐          │
│  │   Hand Landmark      │     │    Face Mesh         │          │
│  │   Detection         │     │    (468 points)      │          │
│  │   (21 points)       │     │                       │          │
│  └──────────┬─────────┘     └──────────┬───────────┘          │
└─────────────┼───────────────────────────┼───────────────────────┘
              │                           │
              ▼                           ▼
┌──────────────────────┐     ┌──────────────────────┐
│  GESTURE TEMPLATE   │     │    FACE BLUR         │
│  MATCHING           │     │    (Privacy)          │
│                     │     │                       │
│  Compare landmarks  │     │  Blur regions with    │
│  to known gestures  │     │  facial keypoints     │
└──────────┬─────────┘     └───────────────────────┘
           │
           ▼
┌──────────────────────┐
│  GESTURE RECOGNIZED  │──▶ Backend /api/gesture/recognize
│                      │
│  {                   │
│    "gesture": "pinch",│
│    "confidence": 0.92│
│  }                   │
└──────────────────────┘
           │
           ▼
┌──────────────────────┐
│ GESTURE ACTION MAPPER │
│                      │
│ pinch ──────────────▶│ TRIGGER_AI_HELP
│ swipe_right ────────▶│ LAUNCH_BROWSER_CHAT
│ open_palm ──────────▶│ PAUSE_SESSION
│ thumbs_up ──────────▶│ MARK_UNDERSTOOD
└──────────────────────┘
```
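
The final mapping step in the diagram amounts to a dictionary lookup guarded by a confidence check. In this sketch the gesture-to-action pairs come from the diagram, while `MIN_CONFIDENCE` and `resolve_action` are assumptions added for illustration.

```python
GESTURE_ACTIONS = {
    "pinch": "TRIGGER_AI_HELP",
    "swipe_right": "LAUNCH_BROWSER_CHAT",
    "open_palm": "PAUSE_SESSION",
    "thumbs_up": "MARK_UNDERSTOOD",
}

MIN_CONFIDENCE = 0.8  # Assumed cutoff; the source does not specify one


def resolve_action(event: dict):
    """Return the action for a recognized gesture, or None if it should be ignored."""
    if event.get("confidence", 0.0) < MIN_CONFIDENCE:
        return None
    return GESTURE_ACTIONS.get(event.get("gesture"))
```

Called with the payload from the diagram, `resolve_action({"gesture": "pinch", "confidence": 0.92})` returns `"TRIGGER_AI_HELP"`. Low-confidence or unmapped gestures return `None` so that accidental hand movements do not trigger actions.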

---

## 8. API Design

### 8.1 API Structure

| Category | Endpoints |
|----------|-----------|
| Session | `/session/start`, `/session/update`, `/session/end`, `/session/insights` |
| Prediction | `/predict/doubts`, `/recommendations` |
| Behavior | `/behavior/track`, `/behavior/heatmap` |
| Graph | `/graph/add`, `/graph/query`, `/graph/path` |
| Review | `/review/due`, `/review/complete`, `/review/stats` |
| Peer | `/peer/insights`, `/peer/doubts`, `/peer/trending` |
| Gesture | `/gesture/list`, `/gesture/recognize`, `/gesture/training/*` |
| LLM | `/llm/query`, `/llm/gesture-action`, `/llm/rl/*` |

### 8.2 Session API

```python
# POST /api/session/start
{
    "user_id": "student123",
    "topic": "Machine Learning",
    "subtopic": "Neural Networks"
}

# Response
{
    "session_id": "session_1699999999.123",
    "topic": "Machine Learning",
    "predictions": [
        {
            "doubt": "how_overfitting_works",
            "confidence": 0.85,
            "explanation": "Student showing signs of confusion...",
            "priority": 1
        }
    ],
    "pending_reviews": 5,
    "peer_insights_count": 3
}
```
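
A client can assemble this request with the standard library alone. The sketch below builds the request without sending it. `build_start_request` is a hypothetical helper, and the base URL assumes the development port (5001) and the `/api` prefix used elsewhere in this document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001/api"  # Assumed: Flask port from the development setup


def build_start_request(user_id: str, topic: str, subtopic: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for /session/start."""
    payload = {"user_id": user_id, "topic": topic, "subtopic": subtopic}
    return urllib.request.Request(
        f"{BASE_URL}/session/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_start_request("student123", "Machine Learning", "Neural Networks")
# To send it against a running backend:
# with urllib.request.urlopen(request) as response:
#     session = json.loads(response.read())
```

Separating request construction from sending keeps the payload shape testable without a running server.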

### 8.3 Doubt Prediction API

```python
# POST /api/predict/doubts
{
    "context": {
        "topic": "Neural Networks",
        "progress": 0.5,
        "confusion_signals": 0.7
    }
}

# Response
{
    "predictions": [
        {
            "doubt": "how_overfitting_works",
            "confidence": 0.85,
            "explanation": "...",
            "priority": 1,
            "estimated_time": "10 min",
            "prerequisites": ["regularization", "bias-variance"]
        }
    ]
}
```

---

## 9. Multi-Modal Detection

### 9.1 Supported Modalities

```
┌─────────────────────────────────────────────────────────────────┐
│                    MULTI-MODAL FUSION                             │
│                                                                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐             │
│  │   Audio     │  │  Biometric  │  │ Behavioral  │             │
│  │             │  │             │  │             │             │
│  │ Speech rate │  │ Heart rate  │  │ Mouse moves │             │
│  │ Hesitations │  │ GSR         │  │ Scroll      │             │
│  │ Pauses      │  │ Eye tracking│  │ Key presses │             │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘             │
│         │                │                │                     │
│         └────────────────┼────────────────┘                     │
│                          ▼                                      │
│              ┌─────────────────────────┐                       │
│              │   WEIGHTED FUSION       │                       │
│              │                         │                       │
│              │ audio_weight:    0.2    │                       │
│              │ biometric_weight: 0.3   │                       │
│              │ behavioral_weight: 0.5  │                       │
│              └───────────┬─────────────┘                       │
│                          │                                      │
│                          ▼                                      │
│              ┌─────────────────────────┐                       │
│              │  UNIFIED CONFUSION     │                       │
│              │       SCORE            │                       │
│              │       0.0 - 1.0       │                       │
│              └─────────────────────────┘                       │
└─────────────────────────────────────────────────────────────────┘
```
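
The weighted fusion in the diagram can be sketched in a few lines. The weights are taken from the diagram. Skipping missing modalities and renormalizing the remaining weights is an assumption, as is the `fuse_confusion` name.

```python
FUSION_WEIGHTS = {"audio": 0.2, "biometric": 0.3, "behavioral": 0.5}


def fuse_confusion(scores: dict) -> float:
    """Weighted average of per-modality confusion scores in [0, 1].

    Assumption: missing modalities are skipped and the remaining weights
    are renormalized, so a session without sensors still yields a score.
    """
    available = {
        m: s for m, s in scores.items() if m in FUSION_WEIGHTS and s is not None
    }
    if not available:
        raise ValueError("no modality scores provided")
    total_weight = sum(FUSION_WEIGHTS[m] for m in available)
    fused = sum(FUSION_WEIGHTS[m] * s for m, s in available.items()) / total_weight
    return min(1.0, max(0.0, fused))
```

With all three modalities present, scores of 0.5 (audio), 1.0 (biometric), and 0.2 (behavioral) fuse to 0.1 + 0.3 + 0.1 = 0.5. With only a behavioral score, the result is that score unchanged.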

### 9.2 Feature Extraction by Modality

**Audio (7 features):**
- Speech rate (WPM)
- Pause frequency
- Pause duration
- Pitch variation
- Volume level
- Hesitation count
- Question markers

**Biometric (6 features):**
- Heart rate (BPM)
- Heart rate variability
- Skin conductance (GSR)
- Skin temperature
- Eye blink rate
- Eye open duration

**Behavioral (8 features):**
- Mouse hesitation
- Scroll reversals
- Time on page
- Click frequency
- Back button usage
- Tab switches
- Copy attempts
- Search usage

---

## 10. Privacy & Security

### 10.1 Face Blur Implementation

```python
import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh


class FaceBlurProcessor:
    def __init__(self):
        self.face_mesh = mp_face_mesh.FaceMesh(
            static_image_mode=False,
            max_num_faces=1,
            refine_landmarks=True
        )
    
    def blur_face(self, frame):
        # Detect face landmarks
        results = self.face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        
        if results.multi_face_landmarks:
            # Get face region
            face_region = self._get_face_region(frame, results)
            
            # Apply Gaussian blur
            blurred = cv2.GaussianBlur(face_region, (51, 51), 0)
            
            # Replace face region
            frame = self._replace_region(frame, blurred, results)
        
        return frame
```

### 10.2 Data Privacy

| Data Type | Storage | Privacy |
|-----------|---------|---------|
| Video frames | None | Processed in-memory only |
| Face images | None | Auto-blurred |
| Hand landmarks | Optional | Anonymized |
| Session data | Local JSON | User-owned |
| Model weights | HuggingFace | Open |

---

## 11. Deployment Architecture

### 11.1 Development Setup

```
┌─────────────────────────────────────────────────────────────────┐
│                      DEVELOPMENT                                  │
│                                                                  │
│  Terminal 1:                 Terminal 2:                        │
│  ┌─────────────────┐         ┌─────────────────┐               │
│  │ cd backend      │         │ cd frontend     │               │
│  │ python run.py   │         │ npm run dev     │               │
│  │                 │         │                 │               │
│  │ Flask :5001     │         │ Vite    :5173   │               │
│  └────────┬────────┘         └────────┬────────┘               │
└───────────┼───────────────────────────┼─────────────────────────┘
            │                           │
            │           ┌───────────────┘
            │           │
            ▼           ▼
┌─────────────────────────────────────────────────────────────────┐
│                    BROWSER (localhost)                            │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  Frontend (:5173) <─────── Proxy ───────> Backend (:5001)│  │
│  └─────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
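
Before opening the browser, it can help to confirm that both dev servers are listening. The following is a small sketch using only the standard library. The ports come from the diagram above, while the `is_listening` helper and the `DEV_SERVERS` mapping are hypothetical names introduced here.

```python
import socket

# Ports taken from the development diagram: Flask backend and Vite frontend.
DEV_SERVERS = {"backend (Flask)": 5001, "frontend (Vite)": 5173}


def is_listening(port: int, host: str = "localhost", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection tries every address the host resolves to (IPv4 and IPv6).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in DEV_SERVERS.items():
        status = "up" if is_listening(port) else "down"
        print(f"{name} on :{port} is {status}")
```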

### 11.2 Production Setup

```
                          ┌─────────────────┐
                          │   Load Balancer  │
                          └────────┬────────┘
                                   │
              ┌────────────────────┼────────────────────┐
              │                    │                    │
              ▼                    ▼                    ▼
      ┌───────────────┐    ┌───────────────┐    ┌───────────────┐
      │  Flask Worker │    │  Flask Worker │    │  Flask Worker │
      │    (:5001)    │    │    (:5001)    │    │    (:5001)    │
      └───────────────┘    └───────────────┘    └───────────────┘
              │                    │                    │
              └────────────────────┼────────────────────┘
                                   │
                                   ▼
                          ┌─────────────────┐
                          │   Redis Cache   │
                          └────────┬────────┘
                                   │
                                   ▼
                          ┌─────────────────┐
                          │   PostgreSQL    │
                          └─────────────────┘
```
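
The diagram places Redis between the Flask workers and PostgreSQL but does not say how the cache is used. One common reading is a cache-aside read path, sketched below with in-memory stand-ins so it runs without either service. The `CacheAside` class, its method names, and the TTL value are assumptions for illustration, not ContextFlow code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Cache-aside helper: check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[dict]], ttl_seconds: float = 60.0):
        self._load_from_db = load_from_db  # stand-in for a PostgreSQL query
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[float, dict]] = {}  # stand-in for Redis

    def get(self, key: str) -> Optional[dict]:
        entry = self._cache.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]  # cache hit that has not expired
        value = self._load_from_db(key)  # miss or expired: query the database
        if value is not None:
            self._cache[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so the next read reloads it from the database."""
        self._cache.pop(key, None)
```

In a real deployment the dictionary would be replaced by a Redis client and the loader by a database query. The control flow stays the same.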

### 11.3 HuggingFace Model Hosting

```
┌─────────────────────────────────────────────────────────────────┐
│                     HuggingFace Hub                              │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              namish10/contextflow-rl                     │   │
│  │                                                          │   │
│  │  checkpoint.pkl        ← Trained RL model            │   │
│  │  train_rl.py            ← Training script            │   │
│  │  feature_extractor.py    ← State extraction           │   │
│  │  online_learning.py      ← Continuous learning        │   │
│  │  data_collector.py      ← Real data collection        │   │
│  │  multimodal_detection.py ← Audio/biometric fusion      │   │
│  │  demo.ipynb             ← Interactive demo            │   │
│  │  RESEARCH_PAPER.md      ← Full documentation          │   │
│  │                                                          │   │
│  │  app/ (9 agents + API)                                │   │
│  │  frontend/ (React UI)                                  │   │
│  └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
```
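
As a hedged sketch of how the checkpoint listed above might be fetched, the example below uses `hf_hub_download` from the `huggingface_hub` package. The repository and file names come from the diagram; the `load_checkpoint` helper is a hypothetical name. The checkpoint's internal structure is not described here, and unpickling it may require the repository's own modules on the import path.

```python
import pickle

REPO_ID = "namish10/contextflow-rl"  # repository named in the diagram above
CHECKPOINT_FILE = "checkpoint.pkl"   # trained RL model listed in the diagram


def load_checkpoint(repo_id: str = REPO_ID, filename: str = CHECKPOINT_FILE):
    """Download the checkpoint from the Hub (cached locally) and unpickle it."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=filename)
    # pickle can execute arbitrary code: only load checkpoints from a trusted source.
    with open(local_path, "rb") as f:
        return pickle.load(f)
```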

---

## Summary

ContextFlow is a comprehensive system combining:

1. **Predictive AI** - RL-based doubt prediction before confusion occurs
2. **Multi-Agent Architecture** - 9 specialized agents coordinated by an orchestrator
3. **Gesture Recognition** - Privacy-first MediaPipe hand detection
4. **Multi-Modal Sensing** - Audio + Biometric + Behavioral fusion
5. **Browser-Based AI** - Launches AI chat directly in the browser, with no API keys required
6. **Continuous Learning** - Online learning from user feedback

The system is production-ready: all 9 API endpoints are working, the agent network is complete, and the trained RL model is available on HuggingFace.