# ContextFlow RL Doubt Predictor

## Overview
This repository contains the trained reinforcement learning model for the ContextFlow doubt prediction system.

## Model Details
- **Algorithm**: GRPO (Group Relative Policy Optimization) + Q-Learning
- **State Dimension**: 64 features
- **Action Dimension**: 10 doubt prediction actions
- **Policy Version**: 50
- **Training Samples**: 200
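
This card does not describe how GRPO and Q-Learning are combined, so the following is only a rough illustration of the group-relative step that gives GRPO its name: each reward sampled for the same state is normalized against the mean and standard deviation of its group. The function name `group_relative_advantages`, the `eps` parameter, the use of the population standard deviation, and the example rewards are all assumptions made for this sketch, not details taken from the ContextFlow code.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    if not rewards:
        return []
    mean = fmean(rewards)
    std = pstdev(rewards)  # population standard deviation (an assumption)
    # eps keeps the division defined when every reward in the group is equal
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four actions sampled from the same state
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Actions that score above the group mean receive a positive advantage and those below it a negative one, so no separate value baseline is needed for this step.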

## Usage
```python
import pickle
from huggingface_hub import hf_hub_download

# Download checkpoint
path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')

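# Load checkpoint. pickle.load can execute arbitrary code during
# deserialization, so only load files from a source you trust.
with open(path, 'rb') as f:
    checkpoint = pickle.load(f)

# Assumes the checkpoint exposes policy_version as an attribute
print(f"Policy version: {checkpoint.policy_version}")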
```
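
The snippet above reads `policy_version` as an attribute, but the internal layout of `checkpoint.pkl` is not documented here. If the checkpoint turns out to be a plain dictionary instead of an object, a small accessor like the one sketched below handles both cases. The helper name `read_field` and the stand-in objects are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def read_field(checkpoint, name, default=None):
    """Return checkpoint[name] for dict checkpoints, else the attribute."""
    if isinstance(checkpoint, dict):
        return checkpoint.get(name, default)
    return getattr(checkpoint, name, default)


# Stand-ins for illustration; a real checkpoint comes from pickle.load
as_dict = {"policy_version": 50}
as_object = SimpleNamespace(policy_version=50)

version_from_dict = read_field(as_dict, "policy_version")
version_from_object = read_field(as_object, "policy_version")
missing = read_field(as_object, "not_a_field", default="unknown")
```

With the real file, `read_field(checkpoint, "policy_version")` would replace the direct attribute access.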

## Citation
```bibtex
@software{contextflow_rl,
  title={ContextFlow RL Doubt Predictor},
  author={ContextFlow Team},
  year={2026}
}
```