# ContextFlow RL Doubt Predictor

## Overview

This is the trained reinforcement learning model for the ContextFlow doubt prediction system.

## Model Details

- **Algorithm**: GRPO (Group Relative Policy Optimization) + Q-Learning
- **State Dimension**: 64 features
- **Action Dimension**: 10 doubt prediction actions
- **Policy Version**: 50
- **Training Samples**: 200

## Usage

```python
import pickle

from huggingface_hub import hf_hub_download

# Download the checkpoint file from the Hugging Face Hub
path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')

# Load the checkpoint.
# Note: pickle.load can execute arbitrary code, so only load files you trust.
with open(path, 'rb') as f:
    checkpoint = pickle.load(f)

# Assumes the unpickled object exposes a policy_version attribute
print(f"Policy version: {checkpoint.policy_version}")
```

## Citation

```bibtex
@software{contextflow_rl,
  title={ContextFlow RL Doubt Predictor},
  author={ContextFlow Team},
  year={2026}
}
```
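
## Checkpoint Inspection (Sketch)

The usage snippet reads `policy_version` as an attribute, but this card does not document how `checkpoint.pkl` is laid out. The following is a minimal sketch, not part of the released code, for reading a field whether the unpickled checkpoint turns out to be a dict or a plain object. The helper names `read_field` and `field_names` are hypothetical, and the stand-in checkpoints below are illustrative only; the real file may contain different fields.

```python
from types import SimpleNamespace
from typing import Any, List


def read_field(checkpoint: Any, name: str, default: Any = None) -> Any:
    """Return `name` from a checkpoint stored as a dict or as an object."""
    if isinstance(checkpoint, dict):
        return checkpoint.get(name, default)
    return getattr(checkpoint, name, default)


def field_names(checkpoint: Any) -> List[str]:
    """List the public top-level field names of a checkpoint, sorted."""
    if isinstance(checkpoint, dict):
        return sorted(str(key) for key in checkpoint)
    # Objects without a __dict__ (e.g. using __slots__) yield an empty list.
    attrs = getattr(checkpoint, "__dict__", {})
    return sorted(key for key in attrs if not key.startswith("_"))


# Stand-in checkpoints (assumptions); the value 50 mirrors the policy
# version listed under Model Details.
as_dict = {"policy_version": 50}
as_obj = SimpleNamespace(policy_version=50)

print(read_field(as_dict, "policy_version"))  # 50
print(read_field(as_obj, "policy_version"))   # 50
print(field_names(as_obj))                    # ['policy_version']
```

With the real file, `field_names(checkpoint)` can be called right after `pickle.load` to see which fields are present before relying on any of them.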