ContextFlow RL Doubt Predictor
Overview
This is the trained reinforcement learning model for the ContextFlow doubt prediction system.
Model Details
- Algorithm: GRPO (Group Relative Policy Optimization) + Q-Learning
- State Dimension: 64 features
- Action Dimension: 10 doubt prediction actions
- Policy Version: 50
- Training Samples: 200
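For orientation, both named algorithms have well-known core update rules. The sketch below shows the group-relative advantage used in GRPO and the standard one-step Q-learning target, with shapes matching the state and action dimensions listed above. How ContextFlow combines the two is not documented here, so the linear Q-function, the `gamma` value, the sample rewards, and all function names are illustrative assumptions rather than the model's actual implementation.

```python
import random
import statistics

STATE_DIM = 64    # state dimension listed above
NUM_ACTIONS = 10  # action dimension listed above


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize each reward against its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def q_learning_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max over next-state Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def linear_q_values(state, weights):
    """Hypothetical linear Q-function: weights has one row per state feature."""
    return [
        sum(s * row[a] for s, row in zip(state, weights))
        for a in range(NUM_ACTIONS)
    ]


# Random stand-in data, used only to show the shapes involved.
rng = random.Random(0)
weights = [[rng.gauss(0.0, 0.01) for _ in range(NUM_ACTIONS)] for _ in range(STATE_DIM)]
state = [rng.gauss(0.0, 1.0) for _ in range(STATE_DIM)]

q_values = linear_q_values(state, weights)  # one value per action
greedy_action = max(range(NUM_ACTIONS), key=q_values.__getitem__)
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Rewards above the group mean receive positive advantages and those below receive negative ones, which is what lets GRPO rank sampled actions without a separate value network.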
Usage
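import pickle
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub (cached locally after the first call)
path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')

# Load the checkpoint. pickle.load can execute arbitrary code, so only unpickle files you trust.
# If the checkpoint is an instance of a custom class, that class must be importable here.
with open(path, 'rb') as f:
    checkpoint = pickle.load(f)

print(f"Policy version: {checkpoint.policy_version}")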
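The Python type stored in the checkpoint is not documented here, and the usage snippet assumes attribute access. If the pickle turns out to hold a plain dict instead, a small accessor like the one below avoids an `AttributeError`. `read_field` is a hypothetical helper, not part of any ContextFlow API, and the two stand-in objects exist only for demonstration.

```python
from types import SimpleNamespace


def read_field(checkpoint, name, default=None):
    """Return a checkpoint field whether the checkpoint is a dict or an object."""
    if isinstance(checkpoint, dict):
        return checkpoint.get(name, default)
    return getattr(checkpoint, name, default)


# Stand-ins for a loaded checkpoint; 50 is the policy version listed in Model Details.
as_object = SimpleNamespace(policy_version=50)
as_dict = {"policy_version": 50}

print(read_field(as_object, "policy_version"))
print(read_field(as_dict, "policy_version"))
```

With a real checkpoint, `read_field(checkpoint, "policy_version")` would replace the direct attribute access in the print statement.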
Citation
@software{contextflow_rl,
title={ContextFlow RL Doubt Predictor},
author={ContextFlow Team},
year={2026}
}