# ContextFlow RL Doubt Predictor

## Overview

This is the trained reinforcement learning model for the ContextFlow doubt prediction system.
## Model Details

- **Algorithm**: GRPO (Group Relative Policy Optimization) + Q-Learning
- **State Dimension**: 64 features
- **Action Dimension**: 10 doubt prediction actions
- **Policy Version**: 50
- **Training Samples**: 200
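
The list above names the two algorithms but does not describe how they are combined. As a rough, generic illustration (not the ContextFlow training code), the sketch below shows the core computation usually associated with each: a GRPO-style group-relative advantage, and a single Q-learning update. The linear Q-function, the helper names, and the values of `alpha`, `gamma`, and `eps` are all assumptions made for the example; only the 64 state features and 10 actions come from the model details.

```python
import statistics

STATE_DIM = 64    # state features, from the model details
NUM_ACTIONS = 10  # doubt prediction actions, from the model details


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward against its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def q_values(weights, state):
    """Linear Q-function (an assumption): one weight vector per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def q_learning_update(weights, state, action, reward, next_state,
                      alpha=0.1, gamma=0.99):
    """One semi-gradient Q-learning step.

    Updates the chosen action's weights in place and returns the TD error.
    alpha (learning rate) and gamma (discount) are illustrative values.
    """
    target = reward + gamma * max(q_values(weights, next_state))
    td_error = target - q_values(weights, state)[action]
    weights[action] = [w + alpha * td_error * s
                       for w, s in zip(weights[action], state)]
    return td_error
```

With zero-initialized weights of shape `NUM_ACTIONS x STATE_DIM`, calling `q_learning_update` moves only the row of the selected action, while `group_relative_advantages` returns values that sum to approximately zero within each group.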
## Usage

```python
import pickle

from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub (cached locally after the first call)
path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')

# Load the checkpoint. pickle can execute arbitrary code, so only load files you trust.
with open(path, 'rb') as f:
    checkpoint = pickle.load(f)

print(f"Policy version: {checkpoint.policy_version}")
```
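
The usage snippet reads `policy_version` as an attribute, but this card does not say whether the checkpoint is a plain dict or an instance of a custom class. If it is a custom class, that class must be importable when unpickling, or `pickle.load` will raise an error. The helpers below are a minimal sketch for inspecting a checkpoint of either shape; `load_checkpoint` and `get_field` are hypothetical names introduced here, not part of any ContextFlow package.

```python
import pickle


def load_checkpoint(path):
    """Unpickle a checkpoint file. Only use this on files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)


def get_field(checkpoint, name, default=None):
    """Read a field from a checkpoint stored either as a dict or as an object."""
    if isinstance(checkpoint, dict):
        return checkpoint.get(name, default)
    return getattr(checkpoint, name, default)
```

Given the `path` returned by `hf_hub_download`, `get_field(load_checkpoint(path), "policy_version")` would return the stored version, or `None` if the field is absent.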
## Citation

```bibtex
@software{contextflow_rl,
  title={ContextFlow RL Doubt Predictor},
  author={ContextFlow Team},
  year={2026}
}
```