
ContextFlow RL Doubt Predictor

Overview

This is the trained reinforcement learning model for the ContextFlow doubt prediction system.

Model Details

  • Algorithm: GRPO (Group Relative Policy Optimization) + Q-Learning
  • State Dimension: 64 features
  • Action Dimension: 10 doubt prediction actions
  • Policy Version: 50
  • Training Samples: 200
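
To make the details above concrete, the sketch below shows the two generic ingredients named in the algorithm line: a GRPO-style group-relative advantage (each reward normalized by the mean and standard deviation of its sampled group) and a one-step Q-learning update over the 10 actions. This is an illustration under assumptions, not the ContextFlow training code: the function names, the use of the population standard deviation, the list-based Q row, and the hyperparameters `alpha` and `gamma` are all hypothetical.

```python
import statistics

STATE_DIM = 64    # state features, from Model Details
NUM_ACTIONS = 10  # doubt prediction actions, from Model Details


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize each reward within its sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std is an assumption here
    return [(r - mean) / (std + eps) for r in rewards]


def q_learning_update(q_values, action, reward, next_q_values,
                      alpha=0.1, gamma=0.99):
    """Return a copy of q_values with one Q-learning step applied to `action`."""
    target = reward + gamma * max(next_q_values)
    updated = list(q_values)
    updated[action] += alpha * (target - updated[action])
    return updated


# Four sampled rewards for the same state form one group.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])

# One update for action 3, starting from all-zero action values.
q_row = q_learning_update([0.0] * NUM_ACTIONS, action=3, reward=1.0,
                          next_q_values=[0.0] * NUM_ACTIONS)
```

Rewards above the group mean get a positive advantage and rewards below it a negative one, so the policy update needs no separate learned baseline.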

Usage

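import pickle
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub (cached locally after the first call)
path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')

# Load the checkpoint. pickle.load can execute arbitrary code, so only unpickle files you trust.
# If the checkpoint is an instance of a custom class, that class must be importable here.
with open(path, 'rb') as f:
    checkpoint = pickle.load(f)

print(f"Policy version: {checkpoint.policy_version}")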
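
The snippet above reads `checkpoint.policy_version` as an attribute, but this README does not document the checkpoint's type. As a hedged sketch, a small helper like the hypothetical `read_field` below works whether the unpickled checkpoint turns out to be a dict or a plain object. The `_FakeCheckpoint` class is a stand-in for illustration only, not the real checkpoint class.

```python
def read_field(checkpoint, name, default=None):
    """Read `name` from a checkpoint that may be a dict or a plain object."""
    if isinstance(checkpoint, dict):
        return checkpoint.get(name, default)
    return getattr(checkpoint, name, default)


# Stand-ins for the two possible layouts (hypothetical, for illustration only).
class _FakeCheckpoint:
    policy_version = 50


as_object = read_field(_FakeCheckpoint(), "policy_version")
as_dict = read_field({"policy_version": 50}, "policy_version")
missing = read_field({}, "policy_version", default="unknown")
```

With the real checkpoint loaded, `read_field(checkpoint, "policy_version", default="unknown")` would replace the direct attribute access in the print statement.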

Citation

@software{contextflow_rl,
  title={ContextFlow RL Doubt Predictor},
  author={ContextFlow Team},
  year={2026}
}